from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash import BashOperator  # 'bash_operator' is the deprecated module path
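def extract(**context):
    # Push an explicitly keyed value for downstream tasks
    context['ti'].xcom_push(key='user_id', value=42)
    # The return value is pushed to XCom under the default key 'return_value'
    return {"raw": "data"}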
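XCom rows are uniquely identified by a combination of columns in the Airflow metadata database. In recent Airflow 2.x releases these are the DAG run, the task ID, the map index, and the key.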
In a multi-tenant environment, you might want to ensure that Task B can pull data from Task A, but Task C (perhaps a notification task) cannot. Airflow has no native "per-key" permissions, so developers implement exclusivity by convention, through explicit pulls and custom XCom backends.
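One way to express this convention in code is an allow-list wrapper around `xcom_pull`. The sketch below is illustrative only: `ALLOWED_PULLS`, `exclusive_pull`, and `FakeTaskInstance` are hypothetical names, not part of Airflow, and the stand-in task instance exists only so the example runs without an Airflow installation. It is a guard against accidental reads, not a security boundary.

```python
from typing import Any

# Hypothetical allow-list: which consumer task may read which (producer, key) pairs.
ALLOWED_PULLS = {
    "task_b": {("task_a", "return_value")},
    "notify_task": set(),  # the notification task may read nothing
}


def exclusive_pull(ti: Any, consumer: str, producer: str, key: str = "return_value") -> Any:
    """Pull an XCom only if the (producer, key) pair is allow-listed for the consumer."""
    if (producer, key) not in ALLOWED_PULLS.get(consumer, set()):
        raise PermissionError(f"{consumer} may not pull {key!r} from {producer}")
    return ti.xcom_pull(task_ids=producer, key=key)


class FakeTaskInstance:
    """Minimal stand-in for a task instance so the sketch runs without Airflow."""

    def __init__(self, store):
        self._store = store

    def xcom_pull(self, task_ids, key="return_value"):
        return self._store.get((task_ids, key))


ti = FakeTaskInstance({("task_a", "return_value"): {"raw": "data"}})
print(exclusive_pull(ti, "task_b", "task_a"))
```

In a real DAG the `ti` argument would be the task instance Airflow passes to the callable, and the allow-list would live in shared project code so every task uses the same rules.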
The data you pass between tasks (JSON, DataFrames, file paths)
from airflow.models.xcom import BaseXCom
from airflow.exceptions import AirflowException
def process_data_func(ti):
    # Exclusive pull: only fetches from 'extract_task', ignoring all other XComs
    raw_data = ti.xcom_pull(task_ids='extract_task', key='return_value')
    return f"Processed: {len(raw_data)} items"

3. Custom XCom Backends: Exclusive Storage Offloading
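The offloading idea can be sketched without Airflow: write the payload to external storage and keep only a small reference string where the XCom value would normally go. In the sketch below, the local temporary directory stands in for object storage, and `STORAGE_DIR`, `PREFIX`, and the two module-level functions are assumptions made for illustration. In an actual backend, the equivalent logic would typically sit in `serialize_value` and `deserialize_value` overrides on a `BaseXCom` subclass, whose exact signatures vary between Airflow versions.

```python
import json
import tempfile
import uuid
from pathlib import Path

# Assumption: a local temp directory stands in for S3/GCS-style object storage.
STORAGE_DIR = Path(tempfile.mkdtemp(prefix="xcom_offload_"))
PREFIX = "file://"


def serialize_value(value):
    """Write the payload to storage and return a short reference string."""
    path = STORAGE_DIR / f"{uuid.uuid4().hex}.json"
    path.write_text(json.dumps(value))
    return PREFIX + str(path)


def deserialize_value(reference):
    """Resolve a reference string back into the original payload."""
    if not reference.startswith(PREFIX):
        raise ValueError(f"Not an offloaded XCom reference: {reference!r}")
    return json.loads(Path(reference[len(PREFIX):]).read_text())


reference = serialize_value({"rows": [1, 2, 3]})
print(reference)                     # only this short string would be stored as the XCom
print(deserialize_value(reference))  # the payload is read back from storage on pull
```

The design choice is that the metadata database only ever sees the reference, so row size stays small no matter how large the payload is.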
To mitigate these risks, workflows require an exclusive data-passing approach: restricting data access to only the downstream tasks that explicitly need it, and moving data payloads out of the transactional database.

2. Implementing Explicit and Exclusive Pulls
Historically, Airflow allowed XCom values to be serialized using Python's pickle module, which could lead to security vulnerabilities and version incompatibilities. Modern Airflow serializes XCom values as JSON by default, and pickling support is deprecated. Always ensure your XCom values are JSON‑serializable unless you have a very good reason to do otherwise.
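A quick way to catch non-serializable values before they reach XCom is to run them through `json.dumps` inside the task. This is a minimal sketch using only the standard library; the helper name `ensure_json_serializable` is hypothetical and not an Airflow API.

```python
import json


def ensure_json_serializable(value):
    """Return the value unchanged if json.dumps accepts it, else raise TypeError."""
    try:
        json.dumps(value)
    except (TypeError, ValueError) as exc:
        raise TypeError(f"XCom value is not JSON-serializable: {exc}") from exc
    return value


# A plain dict passes through untouched.
print(ensure_json_serializable({"user_id": 42}))

# A set is rejected, because JSON has no set type.
try:
    ensure_json_serializable({1, 2, 3})
except TypeError as err:
    print(err)
```

Calling the helper on a task's return value turns a late serialization failure into an early, clearly labelled error.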
def extract_data(**kwargs):
    # logic here
    file_path = "/tmp/data_2023.csv"
    return file_path  # This is automatically pushed to XCom
Airflow does not automatically delete XCom entries when a DAG run finishes. Over months, the xcom table can grow to millions of rows, slowing down database queries.
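The usual remedy is a scheduled purge of rows older than a retention window. The sketch below shows that idea against an in-memory SQLite table, so it runs anywhere. The table is a simplified toy, not the real Airflow schema, and `purge_old_xcoms` is a hypothetical helper. On a real deployment, Airflow 2.3 and later also provide an `airflow db clean` command for this kind of maintenance.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_old_xcoms(conn, max_age_days, now):
    """Delete rows older than the retention window and return how many were removed."""
    cutoff = (now - timedelta(days=max_age_days)).isoformat()
    cursor = conn.execute("DELETE FROM xcom WHERE timestamp < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount


# Toy stand-in for the metadata database (assumption: simplified columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xcom (dag_id TEXT, task_id TEXT, key TEXT, value TEXT, timestamp TEXT)"
)

now = datetime(2023, 3, 20)
rows = [
    ("etl", "extract", "return_value", '"/tmp/a.csv"', (now - timedelta(days=90)).isoformat()),
    ("etl", "extract", "return_value", '"/tmp/b.csv"', (now - timedelta(days=1)).isoformat()),
]
conn.executemany("INSERT INTO xcom VALUES (?, ?, ?, ?, ?)", rows)

deleted = purge_old_xcoms(conn, max_age_days=30, now=now)
print(f"Deleted {deleted} stale XCom row(s)")
```

Timestamps are stored as ISO 8601 strings in a single format here, which is why a plain string comparison in the `DELETE` statement orders them correctly.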
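By default, Airflow uses the BaseXCom backend, which stores values in the metadata database. This means values must be serializable: as JSON by default, or with pickle only if pickling has been explicitly enabled.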
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2023, 3, 20),
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}
Moving data across tasks brings severe compliance and security considerations, especially under GDPR, HIPAA, or CCPA frameworks.

At-Rest Encryption in Custom Backends
Airflow Xcom Exclusive Online