Airflow Sensor Timeout Fix
This guide explains why Airflow sensors appear to hang or fail on timeout, and how to configure timeout, poke_interval, and mode so that they fail predictably without tying up worker slots.
A sensor task appears to run forever, then fails:
from airflow.sensors.filesystem import FileSensor

wait_for_file = FileSensor(
    task_id="wait_for_file",
    filepath="/data/input.csv",
    poke_interval=60,  # Check every 60 seconds
    timeout=3600,  # Fail after 1 hour
)
The sensor times out after 1 hour and the task fails, but it appeared to be "running" the entire time. Sensors use poke mode by default, which holds a worker slot for the entire wait. If timeout is left unset, the sensor falls back to the default of 7 days (the sensors.default_timeout setting) and can look as if it runs forever; if timeout is too short, the sensor fails before the data arrives.
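To make that behaviour concrete, here is a minimal pure-Python sketch of what a poke-mode wait loop does conceptually. It is not Airflow's implementation: `run_poke_loop`, `FakeClock`, and the condition callable are hypothetical names used only for illustration, and time is simulated so the example runs instantly.

```python
class FakeClock:
    """Simulated clock so the sketch runs instantly (illustration only)."""

    def __init__(self):
        self.now = 0.0

    def sleep(self, seconds):
        self.now += seconds


def run_poke_loop(condition, poke_interval, timeout, clock):
    """Check `condition` every `poke_interval` seconds until it is true.

    Raises TimeoutError once more than `timeout` seconds have elapsed.
    The loop never hands control back, which is why a poke-mode sensor
    keeps its worker slot for the whole wait.
    """
    started = clock.now
    pokes = 0
    while True:
        pokes += 1
        if condition():
            return pokes
        if clock.now - started > timeout:
            raise TimeoutError(f"gave up after {pokes} pokes")
        clock.sleep(poke_interval)


# The file never arrives: the loop stays busy for the full hour, then fails.
clock = FakeClock()
try:
    run_poke_loop(lambda: False, poke_interval=60, timeout=3600, clock=clock)
    outcome = "success"
except TimeoutError:
    outcome = "failed"
```

From the outside, the task looks "running" for the whole hour because the loop is alive the entire time, even though it is only sleeping between checks.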
Step-by-Step Fix
1. Set both poke_interval and timeout
WRONG — missing or zero timeout:
from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor

sensor = S3KeySensor(
    task_id="wait_for_s3_file",
    bucket_key="data/input.csv",
    poke_interval=60,  # Checks every 60s
    # timeout not set: falls back to the default of 7 days
)
RIGHT — set an appropriate timeout:
sensor = S3KeySensor(
    task_id="wait_for_s3_file",
    bucket_key="data/input.csv",
    poke_interval=30,  # Check every 30 seconds
    timeout=7200,  # Stop after 2 hours
    soft_fail=True,  # Skip instead of fail on timeout
)
2. Use mode="reschedule" for long-running sensors
WRONG — using default mode="poke" for sensors that wait hours:
sensor = S3KeySensor(
    task_id="wait_for_file",
    bucket_key="data/input.csv",
    poke_interval=300,  # 5 minutes
    timeout=86400,  # 24 hours
    # mode="poke" is the default: blocks a worker slot for up to 24 hours
)
RIGHT — use reschedule mode:
sensor = S3KeySensor(
    task_id="wait_for_file",
    bucket_key="data/input.csv",
    poke_interval=300,
    timeout=86400,
    mode="reschedule",  # Frees the worker slot between pokes
)
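In reschedule mode, the task releases its worker slot between checks, so workers can process other tasks. Keep poke_interval at 60 seconds or more in this mode, since every check is a fresh scheduling cycle and very short intervals add load on the scheduler.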
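The difference between the two modes can be estimated with simple arithmetic. The sketch below is a rough model rather than a measurement: `slot_seconds_held` is a hypothetical helper, and `poke_cost` (the time one check takes) is an assumed figure.

```python
def slot_seconds_held(mode, wait_seconds, poke_interval, poke_cost=1.0):
    """Rough estimate of how long a sensor holds a worker slot.

    mode          -- "poke" or "reschedule"
    wait_seconds  -- how long the condition takes to become true
    poke_interval -- seconds between checks
    poke_cost     -- assumed seconds of work per check (hypothetical figure)
    """
    if mode == "poke":
        # The task sleeps inside the slot between checks.
        return float(wait_seconds)
    if mode == "reschedule":
        # The slot is held only while a check is actually running.
        checks = wait_seconds // poke_interval + 1
        return checks * poke_cost
    raise ValueError(f"unknown mode: {mode!r}")


poke_held = slot_seconds_held("poke", wait_seconds=86400, poke_interval=300)
resched_held = slot_seconds_held("reschedule", wait_seconds=86400, poke_interval=300)
```

Under these assumptions, a 24-hour wait holds a slot for the full day in poke mode but only for a few minutes in total in reschedule mode.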
3. Use an efficient poke_interval
WRONG — checking too frequently:
poke_interval=5 # Every 5 seconds — unnecessary for file wait
RIGHT — match to the expected availability:
# For a file expected within 1 hour
poke_interval=60 # Every minute is fine
# For a file expected within 24 hours
poke_interval=300 # Every 5 minutes is sufficient
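A quick way to compare intervals is to count how many checks each one produces and how late the sensor can notice the file. This is a small illustrative calculation; `poke_stats` is a hypothetical helper, not part of Airflow.

```python
import math


def poke_stats(expected_wait, poke_interval):
    """Return (number_of_checks, worst_case_detection_delay_seconds)."""
    checks = math.ceil(expected_wait / poke_interval)
    # If the file lands just after a check, it is seen one interval later.
    return checks, poke_interval


for interval in (5, 60, 300):
    checks, delay = poke_stats(expected_wait=3600, poke_interval=interval)
    print(f"interval={interval}s checks={checks} worst-case delay={delay}s")
```

For a one-hour wait, a 5-second interval issues 720 checks where a 60-second interval issues 60, in exchange for noticing the file at most 55 seconds sooner.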
4. Use deferrable operators
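Airflow 2.2+ supports deferrable operators, which hand the wait to an async trigger running in the triggerer process. Many provider sensors expose this through a deferrable flag, depending on the provider version: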
sensor = S3KeySensor(
    task_id="wait_for_file",
    bucket_key="data/input.csv",
    deferrable=True,  # Uses async trigger (no worker slot)
    poke_interval=60,
    timeout=86400,
)
This is the most efficient approach: the task holds no worker slot while it waits. It does require a running triggerer process.
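The idea behind async triggers can be sketched with plain asyncio. This is only an analogy for how many waits can share one event loop; it does not use Airflow's trigger classes, and `wait_until` and `make_condition` are hypothetical names.

```python
import asyncio


async def wait_until(condition, poke_interval):
    """Poll `condition` without blocking: `await` hands control back to the loop."""
    checks = 0
    while True:
        checks += 1
        if condition():
            return checks
        await asyncio.sleep(poke_interval)


def make_condition(ready_after):
    """Hypothetical stand-in for 'does the file exist yet?'."""
    state = {"calls": 0}

    def condition():
        state["calls"] += 1
        return state["calls"] >= ready_after

    return condition


async def main():
    # Three waits share one event loop; none of them needs its own thread.
    waits = [wait_until(make_condition(n), poke_interval=0.01) for n in (1, 2, 3)]
    return await asyncio.gather(*waits)


results = asyncio.run(main())
```

Each waiting coroutine costs very little while it sleeps, which is why a single triggerer can watch many deferred sensors at once.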
5. Implement a custom sensor with exponential backoff
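import os

from airflow.sensors.base import BaseSensorOperator


class BackoffFileSensor(BaseSensorOperator):
    def __init__(self, filepath, max_wait=86400, **kwargs):
        # exponential_backoff is a built-in BaseSensorOperator option that
        # lengthens the wait between pokes; timeout caps the total wait.
        kwargs.setdefault("exponential_backoff", True)
        kwargs.setdefault("timeout", max_wait)
        super().__init__(**kwargs)
        self.filepath = filepath

    def poke(self, context):
        # Return True only when the file exists. Returning True after
        # max_wait would mark the task successful without the file.
        return os.path.exists(self.filepath)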
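For intuition, the sketch below shows a plain doubling schedule with a cap. Airflow's built-in exponential_backoff uses its own formula, so treat this as an illustration of the general technique only; `backoff_intervals` and its parameters are hypothetical.

```python
def backoff_intervals(base_interval, max_interval, attempts):
    """Yield the wait before each retry: base * 2**n, capped at max_interval."""
    for attempt in range(attempts):
        yield min(base_interval * 2 ** attempt, max_interval)


schedule = list(backoff_intervals(base_interval=30, max_interval=600, attempts=7))
print(schedule)
```

Early checks stay frequent while the file is most likely to arrive, and later checks settle at the cap so a long wait does not generate a flood of polls.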
6. Handle external task sensor timeout
from datetime import timedelta
from airflow.sensors.external_task import ExternalTaskSensor

wait_for_dag = ExternalTaskSensor(
    task_id="wait_for_other_dag",
    external_dag_id="upstream_dag",
    external_task_id="final_task",
    timeout=3600,
    allowed_states=["success"],
    failed_states=["failed", "skipped"],
    execution_delta=timedelta(hours=1),  # Upstream run's logical date is 1 hour earlier
)
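A common cause of ExternalTaskSensor timeouts is an execution_delta that does not match the upstream schedule. The sensor looks for the upstream run whose logical date equals the current run's logical date minus execution_delta. The sketch below reproduces that subtraction with example dates; `upstream_logical_date` is a hypothetical helper and the dates are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def upstream_logical_date(current_logical_date, execution_delta):
    """Logical date of the upstream run the sensor will look for."""
    return current_logical_date - execution_delta


# Example: this DAG run is for 03:00 UTC, the upstream DAG ran for 02:00 UTC.
current = datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)
target = upstream_logical_date(current, timedelta(hours=1))
print(target.isoformat())
```

If no upstream run exists with exactly that logical date, the sensor keeps waiting until it hits its timeout, so it is worth checking this value against the upstream DAG's schedule.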
Expected result: the sensor succeeds when the condition is met, or fails (or is skipped, if soft_fail=True) once the timeout elapses.
Prevention
- Always set timeout on sensor tasks.
- Use mode="reschedule" for sensors that may wait longer than a few minutes.
- Use deferrable operators when available (Airflow 2.2+).
- Set soft_fail=True so a timeout doesn't cause a DAG failure.
- Monitor sensor tasks with alerts if they approach the timeout.
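The monitoring point can be reduced to a simple threshold check. The sketch below is one possible shape for such a check, not an Airflow feature: `approaching_timeout` and the 80% default threshold are assumptions, and how the elapsed time is obtained and how the alert is sent are left to your own monitoring setup.

```python
def approaching_timeout(elapsed_seconds, timeout_seconds, threshold=0.8):
    """True once a sensor has used at least `threshold` of its timeout."""
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    return elapsed_seconds / timeout_seconds >= threshold


# A sensor 50 minutes into a 1-hour timeout is past the 80% mark.
should_alert = approaching_timeout(elapsed_seconds=3000, timeout_seconds=3600)
```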
Built by the developers of DodaTech
Doda Browser, DodaZIP & Durga Antivirus Pro