This topic lists the metrics that Airflow can emit through StatsD.

The following table lists the metrics for Airflow gauges.
Name Description
dagbag_size Number of DAGs found when the scheduler ran a scan based on its configuration.
dag_processing.import_errors Number of errors found while trying to parse DAG files.
dag_processing.total_parse_time Seconds taken to scan and import all DAG files together.
dag_processing.last_run.seconds_ago.<dag_file> Seconds since <dag_file> was last processed.
scheduler.tasks.running Number of tasks running in an executor.
scheduler.tasks.starving Number of tasks that cannot be scheduled because no slot is open in the pool.
scheduler.tasks.executable Number of tasks that are ready for execution (set to queued) with respect to pool limits, DAG concurrency, executor state, and priority.
executor.open_slots Number of open slots on the executor.
executor.queued_tasks Number of queued tasks on the executor.
executor.running_tasks Number of running tasks on the executor.
pool.open_slots.<pool_name> Number of open slots in the pool.
pool.queued_slots.<pool_name> Number of queued slots in the pool.
pool.running_slots.<pool_name> Number of running slots in the pool.
pool.starving_tasks.<pool_name> Number of starving tasks in the pool.
triggers.running Number of triggers currently running (per triggerer).
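
As a rough illustration of how such a gauge might look on the wire, the sketch below parses one line in the plain StatsD format `<name>:<value>|<type>`. The helper name `parse_statsd_line`, the `Metric` tuple, and the `airflow.` prefix are assumptions made for this example and are not part of Airflow; sample rates and gauge deltas are not handled.

```python
from typing import NamedTuple

# Type codes used by the StatsD line format: gauge, counter, timer.
STATSD_TYPES = {"g": "gauge", "c": "counter", "ms": "timer"}


class Metric(NamedTuple):
    name: str
    value: float
    kind: str


def parse_statsd_line(line: str, prefix: str = "airflow.") -> Metric:
    """Parse one '<name>:<value>|<type>' line (hypothetical helper).

    The prefix is an assumption; it is stripped from the name when present.
    """
    # Split on the last colon so that the value part is always on the right.
    name, _, rest = line.strip().rpartition(":")
    fields = rest.split("|")
    if not name or len(fields) < 2 or fields[1] not in STATSD_TYPES:
        raise ValueError(f"not a recognised StatsD line: {line!r}")
    if name.startswith(prefix):
        name = name[len(prefix):]
    return Metric(name, float(fields[0]), STATSD_TYPES[fields[1]])
```

A line such as `airflow.executor.open_slots:32|g` would then be reported as the gauge `executor.open_slots` from the table above.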

The following table lists the metrics for Airflow counters.
Name Description
<job_name>_start Number of <job_name> jobs started, where <job_name> is, for example, SchedulerJob or LocalTaskJob.
<job_name>_end Number of <job_name> jobs ended, where <job_name> is, for example, SchedulerJob or LocalTaskJob.
<job_name>_heartbeat_failure Number of failed heartbeats for a <job_name> job, where <job_name> is, for example, SchedulerJob or LocalTaskJob.
local_task_job.task_exit.<job_id>.<dag_id>.<task_id>.<return_code> Number of LocalTaskJob terminations with <return_code> while running task <task_id> of DAG <dag_id>.
operator_failures_<operator_name> Number of failures of operator <operator_name>.
operator_successes_<operator_name> Number of successes of operator <operator_name>.
ti_failures Overall task instance failures.
ti_successes Overall task instance successes.
previously_succeeded Number of previously succeeded task instances.
zombies_killed Zombie tasks killed.
scheduler_heartbeat Scheduler heartbeats.
dag_processing.processes Number of currently running DAG parsing processes.
dag_processing.processor_timeouts Number of file processors that have been killed because they took too long.
dag_file_processor_timeouts Number of file processors that have been killed because they took too long.
dag_processing.manager_stalls Number of times the DagFileProcessorManager stalled.
dag_file_refresh_error Number of failures loading any DAG files.
scheduler.tasks.killed_externally Number of tasks killed externally.
scheduler.orphaned_tasks.cleared Number of orphaned tasks cleared by the scheduler.
scheduler.orphaned_tasks.adopted Number of orphaned tasks adopted by the scheduler.
scheduler.critical_section_busy Count of times a scheduler process tried to get a lock on the critical section (needed to send tasks to the executor) and found it locked by another process.
sla_missed Number of SLA misses.
sla_callback_notification_failure Number of failed SLA miss callback notification attempts.
sla_email_notification_failure Number of failed SLA miss email notification attempts.
ti.start.<dag_id>.<task_id> Number of tasks started in a given DAG. Similar to <job_name>_start, but for tasks.
ti.finish.<dag_id>.<task_id>.<state> Number of tasks completed in a given DAG. Similar to <job_name>_end, but for tasks.
dag.callback_exceptions Number of exceptions raised from DAG callbacks. When this happens, a DAG callback is not working.
celery.task_timeout_error Number of AirflowTaskTimeout errors raised when publishing a task to the Celery broker.
celery.execute_command.failure Number of non-zero exit codes returned by Celery tasks.
task_removed_from_dag.<dag_id> Number of tasks removed for a given DAG (that is, the task no longer exists in the DAG).
task_restored_to_dag.<dag_id> Number of tasks restored for a given DAG (that is, a task instance that was previously in the REMOVED state in the database is added back to the DAG file).
task_instance_created-<operator_name> Number of task instances created for a given operator.
triggers.blocked_main_thread Number of triggers that blocked the main thread (likely due to not being fully asynchronous).
triggers.failed Number of triggers that errored before they could fire an event.
triggers.succeeded Number of triggers that have fired at least one event.
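
Counters are usually summed over a time window before they are read. The sketch below, which is only an illustration and not part of Airflow, sums counter increments by name and derives a failure ratio from ti_failures and ti_successes. The function names `tally` and `failure_ratio` and the (name, increment) input shape are assumptions made for this example.

```python
from collections import Counter


def tally(samples):
    """Sum counter increments per metric name from (name, increment) pairs."""
    totals = Counter()
    for name, increment in samples:
        totals[name] += increment
    return totals


def failure_ratio(totals):
    """Failures divided by failures plus successes, or None if both are zero.

    Other terminal states (for example skipped tasks) are not considered.
    """
    failures = totals.get("ti_failures", 0)
    successes = totals.get("ti_successes", 0)
    if failures + successes == 0:
        return None
    return failures / (failures + successes)
```

For example, three ti_successes increments and one ti_failures increment in the same window would give a ratio of 0.25.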

The following table lists the metrics for Airflow timers.
Name Description
dagrun.dependency-check.<dag_id> Milliseconds taken to check DAG dependencies.
dag.<dag_id>.<task_id>.duration Milliseconds taken to finish a task.
dag_processing.last_duration.<dag_file> Milliseconds taken to load the given DAG file.
dagrun.duration.success.<dag_id> Seconds taken for a DagRun to reach success state.
dagrun.duration.failed.<dag_id> Milliseconds taken for a DagRun to reach failed state.
dagrun.schedule_delay.<dag_id> Seconds of delay between the scheduled DagRun start date and the actual DagRun start date.
scheduler.critical_section_duration Milliseconds spent in the critical section of the scheduler loop. Only a single scheduler can enter this section at a time.
scheduler.critical_section_query_duration Milliseconds spent running the critical section task instance query.
scheduler.scheduler_loop_duration Milliseconds spent running one scheduler loop.
dagrun.<dag_id>.first_task_scheduling_delay Seconds elapsed between the start_date of the first task and the expected start of the DagRun.
collect_db_dags Milliseconds taken to fetch all serialized DAGs from the database.
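
Several timer names above embed a <dag_id> placeholder. As a minimal sketch, assuming any StatsD prefix has already been stripped from the name, the hypothetical helper below recovers the state and DAG ID from the dagrun.duration.success.<dag_id> and dagrun.duration.failed.<dag_id> timers. Because the DAG ID is the last part of the name, a DAG ID that itself contains dots is kept intact.

```python
# Known name prefixes for the DagRun duration timers, mapped to their state.
DAGRUN_DURATION_PREFIXES = {
    "dagrun.duration.success.": "success",
    "dagrun.duration.failed.": "failed",
}


def split_dagrun_duration(name):
    """Return (state, dag_id) for a dagrun.duration.* timer name, else None."""
    for prefix, state in DAGRUN_DURATION_PREFIXES.items():
        # Require at least one character after the prefix for the DAG ID.
        if name.startswith(prefix) and len(name) > len(prefix):
            return state, name[len(prefix):]
    return None
```

Names that embed the placeholder in the middle, such as dag.<dag_id>.<task_id>.duration, cannot be split reliably this way when the IDs contain dots, so this sketch deliberately leaves them alone.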