Add on_audit_log_created listener hookspec #66432
Open

john-jac wants to merge 1 commit into apache:main
Add a new listener hookspec that fires when an audit log record is created via the action_logging decorator. This enables plugins and providers to react to audit events (forwarding to external systems, triggering alerts, etc.) without polling the REST API. Ref: apache#66018
Description
Adds a new listener hookspec, `on_audit_log_created`, that fires whenever an audit log record is persisted via the API action logging system. This enables plugins and providers to react to audit events, such as forwarding to external logging/SIEM systems, triggering alerts, or enriching observability pipelines, without polling the REST API.

Ref: #66018
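To make the shape of the proposal concrete, here is a minimal, self-contained sketch of how such a pluggy-based hookspec could be declared and consumed. The hook name `on_audit_log_created` comes from this PR; the spec/listener class names and the dict stand-in for the `Log` object are illustrative only, not the actual implementation.

```python
import pluggy

# Markers recreated here so the sketch is self-contained; Airflow's listener
# system uses the "airflow" project name with pluggy in the same way.
hookspec = pluggy.HookspecMarker("airflow")
hookimpl = pluggy.HookimplMarker("airflow")


class AuditLogSpec:
    """Sketch of the spec; the real one lives in airflow/listeners/spec/audit_log.py."""

    @hookspec
    def on_audit_log_created(self, log):
        """Called after an audit log record has been committed to the DB."""


class ForwardingListener:
    """Illustrative consumer: collects events instead of shipping them."""

    def __init__(self):
        self.received = []

    @hookimpl
    def on_audit_log_created(self, log):
        self.received.append(log)


pm = pluggy.PluginManager("airflow")
pm.add_hookspecs(AuditLogSpec)
listener = ForwardingListener()
pm.register(listener)

# Simulate the decorator firing the hook after a commit.
pm.hook.on_audit_log_created(log={"event": "pause_dag", "owner": "admin"})
```

Because pluggy dispatches by keyword, any registered plugin with a matching `on_audit_log_created` implementation receives the record with no coupling to the call site.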
Motivation
Airflow's listener system currently covers task instances, DAG runs, assets, and lifecycle events. Audit logs, i.e. the records of who did what (paused a DAG, modified a variable, deleted a connection), are the notable gap. Today the only way to consume these externally is to poll the `/eventLogs` API endpoint, which introduces latency, requires dedicated infrastructure (Lambda, cron job, etc.), and doesn't scale well for real-time compliance requirements.

This is a recurring community request. A related change extended `action_logging` to connection/variable endpoints, solving what gets audited but not how to forward those events externally. This change brings audit logs to parity with other first-class Airflow events by exposing them through the same pluggy-based listener mechanism that providers and plugins already use.
Scope
This PR fires the hook only from the `action_logging` decorator, i.e. user-initiated API actions (pause/unpause DAGs, create/delete connections, modify variables, trigger DAG runs, clear task instances, etc.). It does not fire for high-frequency system-generated log entries (task state transitions, scheduler state mismatches), which are operational telemetry rather than audit events and already have dedicated hooks (`on_task_instance_running`, `on_task_instance_failed`, etc.).

Changes

- `airflow/listeners/spec/audit_log.py` (new hookspec)
- `airflow/listeners/listener.py`
- `api_fastapi/logging/decorators.py` (fires the hook after the DB commit)

FAQ
Q: Why not handle this in a provider package without a core change?
A: The only current integration point is the REST API, which requires external polling infrastructure and introduces latency. As demonstrated in Discussion #34142, even attempting to read audit logs from within a Listener plugin today requires querying the database directly — which causes HA lock errors and is fragile across version upgrades. The listener system is Airflow's intended extension mechanism for cross-cutting event-driven concerns. Assets, DAG runs, and task instances all have listener hooks; audit logs are the missing piece.
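As a sketch of what a plugin-side consumer could look like under this proposal: a module-level `@hookimpl` function that forwards the structured record. The `hookimpl` marker is recreated here so the example runs standalone (in a real deployment it would come from Airflow's listener package); the field names follow the `Log` model fields mentioned below, and the print is a stand-in for a real SIEM/webhook client.

```python
import json

from pluggy import HookimplMarker

# Recreated marker; Airflow listeners use the "airflow" pluggy project name.
hookimpl = HookimplMarker("airflow")


@hookimpl
def on_audit_log_created(log):
    """Forward the structured audit record to an external sink.

    ``log`` is expected to be the persisted ``Log`` ORM object; only a few
    fields are extracted here for brevity.
    """
    record = {
        "event": getattr(log, "event", None),
        "dag_id": getattr(log, "dag_id", None),
        "owner": getattr(log, "owner", None),
    }
    # A real listener would POST this somewhere; printing keeps it runnable.
    print(json.dumps(record))
    return record
```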
Q: What about performance? This runs on every API mutation.
A: The hook fires after the DB commit, so it cannot affect the persistence of the audit record itself. If no listeners are registered, the cost is a single no-op function call. API mutations are human-rate operations — orders of magnitude less frequent than task heartbeats or scheduler loops.
Q: Why not use a SQLAlchemy `after_insert` event on the `Log` model instead?
A: SQLAlchemy ORM events run inside the active transaction. A slow or failing listener would block the commit or cause deadlocks -- exactly the "HA lock errors" reported in Discussion #34142. The pluggy listener system runs outside the transaction boundary.
Q: Why not just emit to a standard Python `logging.Logger`?
A: A Python log record is an unstructured string. The `Log` model object carries structured fields (`event`, `dag_id`, `task_id`, `owner`, `extra`, `dttm`) that consumers need for filtering, routing, and compliance reporting.

Q: Why not also fire for task state changes and scheduler events?
A: Those are high-frequency system-generated events with dedicated hooks already. This PR scopes to user-initiated actions, i.e. the events that answer "who changed what and when" for compliance purposes.
Q: Who would consume this?
A: CloudWatch, Datadog, Splunk, Elasticsearch/OpenSearch, Azure Monitor, GCP Cloud Logging, OpenTelemetry collectors, custom compliance webhooks, Kafka-based audit pipelines.
Q: Is this a breaking change?
A: No. Purely additive. Existing deployments are unaffected unless they register a listener for the new hook.
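For the consumers listed above, the structured fields are what make routing possible. A minimal sketch, assuming the record is exposed as a dict and the sink is an arbitrary callable; the event names here are placeholders, not a list Airflow defines.

```python
# Hypothetical compliance filter over the structured Log fields named in
# the FAQ ("event", "owner", "dttm"); event names are illustrative.
SENSITIVE_EVENTS = {"delete_connection", "variable_edit", "delete_variable"}


def route_audit_event(log: dict, forward) -> bool:
    """Forward only compliance-relevant events; return whether one was sent."""
    if log.get("event") in SENSITIVE_EVENTS:
        forward({
            "event": log["event"],
            "owner": log.get("owner"),
            "when": log.get("dttm"),
        })
        return True
    return False
```

This kind of filtering is exactly what an unstructured log line cannot support without fragile string parsing.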