Three small improvements to Airflow's PostgreSQL-compatible database handling — pre-PR sanity check #65453
Unanswered
viragtripathi asked this question in Q&A
Replies: 1 comment 1 reply
Yes. You should start this discussion on the Devlist. See https://airflow.apache.org/community/
Hi Airflow maintainers,
I'm working on improving Airflow 3.x's behavior on PostgreSQL-compatible databases (currently focused on CockroachDB, which is wire-protocol compatible with PG). Before opening individual PRs I wanted to surface the proposed changes here for a sanity check, since two of the three are dialect-name additions that touch generic code paths.
I'd appreciate maintainer guidance on the three proposals below before I draft the PRs.
I've read #46175 and want to be explicit up front: I am not asking Airflow to take on a new metadata backend. The dialect-specific work (SQLAlchemy compiler hooks, retry logic, migration audit) lives entirely outside Airflow; see context below. The three items in this post are improvements that stand on their own merits regardless of whether anyone uses CockroachDB.
Context (skip if uninterested)
Background work that's already done outside this repo:
Integration POC validating Airflow 3.2.0 against CockroachDB v25.4 LTS — airflow db migrate succeeds, DAG parsing/scheduling/task execution all work, example DAGs run end-to-end. 111 Alembic migrations audited, 12 findings, all handled.
A sqlalchemy-cockroachdb PR that registers @compiles(timestampdiff, "cockroachdb") so that the MySQL-style func.timestampdiff(unit, start, end) fallback (used in airflow-core/src/airflow/models/taskinstance.py and dagrun.py for non-PG/non-SQLite dialects) compiles to a PostgreSQL-style EXTRACT(EPOCH FROM …) expression: cockroachdb/sqlalchemy-cockroachdb#301. With that in place, Airflow needs no code change for the timestampdiff issue once the dialect ships 2.0.4+.
So the items below are the remaining ones that genuinely require small upstream changes.
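For reviewers unfamiliar with the timestampdiff rewrite mentioned above: the real hook lives in sqlalchemy-cockroachdb and uses SQLAlchemy's @compiles extension; the stand-in below is purely illustrative (the function name is hypothetical) and only shows the SQL shape the hook emits for the SECOND unit.

```python
def compile_timestampdiff_pg(unit: str, start: str, end: str) -> str:
    """Illustrative stand-in for the dialect-side compiler hook.

    MySQL-style input:        TIMESTAMPDIFF(SECOND, start, end)
    PG-style equivalent out:  EXTRACT(EPOCH FROM end - start)
    """
    if unit.upper() != "SECOND":
        raise NotImplementedError(unit)
    return f"EXTRACT(EPOCH FROM {end} - {start})"
```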
Proposed PR 1 — Generic 40001 serialization-failure retry in the scheduler
Files: airflow-core/src/airflow/utils/retries.py (decorator fix) and airflow-core/src/airflow/jobs/scheduler_job_runner.py (apply to _critical_section_enqueue_task_instances).
Why this is a generic improvement, not CRDB-specific: PostgreSQL also raises SerializationFailure (SQLSTATE 40001) under high concurrency on SELECT … FOR NO KEY UPDATE … SKIP LOCKED. Today Airflow's scheduler does not retry these — it either crashes the process or enters an error loop. #40882 (closed) noted that the existing retry_db_transaction decorator only issues session.rollback() for OperationalError, not all DBAPIError subclasses, leaving subsequent retries to fail with InFailedSqlTransaction.
Proposed change:
Fix retry_db_transaction to roll back on any DBAPIError, not just OperationalError.
Apply it (or an equivalent retry loop) to _critical_section_enqueue_task_instances.
Use exponential backoff and bound the number of retries.
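A minimal sketch of the combined fix, using stand-in classes so the behavior is testable without a database. In Airflow the real exception types are sqlalchemy.exc.DBAPIError/OperationalError and the session is an ORM session; the names, signatures, and defaults here are illustrative, not the actual patch.

```python
import functools
import time


class DBAPIError(Exception):
    """Stand-in for sqlalchemy.exc.DBAPIError, carrying the SQLSTATE."""

    def __init__(self, sqlstate: str):
        super().__init__(sqlstate)
        self.sqlstate = sqlstate


def retry_db_transaction(retries: int = 3, base_delay: float = 0.0):
    """Retry the wrapped callable on SQLSTATE 40001 (serialization failure).

    Rolls back the session on *any* DBAPIError, not only OperationalError,
    so the next attempt does not fail with InFailedSqlTransaction.
    Backoff is exponential and the number of retries is bounded.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(session, *args, **kwargs):
            for attempt in range(1, retries + 1):
                try:
                    return func(session, *args, **kwargs)
                except DBAPIError as exc:
                    # The key part of the fix: always roll back first.
                    session.rollback()
                    if exc.sqlstate != "40001" or attempt == retries:
                        raise
                    time.sleep(base_delay * 2 ** (attempt - 1))

        return wrapper

    return decorator
```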
Risk: Medium. Touches the scheduler hot path. Change should be guarded by tests that fault-inject 40001 and assert progress.
Benefit: Improves PG resilience under concurrent scheduler workloads. Side benefit: makes the scheduler usable on any PG-compatible DB whose isolation model surfaces 40001.
This is the largest of the three changes and the one I most want a steer on before drafting.
Proposed PR 2 — Add cockroachdb to the async-driver mapping
File: airflow-core/src/airflow/settings.py:240
def _get_async_conn_uri_from_sync(sync_uri):
    AIO_LIBS_MAPPING = {"sqlite": "aiosqlite", "postgresql": "asyncpg", "mysql": "aiomysql"}
    ...
Proposed change: Add "cockroachdb": "asyncpg" to the mapping.
Why standalone-justifiable: This function exists to derive an async URI when one isn't explicitly configured. Today, anyone using a cockroachdb:// SQLAlchemy URL (the canonical scheme registered by sqlalchemy-cockroachdb since 2017) hits "The asyncio extension requires an async driver. The loaded 'psycopg2' is not async." The workaround is setting AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_ASYNC explicitly. It's a one-line fix with no behavior change for postgres/mysql/sqlite users.
Risk: Trivial. Touches one dict literal.
Proposed PR 3 — Dialect-aware UUID generation in migration 0042
File: airflow-core/src/airflow/migrations/versions/0042_3_0_0_add_uuid_primary_key_to_task_instance_.py
The migration creates a custom uuid_generate_v7() SQL function via CREATE EXTENSION pgcrypto. CockroachDB doesn't support the pgcrypto extension but offers a native gen_random_uuid().
Proposed change: Add a dialect-name check at the top of the upgrade so non-postgres PG-compatible dialects can supply their own UUID generator (defaulting to gen_random_uuid() for cockroachdb). PG users see no change. The fallback uses v4 UUIDs (not v7) on the alternate dialect, which is documented as a tradeoff.
Risk: Low. Migration runs once, dialect-name-guarded, no impact on existing PG installs.
The CRDB-specific compatibility work (compiler hooks, alembic impl, retry middleware) lives in sqlalchemy-cockroachdb. The three items above are the only places where small generic changes in Airflow itself would close the gap meaningfully.
Disclosure
I work at Cockroach Labs. The integration POC and sqlalchemy-cockroachdb#301 are linked above for full transparency. Happy to share any additional context. Thanks for reading.