apache/airflow

alint expresses Airflow's 101-provider-package layout invariants in 25 lines of YAML. Today the same invariants live in 1085 lines of Python that has to spin up a Docker container to run.

Narrative: Replaces N hand-rolled validation scripts
Rules: 75
Last revalidated:
Engineering reference: README on GitHub · .alint.yml

Why this case study matters

Apache Airflow is the canonical Python-ecosystem expression of the alint problem: a sprawling, hand-written shell-and-Python pipeline validating structural invariants that nobody can enumerate from a single file. The repository defines 109 pre-commit hooks across 14 repo blocks; about 80 of them are repo: local hooks that shell out to Python scripts under scripts/ci/prek/ (124 scripts in that directory, with more under breeze and in_container).

For every Python infrastructure team, and especially the data-engineering and Airflow user community, this is what their structural-validation surface looks like in the wild. alint replaces ~30 of those 109 hooks with one declarative file plus the bundled compliance/apache-2@v1 ruleset. The result is a single place to look for “what does Airflow consider a valid provider package” instead of chasing eight pre-commit IDs into eight Python files.

This is the strongest piece of evidence in the corpus for the Python-ecosystem launch positioning.

Headline catch

The most alint-shaped surface in the entire Airflow tree is providers/: 101 provider distributions, each obeying the exact pattern alint was built to express.

A for_each_file: providers/**/provider.yaml rule with a nested require: block of file_exists and dir_exists checks, plus a couple of yaml_path_matches rules, covers what Airflow enforces today with the 1085-line scripts/in_container/run_provider_yaml_files_check.py. That script has to spin up a Docker container, import every provider package via Python, and walk ProvidersManager. Its cold runtime is 30-60 seconds; alint’s equivalent runs in under 1 second on the same checkout.

That roughly 50× speedup on this subset (30-60 seconds down to under 1 second) is the headline performance number.
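
To make the shape of the provider check concrete, here is a minimal Python sketch of the filesystem-only logic that a for_each_file rule with nested file_exists and dir_exists requirements describes. It is not alint's implementation and not its configuration syntax. The function name check_providers and the required names (pyproject.toml, README.rst, src) are hypothetical placeholders; the actual requirements are in the project's .alint.yml. The yaml_path_matches part is left out because it would need a YAML parser.

```python
from pathlib import Path

# Hypothetical requirements, chosen only for illustration.
REQUIRED_FILES = ("pyproject.toml", "README.rst")
REQUIRED_DIRS = ("src",)


def check_providers(root):
    """Return 'dir: problem' strings for every provider directory under root/providers."""
    root = Path(root)
    problems = []
    # In spirit: for_each_file: providers/**/provider.yaml
    for manifest in sorted((root / "providers").rglob("provider.yaml")):
        provider_dir = manifest.parent
        rel = provider_dir.relative_to(root).as_posix()
        # In spirit: require -> file_exists
        for name in REQUIRED_FILES:
            if not (provider_dir / name).is_file():
                problems.append(f"{rel}: missing file {name}")
        # In spirit: require -> dir_exists
        for name in REQUIRED_DIRS:
            if not (provider_dir / name).is_dir():
                problems.append(f"{rel}: missing dir {name}")
    return problems


if __name__ == "__main__":
    # Run from the top of a checkout; prints one line per violation.
    for line in check_providers("."):
        print(line)
```

The sketch only walks the directory tree and tests paths. It never imports a provider package, which is why a check of this shape needs no container.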

Where alint earns its keep here

Future story angles

The factual engineering writeup (tooling inventory, mapping table, gap catalogue, validation status footer) lives in the public alint repo at github.com/asamarts/alint/tree/main/examples/apache-airflow/README.md.