Why this case study matters
protocolbuffers/protobuf is the densest polyglot binding repo on
the OSS evidence list — a single tree shipping the protoc compiler
(C++) plus runtime + codegen for ~10 in-tree language bindings,
glued together by conformance/ (the cross-language wire-format
test suite that EVERY binding must pass) and the canonical version
manifests version.json + protobuf_version.bzl (lock-stepping
every binding’s released version).
Broader appeal extends to every multi-language binding repo — gRPC, Thrift, Avro, Cap’n Proto, Substrait, Apache Iceberg, Apache Beam SDKs — anywhere a single tree fans out a wire-format spec across many language runtimes.
Headline catch
protobuf is the densest single-repo source for the v0.11+
cross_language_implementation_complete ship-target. Quantitatively:
- 10 in-tree language bindings (C++ runtime + protoc compiler,
Java with 3 Maven sub-modules, Python, Ruby, Objective-C, C#,
PHP with PHP-C extension, Rust, plus C/C++ sister bindings hpb
and upb) + 1 spun-out (Dart, in
dart-lang/protobuf.dart but still covered by the conformance suite via failure_list_dart_upb.txt).
- ~45 cross-language assertions in one rule block: 10 bindings × 4-5 parity surfaces each (per-binding wire-format runner + failure-allowlist + text-format failure-allowlist + CI workflow + version pin in 2 manifests).
- Every binding has at least 3 parity surfaces; the canonical six (cpp, java, python, ruby, php, rust) have all five.
- 5 saturated demand sources for the candidate (apache/arrow + tensorflow/tensorflow + protobuf + angular + flutter), making the v0.11 design phase ship-ready.
The 19 failure_list_<lang>.txt files also push ordered_block
forward: protobuf is the 7th source for v0.10 (rust + airflow
+ tokio + cpython + arrow + golang/go + protobuf failure_lists),
tied with registry_paths_resolve at the top of the v0.10 backlog.
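What an ordered_block-style check over a single failure list amounts to can be sketched in Python. This is a minimal illustration, not alint's implementation: the function name unsorted_entries is hypothetical, and the sketch assumes one test name per line, with # starting a comment and blank lines ignored.

```python
from typing import List, Tuple


def unsorted_entries(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, entry) for each entry that sorts before its predecessor.

    Assumed file shape (not confirmed by the source): one test name per
    line, '#' starts a comment, blank lines carry no meaning.
    """
    violations: List[Tuple[int, str]] = []
    previous = None
    for number, raw in enumerate(text.splitlines(), start=1):
        # Drop any trailing comment, then surrounding whitespace.
        entry = raw.split("#", 1)[0].strip()
        if not entry:
            continue  # blank or comment-only line
        if previous is not None and entry < previous:
            violations.append((number, entry))
        previous = entry
    return violations


sample = "B.Test\nA.Test\n# a comment\nC.Test\n"
print(unsorted_entries(sample))  # [(2, 'A.Test')]
```

The comparison here is plain byte-wise string ordering; a real rule might need a different collation, which is a design choice the source does not specify.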
Where alint earns its keep here
- The polyglot parity narrative, end-to-end. Per-language linters (clang-format, gofmt, rubocop, flake8, buildifier) only see their own binding subtree. The conformance discipline + version-pin drift across 10 bindings is invisible to all of them at once. alint surfaces it in 108 rules across one declarative config — every binding’s failure-list, version-manifest entry, and CI workflow asserted in ~5 lines of YAML each.
- Differentiator vs. apache/arrow: arrow has a single-direction parity shape (every per-language implementation conforms to the format/schema spec). protobuf is denser: per-binding conformance runner + per-binding failure_list + per-binding text-format failure_list + per-binding test workflow + per-binding version pin in 2 manifests. Arrow stress-tests one parity surface; protobuf stress-tests five parity surfaces simultaneously.
- The Bazel angle: protobuf is a Bazel-built repo with both MODULE.bazel + WORKSPACE at root. alint complements buildifier at the file-structure layer (137 BUILD.bazel + 117 *.bzl files); same “alint owns file-shape, buildifier owns Starlark AST” division of labour as bazelbuild/bazel.
- Conformance test runners + failure-lists are exactly the shape per-language linters can’t see; one missing failure_list_<lang>.txt silently drops that binding’s coverage from the cross-language suite.
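The missing-file case in the last bullet can be sketched as a plain existence check across bindings. This is an illustrative Python sketch, not alint's rule engine: the function missing_surfaces, the SURFACES table, and the exact file-name patterns (in particular the text-format one) are assumptions made for the example.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical surface-name -> path-pattern table. The directory and the
# failure_list_<lang>.txt shape come from the text above; the text-format
# pattern is an assumption for illustration.
SURFACES: Dict[str, str] = {
    "failure list": "conformance/failure_list_{lang}.txt",
    "text-format failure list": "conformance/text_format_failure_list_{lang}.txt",
}


def missing_surfaces(root: Path, bindings: List[str]) -> Dict[str, List[str]]:
    """Map each binding to the parity surfaces whose file is absent under root."""
    missing: Dict[str, List[str]] = {}
    for lang in bindings:
        gaps = [
            name
            for name, pattern in SURFACES.items()
            if not (root / pattern.format(lang=lang)).is_file()
        ]
        if gaps:
            missing[lang] = gaps
    return missing
```

Called as missing_surfaces(Path("."), ["cpp", "java", "ruby"]) from a checkout root, it returns an empty dict when every listed binding has both files. A per-language linter has no equivalent view, because the gap is the absence of a file outside that language's subtree.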
Future story angles
- v0.11 design-phase launch post — pair protobuf (densest source) with apache/arrow + TF + angular + flutter for the 5-saturated demand-driver narrative.
- Bazel-native repo angle — pair with bazelbuild/bazel itself for the “alint owns file-shape, buildifier owns Starlark AST” story.
- gRPC adoption pitch: gRPC is a direct sibling repo (protobuf’s Google-internal partner) and shares the polyglot binding shape. Likely the most natural follow-on case study.
- ordered_block v0.10 launch post: protobuf’s 27 unsorted failure_list + text_format_failure_list files are the canonical demand-driver alongside cpython’s Modules/Setup.