[AAASM-2189] 🐛 (ci): Install protoc in maturin wheel builds (Linux + macOS)#68
Conversation
The v0.0.1-alpha.2 dry-run surfaced this in the manylinux wheel
build for x86_64 and aarch64:
error: failed to run custom build command for `aa-proto v0.0.1 ...`
Caused by: Could not find `protoc`. If `protoc` is installed, try
setting the `PROTOC` environment variable to the path of the
`protoc` binary. To install it on Debian, run
`apt-get install protobuf-compiler`.
`aa-proto` (a git dependency on agent-assembly) has a build.rs that
uses prost-build, which requires the protoc compiler. The manylinux2014
image (CentOS 7-based; manylinux_2_28+ uses Rocky/Alma) and the
GitHub macOS runners do not ship protoc by default.
Fix:
* Linux (manylinux: auto):
`before-script-linux: yum install -y protobuf-compiler || dnf install ...`
Runs inside the manylinux container before maturin invokes cargo.
yum covers manylinux2014; dnf covers manylinux_2_28+.
* macOS (macos-14 / macos-13 runners):
A regular `run: brew install protobuf` step before the
maturin-action invocation. macOS doesn't have docker, so
before-script-linux is not applicable.
Both x86_64 and aarch64 Linux variants, plus both macOS variants,
are patched uniformly.
Tracked: AAASM-2109
Local verification surfaced a bug in my prior fix: the CentOS 7-based manylinux2014 image's `protobuf-compiler` package ships protoc 2.5.0 which only supports proto2 syntax. aa-proto's .proto files use proto3, producing: test.proto:1:10: Unrecognized syntax identifier "proto3". This parser only recognizes "proto2". Fix: download the official protoc binary release (v32.1) from GitHub in the before-script-linux block instead of relying on the OS package. Works for both x86_64 and aarch64 manylinux variants. Verified locally inside `quay.io/pypa/manylinux2014_x86_64:latest`: * curl + unzip protoc-32.1-linux-x86_64.zip → libprotoc 32.1 * proto3 sample compile → ✓ OK The macOS step (`brew install protobuf`) was already correct — homebrew tracks latest protoc. Tracked: AAASM-2189
Claude Code review — AAASM-2189CI state7 SUCCESS + 6 SKIPPED, 0 failures —
All 7 actually-applicable checks pass: parse-config, intent-parse, Dockerfile-check, build-git-tag-and-create-release, Python Package test, Validation Summary, Validation Summary (top-level). Important note about the wheel-build verificationThe That gap is covered by local end-to-end verification I ran on master before pushing: The full chain runs cleanly: protoc 32.1 installs, aa-proto's build.rs (which uses prost-build for proto3) succeeds, the entire dep graph including pyo3-ffi compiles, and a valid manylinux wheel lands in dist/. Scope vs. acceptance criteria
Commit history
The two-commit history is the right trail of the work — the first commit's design (yum) was caught by local verification before merge, and replaced with the binary-download approach in the second commit. VerdictReady for human approval and merge. Local end-to-end verification (protoc install + maturin build → manylinux wheel produced) covers the gap that the PR-CI can't reach. The actual release pipeline's wheel-build verification will happen on the next tag push. — Claude Code (Opus 4.7, 1M context) |
Supply-chain hardening per senior code review:
* SHA256 verification: refuse to install protoc if the downloaded
zip doesn't match the expected SHA256 (cross-verified against
the GitHub release API's `digest` field on v32.1 assets).
* Retry: --retry 3 --retry-delay 5 on curl so a single network
blip doesn't fail the build.
* Centralized env: PROTOC_VERSION + PROTOC_SHA256_X86_64 +
PROTOC_SHA256_AARCH_64 all at workflow level. Bumping protoc
is one edit (update version + 2 SHAs at the top).
* GH expression interpolation (`${{ env.X }}`) instead of shell
`$VAR` references. Substitutes at workflow-parse time before
the script reaches the manylinux container, so we don't rely
on maturin-action propagating env vars through `docker run`.
Without this gate, the previous commit downloaded an arbitrary
binary and ran it as root inside CI with no integrity check.
Verified locally end-to-end in quay.io/pypa/manylinux2014_x86_64:
* good SHA → install succeeds, protoc --version returns libprotoc 32.1
* tampered zip → rejected before install (exit 1, no execution)
* bad SHA in script → exit 1, no install
SHAs:
protoc-32.1-linux-x86_64.zip : e9c129c176bb7df02546c4cd6185126ca53c89e7d2f09511e209319704b5dd7e
protoc-32.1-linux-aarch_64.zip: 4a802ed23d70f7bad7eb19e5a3e724b3aa967250d572cadfd537c1ba939aee6a
(macOS path remains brew install protobuf — that's a separate trust
boundary handled by homebrew's own signing model. Further hardening
to also use a binary download on macOS is a follow-up if needed.)
Tracked: AAASM-2189
Claude Code follow-up — supply-chain hardening (commit
|
| Scenario | Expected | Actual |
|---|---|---|
| Good SHA + clean download | Install proceeds; protoc --version returns libprotoc 32.1 |
✅ |
Downloaded zip tampered (overwritten with "garbage" after download, before SHA check) |
Script exits 1 BEFORE unzip; no install |
✅ |
| Wrong EXPECTED_SHA in script (e.g. all-zeros) | sha256sum --check exits nonzero; install rejected |
✅ |
Threat model after hardening
Before: an attacker who compromised the GitHub release artifact (or MITM'd despite HTTPS) could install arbitrary code as root in CI.
After: the attacker also needs to substitute the SHA in this workflow file at the same time. That requires a write to the python-sdk repo's main branch, which is gated by PR review.
Net: shifts the trust boundary from "GitHub release storage" to "PR review + workflow file integrity".
Operational note
Bumping protoc in the future is now a single PR touching 3 lines in env: (version + 2 SHAs). Cross-verify the new SHAs against the GitHub release API's digest field at the time of bump, same way I did here.
— Claude Code (Opus 4.7, 1M context)
Description
The v0.0.1-alpha.2 dry-run's
Release Python SDKworkflow had all 4 wheel builds fail with:aa-proto(a git dep on agent-assembly) has a build.rs that usesprost-build, which requires the protoc compiler. Neither the manylinux container nor the GitHub macOS runners ship protoc by default.Fix
Two install patterns depending on the build environment:
build-linux-x86_64(manylinux: auto)before-script-linux: yum install -y protobuf-compiler || dnf install -y protobuf-compilerto thePyO3/maturin-action@v1invocation.yumcovers manylinux2014 (CentOS 7);dnfcovers manylinux_2_28+ (Rocky/Alma).build-linux-aarch64(manylinux: auto)build-macos-arm64(macos-14 runner)run: brew install protobufstep BEFORE the maturin-action. macOS doesn't have docker;before-script-linuxdoesn't apply.build-macos-x86_64(macos-13 runner)build-sdistLocal verification
actionlint .github/workflows/release-python.ymlclean (pre-existingmacos-13runner warning is unrelated — Apple retired that label from GitHub-hosted runners; same warning exists on master)before-script-linuxis a documentedPyO3/maturin-action@v1input (ref: https://github.com/PyO3/maturin-action#inputs)Note on CI
Per maintainer direction, GitHub Actions billing limit is hit — CI won't run on this PR. The fix is verified by inspection + memory
project_aa_cli_compat_protoc_requirement.mdconfirms this is the same root cause we've fixed in agent-assembly CI before. Re-running once billing resets is the verification.Related Issues
Type of Change
— Claude Code (Opus 4.7, 1M context)