Release-signing toolchain: HSM-backed OpenPGP workflow #240
Open
quarckster wants to merge 17 commits into
Operator-facing helper for the release-artifact OpenPGP signing policy
on top of sq-pkcs11 (which speaks PKCS#11 to the nShield HSM).
Capabilities:
- generate the policy primary key (RSA-4096, OCS-protected) and
current signing subkey (RSA-4096, module-protected) via nShield
generatekey
- issue / rotate the published OpenPGP certificate (5-year primary,
1-year subkey, with --merge-cert for subkey rotation that
preserves predecessor subkeys)
- sign release artifacts with the current signing subkey
- issue primary-key and subkey-revocation certificates
Each command validates only the env vars it actually consumes (e.g.
subkey-rotate doesn't demand OPENSSL_PGP_CURRENT_SUBKEY_LABEL), and
every label / cardset value is checked against the Security World via
cklist / nfkminfo before any HSM action runs, so a typo cannot leave a
half-finished state on the HSM.
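The per-command validation described above can be sketched roughly as follows. This is a minimal illustration, not the wrapper's actual code: the function names and the command-to-variable table are assumptions (in particular, that subkey-rotate consumes OPENSSL_PGP_CERT); only the two variable names come from this PR.

```sh
# Hypothetical table: which env vars each subcommand consumes.
required_vars_for() {
    case "$1" in
        sign)          echo "OPENSSL_PGP_CERT OPENSSL_PGP_CURRENT_SUBKEY_LABEL" ;;
        subkey-rotate) echo "OPENSSL_PGP_CERT" ;;   # assumption for illustration
        *)             return 1 ;;
    esac
}

# Succeed only if every variable the given command consumes is set and
# non-empty; otherwise name the missing ones on stderr.
check_env() {
    vars=$(required_vars_for "$1") || {
        echo "unknown command: $1" >&2
        return 2
    }
    missing=""
    for name in $vars; do
        # Names come from the fixed table above, so eval is safe here.
        eval "value=\${$name:-}"
        [ -n "$value" ] || missing="$missing $name"
    done
    if [ -n "$missing" ]; then
        echo "missing env vars for $1:$missing" >&2
        return 1
    fi
    return 0
}
```

With this shape, `check_env subkey-rotate` passes even when OPENSSL_PGP_CURRENT_SUBKEY_LABEL is unset, while `check_env sign` refuses.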
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A gpg-CLI-compatible signing shim that translates git's gpg.program invocation shape (gpg --status-fd=N -bsau <keyid> < tagbody) into a sq-pkcs11 sign call against an HSM-resident key identified by CKA_LABEL.

The shim drains stdin to a temp file (sq-pkcs11 takes a path, not stdin), runs sq-pkcs11 sign --output - so the armored signature streams straight back on stdout where git expects it, and emits a SIG_CREATED status line for older git versions that parse GnuPG's status protocol.

Anything that isn't a sign operation (verify, decrypt, list-keys, ...) is forwarded to a real gpg via OPENSSL_PGP_FALLBACK_GPG (default "gpg"), so `git tag -v` and similar continue to work on the operator's machine.

Used by stage-release.sh via --gpg-program; can also be configured directly with `git config gpg.program <path>` for non-release tag signing scenarios.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
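The argument handling the shim needs can be sketched as below. This is an illustrative fragment under stated assumptions, not the shim itself: the function names are hypothetical, and the actual sq-pkcs11 call is left out because its full argument list is not shown in this PR. Only the `--status-fd=N -bsau <keyid>` shape and the stdin-to-temp-file step come from the commit message.

```sh
# Print "sign <keyid> <status-fd>" for git's signing shape, else "fallback".
classify_gpg_args() {
    status_fd=none
    keyid=""
    while [ $# -gt 0 ]; do
        case "$1" in
            --status-fd=*)
                status_fd=${1#--status-fd=}
                ;;
            -bsau)
                # The key identifier is the argument following -bsau.
                if [ $# -gt 1 ]; then
                    keyid=$2
                    shift
                fi
                ;;
        esac
        shift
    done
    if [ -n "$keyid" ]; then
        echo "sign $keyid $status_fd"
    else
        echo "fallback"
    fi
}

# Copy stdin to a fresh temp file and print its path, since the signer
# is described as taking a path rather than reading stdin.
drain_stdin() {
    tmp=$(mktemp) || return 1
    cat > "$tmp" || return 1
    printf '%s\n' "$tmp"
}
```

A caller would branch on the first word of the classification: sign through the HSM on "sign", or exec the fallback gpg with the untouched argument list otherwise.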
Add --gpg-program=<path> to redirect git tag signing through a custom program by invoking `git -c gpg.program=<path> tag -s ...`. Direct gpg invocations elsewhere in this script (tarball detached signature, announcement clearsign) are deliberately untouched; combine with --unsigned if those should be skipped and signed out-of-band.

Primary motivation is to route tag signing through release-tools/sq-pkcs11-git-shim so the release tag can be signed by an HSM-resident private key without involving the local gpg keyring. The option is general (it is just a `gpg.program` override), so any gpg-CLI-compatible signer can be plugged in.

When --gpg-program is used together with --local-user, the keyid form expected by --local-user follows whatever the configured program expects (key id / fingerprint for gpg, CKA_LABEL for the sq-pkcs11 shim). This is documented in the help text and the manual.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sq-pkcs11's subkey-revoke API changed: instead of identifying the subkey by CKA_LABEL (which required HSM access to the subkey's private key, and was therefore broken for the compromise scenario), the new shape takes --input-cert + --subkey-fingerprint and only exercises the primary's private key. The wrapper follows.

Operational change: callers of `openssl-pgp subkey-revoke` now pass the fingerprint of the subkey they want revoked, looked up from the published cert via `sq inspect release.asc` or `gpg --list-keys --with-subkey-fingerprint`. The lost / compromised subkey path no longer requires the HSM to still hold its private key.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before this change, openssl-pgp sign would happily use whatever HSM key OPENSSL_PGP_CURRENT_SUBKEY_LABEL pointed at, as long as it was RSA-4096 and present on the HSM. A typo'd or stale label could therefore produce a "valid"-looking release signature that fails verification against the published cert, which is exactly the kind of error a release-engineering tool should refuse to make.

Add a pre-flight call to `sq-pkcs11 verify-signing-key`, before any real signing operation, that confirms the configured HSM key is a current valid signer of $OPENSSL_PGP_CERT (alive, not revoked, signing-flagged binding under Sequoia's StandardPolicy). On mismatch, openssl-pgp refuses with a message naming the offending env var; no HSM signing operation is consumed.

require_cert_exists $OPENSSL_PGP_CERT is also added: sign now needs a published cert on disk to validate against (it was previously unused on the sign path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A small wrapper invoked as `openssl-pgp-ceremony-run <state-dir> <cmd> [args...]` that runs an OpenPGP ceremony command, tees its combined stdout/stderr to <state-dir>/ceremony.log, and persists the final exit status to <state-dir>/rc.

The state directory acts as a handshake point between the orchestrator that starts the ceremony and the operator who attaches to type passphrases: the orchestrator polls for `rc` to know when the command has finished and what status it returned, while the log file gives post-hoc audit material.

This commit introduces the bare runner shape only; the tmux-based session sharing for attended OCS passphrase entry comes in the next commit so the skeleton is reviewable on its own.
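The runner shape described above (tee combined output to a log, persist the exit status to rc) can be sketched in POSIX sh as follows. This is one possible implementation under assumptions, not the script from this PR: the function name `run_ceremony` and the intermediate `rc.tmp` file are hypothetical, and the atomic rename is a design choice added here so a polling orchestrator never sees a partially written rc.

```sh
# run_ceremony <state-dir> <cmd> [args...]
run_ceremony() {
    state_dir=$1
    shift
    mkdir -p "$state_dir" || return 1
    rm -f "$state_dir/rc" "$state_dir/rc.tmp"
    # Left side: run the command with stderr folded into stdout, then
    # record its exit status in a scratch file. Right side: tee copies
    # the combined output to the log while still showing it live.
    {
        status=0
        "$@" 2>&1 || status=$?
        echo "$status" > "$state_dir/rc.tmp"
    } | tee -a "$state_dir/ceremony.log"
    # Publish rc only after tee has finished writing the log.
    mv "$state_dir/rc.tmp" "$state_dir/rc" || return 1
    return "$(cat "$state_dir/rc")"
}
```

Recording the status inside the left-hand group avoids relying on bash's PIPESTATUS, since a plain pipeline would otherwise report tee's status rather than the command's.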
…stration

Restructure the runner into two subcommands:

- start-and-wait: invoked by an automation server (Jenkins agent). Creates a detached tmux session on the HSM client via a named socket, runs the requested OpenPGP command inside that session, prints attach instructions, and waits for the command's exit status (polls the state directory's rc file with a configurable --timeout).
- exec: the tmux-side runner that actually executes the command, records output to ceremony.log, and writes the exit status to rc. Equivalent to the original single-mode behaviour and used as the tmux session command by start-and-wait.

This split keeps the automation server as the orchestrator and audit collector only: it sees the command, logs, outputs, and final exit status, but OCS passphrases are typed into the shared tmux terminal by custodians (over SSH + nShield Remote Administration / TVD), never passed through CI parameters, credentials, or a web UI.

The --allow-user flag uses `tmux server-access -a/-w` to grant named users read/write access to the session via the shared socket. Socket permissions are set to 0660 with the group inherited from the socket directory, so attaching from a fixed Unix group is sufficient and no chmod 0666 / world-readable sockets are needed.
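The polling half of start-and-wait can be sketched as below. This is a minimal illustration under assumptions: the function name `wait_for_rc`, the one-second poll interval, and the 124 return value on timeout (borrowed from the convention of timeout(1)) are not taken from this PR.

```sh
# wait_for_rc <state-dir> <timeout-seconds>
# Prints the recorded exit status once <state-dir>/rc appears.
# Returns 124 if the file has not appeared within the timeout.
wait_for_rc() {
    state_dir=$1
    remaining=$2
    while [ ! -f "$state_dir/rc" ]; do
        [ "$remaining" -gt 0 ] || return 124
        sleep 1
        remaining=$((remaining - 1))
    done
    cat "$state_dir/rc"
}
```

An orchestrator could then propagate the ceremony's status with something like `exit "$(wait_for_rc "$state_dir" 3600)"`, treating a non-zero return from the function itself as a timeout.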
Switch generatekey_pkcs11 to invoke `generatekey --generate --batch
pkcs11 ...` so the operator is not prompted to confirm each parameter
on stdin before the OCS prompts appear. Two scenarios where this
matters:
- Automated cert-init runs from Jenkins, where there is no
interactive operator at the generatekey step; the wrapper has
already validated every parameter (labels, cardset existence,
key-size, logkeyusage policy) before calling generatekey.
- Tmux-shared ceremonies via openssl-pgp-ceremony-run, where the
expected operator interaction is OCS card presentation and
passphrase entry — not parameter confirmation. Without --batch
the custodians would have to press Enter through generatekey's
parameter-review prompt before the card prompts appeared,
cluttering the shared terminal with non-decision input.
The arguments fed into generatekey come exclusively from the wrapper
(constructed from validated env vars and CLI flags), so the
interactive confirmation step --batch suppresses is redundant in
this context. The note() string is updated to reflect the new flag
for parity with what the operator will see in the ceremony log.
A helper invoked as `openssl-pgp-revocation-recipients bundle --source
<dir> --bundle <out> --manifest <out> --bundle-sha256 <out>` that
collects the trusted set of public OpenPGP certificates authorised to
decrypt the offline primary-key revocation certificate produced by
cert-init.
The source directory contains:
- recipients.txt — authoritative manifest, one
"<40-hex fingerprint> <email>" line per recipient
- <fingerprint>.pgp — one file per manifest entry, the recipient's
public certificate
For each manifest entry the helper:
- confirms the cert file exists and is non-empty
- runs `sq inspect` and asserts both the declared fingerprint and
User ID email are present in the cert (catches stale fingerprints
or accidentally-swapped files)
- test-encrypts a probe message via `sq encrypt --for-file` to
confirm the cert is currently usable for encryption under sq's
policy (catches expired keys, policy-rejected ciphersuites, etc.)
- appends the cert to the output bundle and the (fingerprint, email)
pair to the output manifest, finally writing the bundle's SHA-256
The cert-init Jenkins pipeline runs this helper before the ceremony
so the encrypted offline-revocation artifact can only be produced if
the recipient set is fully valid — no half-encrypted artifacts
slipping through if one recipient's key has expired or been
withdrawn since the last run.
Three initial recipients are checked in under
release-tools/openpgp/revocation-recipients/ with a README documenting
the manifest format and the validation contract above.
…rant

`tmux server-access -a <user>` fails when invoked for the user who owns the running tmux server: the server starts with that user implicitly in the allow-list, and tmux refuses to add a duplicate entry. In the ceremony runner, this manifested as a hard failure of the access-grant loop the moment --allow-user named the CI agent's own account.

Detect the case (compare each --allow-user value to `id -un`) and skip the redundant grant. The loop then continues to grant access to the remaining custodians, leaving the session alive. Without this fix, an operator list that included the server-owning account would kill the session and bail on what is structurally a no-op.
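The skip logic can be sketched as a small filter. The function name is hypothetical, and the sketch only prints the users that still need a grant instead of calling tmux, so it runs without a tmux server.

```sh
# Print, one per line, the --allow-user names that still need an
# explicit grant, dropping the account that owns the tmux server
# (assumed here to be the account running this script).
users_needing_grant() {
    owner=$(id -un)
    for user in "$@"; do
        if [ "$user" = "$owner" ]; then
            continue   # already on the allow-list implicitly
        fi
        printf '%s\n' "$user"
    done
}
```

The runner would then loop over this output and issue one `tmux server-access -a` call per remaining name.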
The "Signing the release files" step shelled out to gpg, which fails on Jenkins agents that do not have the release private key in a local gpg keyring -- the key lives on the HSM. Replace the two gpg invocations (tarball detached signature, announcement clearsign) with calls to the openssl-pgp wrapper, which signs via sq-pkcs11 against the HSM key.

Also default --gpg-program to release-tools/sq-pkcs11-git-shim when not explicitly set, so git tag signing goes through the HSM by the same mechanism; the flag is still overridable (e.g. --gpg-program=gpg) for operators using a local gpg keyring. --local-user now also exports OPENSSL_PGP_CURRENT_SUBKEY_LABEL so the openssl-pgp subprocesses pick up the same key without a second env-var hop.

openssl-pgp / sq-pkcs11 produce detached, ASCII-armored signatures only, so the announcement is no longer cleartext-signed. Both the plain .txt and the detached .txt.asc are uploaded; the FILES section of the manual is updated to reflect this.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The script grew options over the years that the Jenkins release job
does not exercise. Remove them and the code paths they gated:
    --clean-worktree    The script now always operates on the current
                        worktree (refusing to run if dirty). Drops the
                        sibling-clone setup and the "cd $release_clone"
                        dance, plus the three `git push parent HEAD`
                        back-pushes to the original repo.
    --branch-fmt        Branch and tag names follow the standard scheme
    --tag-fmt           (%b / %t). The two formats were always identical
                        under --clean-worktree, so format_string and
                        release-aux/string-fn.sh are no longer needed.
    --staging-address   No more uploads from this script -- artifacts
    --no-upload         land in the parent directory and the caller is
                        responsible for shipping them. Drops the
                        staging_address parser and the upload-fn.sh
                        backends entirely.
    --no-update         `make update` and `make update-fips-checksums`
                        always run.
    --force             The branch-name check is now strict.
The --debug flag no longer implies --no-upload (no upload step
exists). The metadata file's `upload_files` key becomes
`release_files`; the now-impossible staging_update_branch /
staging_release_branch keys are removed. POD manual is trimmed to
match.
The cleanup leaves the script ~400 lines shorter and the supported
invocation matches what the Jenkins job actually uses:
    stage-release.sh \
        $ALPHA_FLAG $BETA_FLAG $FINAL_FLAG \
        --branch \
        --gpg-program="$GPG_PROGRAM" \
        --local-user="$SIGNING_KEY_LABEL" \
        --reviewer="$REVIEWER_1" --reviewer="$REVIEWER_2"
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stage-release.sh: stop producing openssl-<version>.txt and its detached signature. The announcement-template machinery (template selection, sed expansion, fix-title.pl, the second openssl-pgp sign call) is gone, along with the `length` checksum that only the announcement consumed. The release_files manifest, FILES section of the POD, and the leftover comments in --gpg-program / tagkey header follow suit.

do-copyright-year: the progress spinner used \r to overwrite a single status line, which works on a TTY but not in Jenkins -- every tick of the spinner showed up as its own log line, polluting the build output. Skip the spinner entirely when stdout is not a TTY, keeping just the "Updating copyright" / "Files considered: N" / "Files changed: N" lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
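The TTY guard for the spinner can be sketched in shell as below. The function name and the exact status text are assumptions for illustration; the only technique taken from the commit message is skipping the \r-based update when stdout is not a terminal.

```sh
# Emit an in-place progress update only when stdout is a terminal.
spinner_tick() {
    # [ -t 1 ] is true only when file descriptor 1 (stdout) is a TTY.
    if [ -t 1 ]; then
        printf '\rFiles considered: %s' "$1"
    fi
}
```

Under Jenkins, or whenever output is piped or captured, the function prints nothing, so only the final summary lines reach the log.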
Tarball signing was already inlined as $RELEASE_TOOLS/openssl-pgp, but tag signing was routed through a --gpg-program flag whose only sensible value was the in-repo sq-pkcs11-git-shim. Make the pair symmetric: inline sq-pkcs11-git-shim at the `git tag -s` call and remove --gpg-program entirely (from getopt, case dispatch, --help text, and the POD manual).

--unsigned still covers the skip-signing escape hatch; --local-user remains the single knob for the signing key. The Jenkins invocation can drop its --gpg-program="$GPG_PROGRAM" arg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eded
The Jenkins job passed --branch unconditionally; the existing
patch-number / same-branch checks already collapsed it to a no-op when
not applicable. Make do_branch=true the default and remove the flag.
  - on master at PATCH == 0:  release branch openssl-X.Y is created
                              (same as --branch was honored before)
  - on a release branch
    or PATCH != 0:            release commit lands on the current branch
                              (same as --branch was ignored before, just
                              without the "--branch ignored" warning)
This removes --branch from getopt/case dispatch/--help/POD, drops the
warn_branch bookkeeping that only fed that warning, removes the
"--final implies --branch" plumbing and the "--branch is invalid
unless current branch is master" guard (the do_branch=false branch
now handles non-master worktrees identically).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
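The resulting branch decision can be sketched as a pure function. The function name and argument order are hypothetical; only the two cases (master at PATCH == 0 versus everything else) and the openssl-X.Y naming come from the commit message above.

```sh
# release_branch_for <current-branch> <major> <minor> <patch>
# Prints the branch the release commit should land on.
release_branch_for() {
    current=$1
    major=$2
    minor=$3
    patch=$4
    if [ "$current" = master ] && [ "$patch" -eq 0 ]; then
        # New release line: a release branch is created from master.
        echo "openssl-$major.$minor"
    else
        # Release branch, or PATCH != 0: stay on the current branch.
        echo "$current"
    fi
}
```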
t-j-h
approved these changes
May 13, 2026
arapov
approved these changes
May 14, 2026
Summary

Wire the OpenSSL release-signing toolchain to a PKCS#11 HSM through sq-pkcs11, with a tmux-based ceremony runner for K/N OCS-quorum operations and pre-flight checks at every HSM interaction.

Content

`openssl-pgp`: release-signing policy wrapper over `sq-pkcs11`

The policy: RSA-4096 primary (Certify, OCS-protected, 5y) + RSA-4096 signing subkey (Sign, module-protected, 1y), `logkeyusage=yes` for Security World audit-log coverage, creation times derived from nShield `gentime`.

`openssl-pgp-ceremony-run`: tmux ceremony orchestration

A shared-socket tmux session runner so multiple operators can attach to the live cert-init ceremony for the OCS card prompts. Drives `openssl-pgp cert-init --generate-keys` from the Jenkins pipeline.

`openssl-pgp-revocation-recipients` + checked-in recipient keys

Bundles the trusted set of revocation-encryption recipients into one armored OpenPGP keyring (plus a manifest and SHA256) so cert-init can encrypt the offline primary-key revocation to a known set of public keys. Recipient PGP keys are committed under `release-tools/openpgp/revocation-recipients/`.

`sq-pkcs11-git-shim` + `stage-release.sh --gpg-program`

- Translates git's `gpg.program` invocation shape (`gpg --status-fd=N -bsau <keyid>`) into a `sq-pkcs11 sign` call against an HSM-resident key identified by `CKA_LABEL`. Emits `SIG_CREATED` for older git versions that parse GnuPG status output. Anything other than `--sign` is forwarded to a real gpg via `OPENSSL_PGP_FALLBACK_GPG`, so `git tag -v` and similar keep working locally.
- `stage-release.sh --gpg-program=<path>` redirects only git-tag signing through the supplied program (via `git -c gpg.program=...`). The direct `gpg` invocations elsewhere in `stage-release.sh` (tarball detached signatures, announcement clearsign) are deliberately untouched; combine with `--unsigned` to skip those and sign them out-of-band.