feat: add blapi feature to bypass PKCS#11 in RecordProtection#52
feat: add blapi feature to bypass PKCS#11 in RecordProtection#52larseggert wants to merge 1 commit into
blapi feature to bypass PKCS#11 in RecordProtection#52Conversation
There was a problem hiding this comment.
Pull request overview
Adds an optional blapi feature that swaps RecordProtection’s AEAD hot path from PKCS#11 (PK11_AEADOp) to direct NSS freebl primitives (AES-GCM / ChaCha20-Poly1305), aiming to reduce session/mutex/hash-table overhead.
Changes:
- Introduces
src/freebl.rswith committed FFI bindings to freebl AEAD primitives and scoped pointer wrappers. - Adds a new
RecordProtectionimplementation under--features blapithat derives keys/IVs and calls freebl directly. - Adds a shared
xor_nonce()helper and wires in the new Cargo feature + build linking behavior.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/lib.rs | Conditionally exposes the new internal freebl module when blapi is enabled. |
| src/freebl.rs | Adds FFI declarations for freebl AEAD APIs + scoped destructor wrappers. |
| src/aead.rs | Adds a blapi-gated RecordProtection implementation and refactors nonce XOR logic. |
| Cargo.toml | Declares the new blapi feature flag. |
| build.rs | Adds linkage to a freebl static library intended to support the new freebl calls. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Summary
Clean, well-structured performance optimization that bypasses the PKCS#11 session layer for RecordProtection AEAD operations via direct freebl calls. The approach is sound — the blapi feature is opt-in, correctly gated with cfg, and the implementation handles all three TLS 1.3 cipher suites (AES-128-GCM, AES-256-GCM, ChaCha20-Poly1305).
The xor_nonce extraction is a nice deduplication that makes the nonce construction reusable across both the blapi and non-blapi paths.
Blocking
build.rs: unconditionalstatic=freebllinkage — Bothdynamic_link()andpkg_config()emit the freebl link directive regardless of whetherblapiis enabled. This will break all dynamic-link builds on systems that don't havelibfreebl.ainstalled (the common case with packaged NSS). Must be gated onCARGO_FEATURE_BLAPI.
Should address
- No test coverage for the
blapipath — The existing tests exerciseAead(PKCS#11 path) and thedisable-encryptionstub. There's no validation that the freebl bypass produces correct ciphertext or round-trips. A single cfg-gated encrypt/decrypt roundtrip test (ideally against a known vector) would catch FFI signature or parameter-ordering bugs — which are the highest-risk class of error in this change.
Minor
- Feature documentation for
blapiinCargo.toml. - Structural size assertion on
CK_GCM_MESSAGE_PARAMS. - Pointer derivation style consistency in
encrypt.
Thread safety: both the blapi and original RecordProtection are !Send + !Sync (via scoped_ptr!), which is the correct behavior for non-thread-safe freebl/softoken contexts.
There was a problem hiding this comment.
Summary
Sound approach — bypassing the PKCS#11 session layer for a well-scoped hot path (AEAD encrypt/decrypt) with an opt-in feature flag is clean and low-risk. The FFI signatures match NSS's blapi.h, the scoped_ptr! wrappers handle context lifetimes correctly, and the xor_nonce extraction is a nice DRY improvement shared by both paths.
All prior review feedback has been addressed in the latest commit: build.rs linkage is now feature-gated, Cargo.toml documents the FIPS implication, the CK_GCM_MESSAGE_PARAMS size assertion is in place, the encrypt pointer derivation uses the single-as_mut_ptr pattern, and CI runs cargo test --features blapi.
New observations (see inline comments)
pkg_config()+static=freebl: Thepkg_configcode path (system-packaged NSS) linksstatic=freebl, but distro packages typically don't shiplibfreebl.a. This will break--features blapifor consumers not building NSS from source.blapi+disable-encryption: Enabling both features silently dropsblapiin favour of the null AEAD. Acompile_error!would make this explicit rather than a silent no-op.- CI powerset scope:
cargo hack --feature-powersetruns on every matrix entry; gating on one toolchain would cut CI time without losing coverage.
Not flagged (verified correct)
- AES-GCM uses separate encrypt/decrypt contexts; ChaCha20 correctly shares one (stateless per-message nonce).
- Key derivation goes through
SSL_HkdfExpandLabelWithMech(same function the PKCS#11 path uses internally), so the derived key material is identical — only the AEAD context differs. - In-place encrypt/decrypt pointer arithmetic is within bounds (verified against the pre-checks).
- Thread safety: blapi
RecordProtectionis!Send + !Syncviascoped_ptr!, matching the non-blapi variant.
martinthomson
left a comment
There was a problem hiding this comment.
A few preliminary comments.
There was a problem hiding this comment.
Summary
Clean, well-scoped performance optimization. The approach is sound: an opt-in feature flag that replaces the PKCS#11 session layer (PK11_AEADOp → sftk_SessionFromHandle + mutex) with direct freebl calls for the three TLS 1.3 cipher suites. The FFI signatures match NSS's blapi.h, the scoped_ptr! wrappers handle context lifetimes correctly (and keep RecordProtection !Send + !Sync), and key derivation goes through the same SSL_HkdfExpandLabelWithMech as the PKCS#11 path, ensuring identical key material.
The xor_nonce extraction is a good DRY improvement shared across all paths.
Prior feedback addressed
Build.rs linkage is now feature-gated, Cargo.toml documents the FIPS implication, the CK_GCM_MESSAGE_PARAMS size assertion is in place, encrypt uses the single-as_mut_ptr derivation pattern, and CI runs cargo test --features blapi. The integration test in tests/aead.rs exercises RecordProtection with known TLS vectors (including expected ciphertext bytes), which validates correctness of the blapi path — a key concern from earlier reviews.
Remaining items (see inline comments)
blapi+disable-encryptioninteraction: silently dropsblapiwith no compile-time signal. Acompile_error!would make this explicit.pkg_config()+static=freebl: system-packaged NSS typically doesn't shiplibfreebl.a; this will break--features blapifor consumers not building NSS from source.- Code duplication: as @martinthomson noted, the blapi and non-blapi
RecordProtectionmodules share significant structure (bounds checks, nonce construction, method signatures). Refactoring to share the outer method logic — differing only in the crypto dispatch — would reduce the maintenance surface. Context specialization (single-direction AES context) would be a natural part of that refactoring. - CI powerset scope:
cargo hack --feature-powerseton a single toolchain would cut CI time without losing coverage.
74e15cf to
3a1fe94
Compare
There was a problem hiding this comment.
Summary
Sound, well-scoped optimization. Bypassing the PKCS#11 session layer (PK11_AEADOp → sftk_SessionFromHandle + mutex) for an opt-in hot path is a clean approach. The FFI signatures match NSS's blapi.h, unsafe blocks have correct bounds checks and documented safety contracts, and the scoped_ptr! wrappers keep RecordProtection correctly !Send + !Sync. The xor_nonce extraction is a nice DRY improvement shared across all paths.
Key derivation correctly goes through SSL_HkdfExpandLabelWithMech (same function the PKCS#11 path uses internally), so the derived key material is identical — only the AEAD dispatch differs. The compile_error! for blapi + disable-encryption, feature-gated linkage in build.rs (static loader for standalone, dylib=freebl3 for pkg-config), and cargo hack --mutually-exclusive-features are all well done.
Remaining items
-
Code sharing (see inline comments): @martinthomson's structural feedback about sharing code between the blapi and non-blapi modules and direction specialization is still open and would reduce the maintenance surface significantly. The two
RecordProtectionimplementations share bounds checks, nonce construction, method signatures, and error handling — differing only in the crypto dispatch. -
Test coverage gap (see inline comment on
tests/aead.rs): The integration tests only exercise AES-128-GCM. The ChaCha20-Poly1305 branch inaead_op(different FFI function, different parameter layout — notablytagOut/tagInas a separate argument vs. embedded inCK_GCM_MESSAGE_PARAMS) is completely untested through the blapi path. -
API contract change: The blapi
RecordProtectionis uni-directional (runtime error on direction mismatch), unlike the non-blapi variant. This is fine for neqo's usage but should be documented since the public type signature is unchanged.
Verified correct (not flagged)
- In-place encrypt/decrypt pointer arithmetic is within bounds (checked against the pre-conditions).
- Tag pointer derivation uses a single
as_mut_ptr()+ offset, avoiding aliasing issues. AES_CreateContextcopies the key material, so the temporarykey_data()buffer can safely drop.- The
CK_GCM_MESSAGE_PARAMSsize assertion correctly skips Windows (LLP64 mixed pointer/ulong widths);#[repr(C)]handles the layout on all platforms.
There was a problem hiding this comment.
Summary
Sound, well-scoped optimization. The approach — an opt-in feature flag that replaces the PKCS#11 session layer with direct freebl calls for the three TLS 1.3 AEAD cipher suites — is clean and correctly implemented. The FFI signatures match NSS's blapi.h, the unsafe blocks have accurate safety contracts and correct bounds checks, and the scoped_ptr! wrappers keep RecordProtection correctly !Send + !Sync.
Prior review feedback has been addressed well: build.rs linkage is feature-gated (with the correct static=freebl vs dylib=freebl3 split between standalone and pkg-config paths), compile_error! guards against blapi + disable-encryption, CI runs cargo test --features blapi, and the cargo hack --mutually-exclusive-features flag is in place. The known-ciphertext test in aead_encrypt_decrypt implicitly validates the blapi AES-128-GCM path when run under --features blapi, and the new roundtrip_aes256/roundtrip_chacha20 tests cover the remaining cipher suites.
Verified correct (not flagged)
- In-place encrypt/decrypt pointer arithmetic is within bounds (verified against the pre-condition checks). Tag pointer derivation uses the single
as_mut_ptr()+ offset pattern, avoiding aliasing issues. AES_CreateContextcopies the key material, so the temporarykey_data()buffer can safely drop without invalidating the context.- AES-GCM correctly uses separate encrypt/decrypt contexts; ChaCha20 correctly shares one (stateless per-message nonce).
- Key derivation goes through
SSL_HkdfExpandLabelWithMech(same as the PKCS#11 path), ensuring identical key material. - The
CK_GCM_MESSAGE_PARAMSsize assertion correctly skips Windows (LLP64 mixed pointer/ulong widths);#[repr(C)]handles the layout on all platforms. - The
xor_nonceextraction is a clean DRY improvement;Aead::make_nonce's refactored semantics are equivalent to the original in-place XOR.
Remaining design concerns
- Code sharing: @martinthomson's feedback on reducing duplication between the blapi and non-blapi
RecordProtectionmodules is the main open structural concern. See inline comment for a concrete suggestion. - Direction specialization: @martinthomson's suggestion to avoid constructing the unused AES context (since callers are exclusively encrypt or decrypt) would reduce memory and construction cost.
- Key material lifecycle: See inline comment on
src/aead.rs:474— the blapi path exposes raw key bytes in Rust heap memory that isn't zeroed on drop, unlike the PKCS#11 path.
41effad to
167920c
Compare
Calls freebl AES-GCM and ChaCha20-Poly1305 primitives directly instead of going through `PK11_AEADOp` → softoken, eliminating the `sftk_SessionFromHandle` mutex and hash-table overhead (~7.6% CPU on the hot path). `RecordProtection::new` takes a `Mode` so callers that know their direction (send vs. receive) create only the context they need. NOTE: bypasses softoken's FIPS self-test gate. Intentional for neqo (non-FIPS).
Add
--features blapiwhich replacesRecordProtection's hot path fromPK11_AEADOp(PKCS#11 → softoken → freebl) to direct freebl calls (AES_AEAD,ChaCha20Poly1305_Encrypt/Decrypt), eliminating the substantialsftk_SessionFromHandlemutex and hash-table overheads.Benchmark: mozilla/neqo#3601 (comment)