feat: use the nss-rs blapi feature#3601
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3601 +/- ##
==========================================
- Coverage 95.11% 94.96% -0.16%
==========================================
Files 111 116 +5
Lines 37999 38361 +362
Branches 37999 38361 +362
==========================================
+ Hits 36144 36430 +286
- Misses 1159 1225 +66
- Partials 696 706 +10
Flags with carried forward coverage won't be shown. Click here to find out more.
|
07117c8 to
133c523
Compare
martinthomson
left a comment
There was a problem hiding this comment.
I'm sad to say that this looks really, really good. About the same sort of performance lift as the aws-lc patch you put together; perhaps even better and more consistent. So I'm going to suggest that we continue to explore this one.
Merging this PR will improve performance by 4.34%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ⚡ | Simulation | coalesce_acked_from_zero 1000+1 entries |
2.8 µs | 2.7 µs | +4.34% |
Tip
Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.
Comparing larseggert:nss_rs-52 (97724f9) with main (7c2a6a3)
Footnotes
-
27 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
nss-rs blapi feature
| libc = { version = "0.2", default-features = false } | ||
| log = { version = "0.4", default-features = false } | ||
| nss = { rev = "0.9.0", package = "nss-rs", git = "https://github.com/mozilla/nss-rs" } | ||
| nss = { rev = "2f5bbfe4fe3bce69b9b73f6e3d9a115e30bfaeb2", package = "nss-rs", git = "https://github.com/mozilla/nss-rs" } |
There was a problem hiding this comment.
This needs to be updated to the nss-rs release version that ships the blapi feature.
There was a problem hiding this comment.
Pull request overview
This PR updates neqo to use the newer nss-rs APIs associated with the blapi feature, including an explicit Mode (encrypt/decrypt) when constructing RecordProtection, and wires blapi through crate features (default-enabled) to pick the faster backend.
Changes:
- Update QUIC crypto and Retry packet protection to pass
nss::Modeand (for Retry) maintain separate encrypt/decrypt AEAD instances. - Add
blapias a default-enabled feature inneqo-transport, and propagate it throughneqo-qpack/neqo-http3while disabling dependency default-features. - Bump the
nss-rsgit revision and adjust CI clippy feature-matrix constraints accordingly.
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| test-fixture/src/header_protection.rs | Update AEAD construction to new Mode-taking API (but currently hardcodes Encrypt). |
| neqo-transport/src/packet/retry.rs | Split Retry AEAD storage by encrypt/decrypt mode; update helper to accept Mode. |
| neqo-transport/src/packet/mod.rs | Pass Mode::{Encrypt,Decrypt} to Retry tag generation/validation. |
| neqo-transport/src/crypto.rs | Thread Mode into RecordProtection::new via CryptoDxDirection -> Mode mapping; minor test helpers. |
| neqo-transport/Cargo.toml | Introduce blapi feature and enable it by default. |
| neqo-qpack/Cargo.toml | Disable neqo-transport default-features; add blapi feature/default passthrough. |
| neqo-http3/Cargo.toml | Disable neqo-{qpack,transport} default-features; add blapi feature/default passthrough. |
| neqo-bin/src/client/mod.rs | Box-pin one Runner future to address large-future concerns. |
| neqo-bin/src/bin/client.rs | Remove Apple-only clippy suppression attribute. |
| Cargo.toml | Update nss-rs dependency to a specific git revision. |
| Cargo.lock | Lockfile update for the new nss-rs revision. |
| .github/workflows/clippy.yml | Exclude mutually-incompatible feature combinations involving blapi. |
Comments suppressed due to low confidence (3)
neqo-transport/Cargo.toml:49
- Setting
default = ["blapi"]makesblapienabled implicitly even when consumers enablebuild-fuzzing-corpus. Butbuild-fuzzing-corpusenablesnss/disable-encryption(and CI explicitly treatsblapias mutually exclusive with bothdisable-encryptionandbuild-fuzzing-corpus). This means commands likecargo test --features build-fuzzing-corpus(seetest/make-fuzz-corpus.sh) will now require--no-default-featuresto avoid a feature conflict. Consider not makingblapia default feature, or update the fuzzing-corpus build invocation/docs to disable default features when enablingbuild-fuzzing-corpus.
[features]
bench = ["neqo-common/bench", "nss/bench", "test-fixture/bench", "log/release_max_level_info"]
blapi = ["nss/blapi"]
default = ["blapi"]
build-fuzzing-corpus = [
"neqo-common/build-fuzzing-corpus",
"nss/disable-encryption",
"nss/disable-random",
"test-fixture/disable-random",
]
neqo-qpack/Cargo.toml:41
- With
default = ["blapi"], enabling this crate'sbuild-fuzzing-corpusfeature will also enableblapiunless the build uses--no-default-features. Givenbuild-fuzzing-corpuspulls inneqo-transport/build-fuzzing-corpus(which enablesnss/disable-encryption) and CI marksblapias mutually exclusive with fuzzing/disable-encryption, this creates an easy-to-hit feature conflict. Consider documenting/adjusting fuzz builds to pass--no-default-features, or avoid makingblapia default feature here.
[features]
bench = ["neqo-common/bench", "neqo-transport/bench", "log/release_max_level_info"]
blapi = ["neqo-transport/blapi"]
default = ["blapi"]
build-fuzzing-corpus = [
"neqo-common/build-fuzzing-corpus",
"neqo-transport/build-fuzzing-corpus",
"test-fixture/disable-random",
]
neqo-http3/Cargo.toml:56
- With
default = ["blapi"], enablingbuild-fuzzing-corpuswill also keepblapienabled unless builds pass--no-default-features. Sincebuild-fuzzing-corpusenablesneqo-transport/disable-encryptionandnss/disable-encryption(and CI treatsblapias mutually exclusive with these), this risks a feature conflict for fuzz-corpus builds. Consider documenting/adjusting those builds to disable default features, or avoid makingblapia default feature.
[features]
bench = [
"neqo-common/bench",
"neqo-qpack/bench",
"neqo-transport/bench",
"test-fixture/bench",
"log/release_max_level_info",
]
blapi = ["neqo-transport/blapi"]
default = ["blapi"]
build-fuzzing-corpus = [
"neqo-common/build-fuzzing-corpus",
"neqo-transport/disable-encryption",
"nss/disable-encryption",
"nss/disable-random",
"test-fixture/disable-random",
]
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
f6159c0 to
222c06b
Compare
Benchmark resultsSignificant performance differences relative to 7c2a6a3. transfer/1-conn/1-100mb-resp (aka. Download): 💚 Performance has improved by -5.7657%. time: [173.48 ms 173.89 ms 174.31 ms]
thrpt: [573.68 MiB/s 575.09 MiB/s 576.45 MiB/s]
change:
time: [-6.1095% -5.7657% -5.4294] (p = 0.00 < 0.05)
thrpt: [+5.7411% +6.1184% +6.5071]
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mildtransfer/1-conn/1-100mb-req (aka. Upload): 💚 Performance has improved by -5.2350%. time: [178.11 ms 178.45 ms 178.78 ms]
thrpt: [559.33 MiB/s 560.39 MiB/s 561.44 MiB/s]
change:
time: [-5.4956% -5.2350% -4.9796] (p = 0.00 < 0.05)
thrpt: [+5.2405% +5.5242% +5.8151]
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
2 (2.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severestreams/walltime/1-streams/each-1000-bytes: 💚 Performance has improved by -4.0715%. time: [574.07 µs 576.15 µs 578.65 µs]
change: [-4.9046% -4.0715% -3.3366] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
6 (6.00%) high mild
8 (8.00%) high severestreams/walltime/1000-streams/each-1000-bytes: 💚 Performance has improved by -2.2922%. time: [41.212 ms 41.294 ms 41.393 ms]
change: [-2.5331% -2.2922% -2.0452] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high severestreams-flow-controlled/walltime/1-streams/each-4194304-bytes: 💚 Performance has improved by -6.1914%. time: [31.292 ms 31.341 ms 31.390 ms]
change: [-6.6437% -6.1914% -5.8697] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mildstreams-flow-controlled/walltime/10-streams/each-1048576-bytes: 💚 Performance has improved by -5.3622%. time: [90.702 ms 92.004 ms 93.330 ms]
change: [-7.1810% -5.3622% -3.3430] (p = 0.00 < 0.05)
Performance has improved.transfer/walltime/pacing-false/varying-seeds: 💚 Performance has improved by -6.1346%. time: [20.984 ms 21.013 ms 21.058 ms]
change: [-6.2910% -6.1346% -5.9370] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severetransfer/walltime/pacing-true/varying-seeds: 💚 Performance has improved by -7.4731%. time: [21.083 ms 21.112 ms 21.157 ms]
change: [-7.6756% -7.4731% -7.2309] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severetransfer/walltime/pacing-false/same-seed: 💚 Performance has improved by -6.9260%. time: [20.741 ms 20.764 ms 20.792 ms]
change: [-7.0513% -6.9260% -6.7991] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severetransfer/walltime/pacing-true/same-seed: 💚 Performance has improved by -7.1090%. time: [21.012 ms 21.028 ms 21.045 ms]
change: [-7.2079% -7.1090% -7.0118] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severeAll resultstransfer/1-conn/1-100mb-resp (aka. Download): 💚 Performance has improved by -5.7657%. time: [173.48 ms 173.89 ms 174.31 ms]
thrpt: [573.68 MiB/s 575.09 MiB/s 576.45 MiB/s]
change:
time: [-6.1095% -5.7657% -5.4294] (p = 0.00 < 0.05)
thrpt: [+5.7411% +6.1184% +6.5071]
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mildtransfer/1-conn/10_000-parallel-1b-resp (aka. RPS): Change within noise threshold. time: [276.46 ms 278.30 ms 280.17 ms]
thrpt: [35.692 Kelem/s 35.932 Kelem/s 36.172 Kelem/s]
change:
time: [-2.0387% -1.0572% -0.0646] (p = 0.04 < 0.05)
thrpt: [+0.0646% +1.0685% +2.0811]
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mildtransfer/1-conn/1-1b-resp (aka. HPS): No change in performance detected. time: [39.421 ms 39.627 ms 39.849 ms]
thrpt: [25.095 B/s 25.235 B/s 25.367 B/s]
change:
time: [-0.5115% +0.2357% +0.9872] (p = 0.55 > 0.05)
thrpt: [-0.9775% -0.2351% +0.5141]
No change in performance detected.
Found 14 outliers among 100 measurements (14.00%)
4 (4.00%) high mild
10 (10.00%) high severetransfer/1-conn/1-100mb-req (aka. Upload): 💚 Performance has improved by -5.2350%. time: [178.11 ms 178.45 ms 178.78 ms]
thrpt: [559.33 MiB/s 560.39 MiB/s 561.44 MiB/s]
change:
time: [-5.4956% -5.2350% -4.9796] (p = 0.00 < 0.05)
thrpt: [+5.2405% +5.5242% +5.8151]
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
2 (2.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severestreams/walltime/1-streams/each-1000-bytes: 💚 Performance has improved by -4.0715%. time: [574.07 µs 576.15 µs 578.65 µs]
change: [-4.9046% -4.0715% -3.3366] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
6 (6.00%) high mild
8 (8.00%) high severestreams/walltime/1000-streams/each-1-bytes: Change within noise threshold. time: [12.217 ms 12.237 ms 12.258 ms]
change: [-0.6348% -0.4020% -0.1771] (p = 0.00 < 0.05)
Change within noise threshold.streams/walltime/1000-streams/each-1000-bytes: 💚 Performance has improved by -2.2922%. time: [41.212 ms 41.294 ms 41.393 ms]
change: [-2.5331% -2.2922% -2.0452] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high severestreams-flow-controlled/walltime/1-streams/each-4194304-bytes: 💚 Performance has improved by -6.1914%. time: [31.292 ms 31.341 ms 31.390 ms]
change: [-6.6437% -6.1914% -5.8697] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mildstreams-flow-controlled/walltime/10-streams/each-1048576-bytes: 💚 Performance has improved by -5.3622%. time: [90.702 ms 92.004 ms 93.330 ms]
change: [-7.1810% -5.3622% -3.3430] (p = 0.00 < 0.05)
Performance has improved.transfer/walltime/pacing-false/varying-seeds: 💚 Performance has improved by -6.1346%. time: [20.984 ms 21.013 ms 21.058 ms]
change: [-6.2910% -6.1346% -5.9370] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severetransfer/walltime/pacing-true/varying-seeds: 💚 Performance has improved by -7.4731%. time: [21.083 ms 21.112 ms 21.157 ms]
change: [-7.6756% -7.4731% -7.2309] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severetransfer/walltime/pacing-false/same-seed: 💚 Performance has improved by -6.9260%. time: [20.741 ms 20.764 ms 20.792 ms]
change: [-7.0513% -6.9260% -6.7991] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severetransfer/walltime/pacing-true/same-seed: 💚 Performance has improved by -7.1090%. time: [21.012 ms 21.028 ms 21.045 ms]
change: [-7.2079% -7.1090% -7.0118] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severeDownload data for |
Failed Interop TestsQUIC Interop Runner, client vs. server, differences relative to
All resultsSucceeded Interop TestsQUIC Interop Runner, client vs. server neqo-pr as client
neqo-pr as server
Unsupported Interop TestsQUIC Interop Runner, client vs. server neqo-pr as client
neqo-pr as server
|
Client/server transfer resultsPerformance differences relative to 7c2a6a3. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
Table above only shows statistically significant changes. See all results below. All resultsTransfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
Download data for |
martinthomson
left a comment
There was a problem hiding this comment.
This is really two changes:
- Tracking the API changes in the NSS crate
- Switching to blapi
It might have been nicer to do them as two steps, but this is fine.
| cipher, | ||
| secret, | ||
| version.label_prefix(), | ||
| direction.into(), |
There was a problem hiding this comment.
| direction.into(), | |
| Mode::from(direction), |
Just a preference.
| cipher, | ||
| next_secret, | ||
| self.version.label_prefix(), | ||
| self.direction.into(), |
There was a problem hiding this comment.
| self.direction.into(), | |
| Mode::from(self.direction), |
| @@ -734,21 +757,35 @@ impl CryptoDxState { | |||
| #[cfg(not(feature = "disable-encryption"))] | |||
| #[cfg(test)] | |||
| pub(crate) fn test_default() -> Self { | |||
There was a problem hiding this comment.
Why not have this take a Read/Write arg? Or at least call it test_default_write()?
| } | ||
|
|
||
| #[must_use] | ||
| #[cfg(feature = "disable-encryption")] |
There was a problem hiding this comment.
Any way we can get the disable-encryption variant of this to be const? This is awkward.
| ) | ||
| .expect("can create AEAD") | ||
| } | ||
| #[cfg(feature = "draft-29")] |
There was a problem hiding this comment.
When do you think we can get rid of draft-29? It's been a little while now.
Once mozilla/nss-rs#52 lands. This is significantly faster than the alternative in #3600, so I would prefer to use this for neqo.