trie-db 0.31.0 causes huge benchmark regression for storage-heavy extrinsics #230

@sigurpol

Description

The problem

We have observed a huge regression in benchmark execution time after bumping trie-db from 0.30.0 to 0.31.0, in benchmarks whose setup performs massive storage operations.

This is critical because we can't confidently benchmark some pallets for the upcoming 2.1.0 Polkadot / Kusama release.

A concrete example

Let's take a simple dummy extrinsic as an example:

force_apply_min_commission (here) is a simple extrinsic doing 2 reads and 1 write, so an execution time in the ms range doesn't make sense.

What is peculiar about this specific benchmark (and many others in the staking-async pallet in the SDK) is that before benchmarking we do massive storage deletion (e.g. on PolkadotAH we delete 27k nominators/validators and all related storage items, so a lot). This happens in the benchmark setup phase and therefore shouldn't leak into benchmark results (see Notes at the end for a recent fix/workaround in frame-omni-bencher).

force_apply_min_commission in pallet_staking_async takes ~46µs with trie-db 0.30.0 but ~2.8ms with trie-db 0.31.0 when tested on my Ubuntu desktop. The results are in line with what we see when running benchmarks on the fellowship CI (QEMU/native runner): compare the results of a frame-omni-bencher built with 0.30.0 and one built with 0.31.0 in polkadot-fellows/runtimes#1065 (comment).

The regression affects any benchmark whose setup (that is, before the extrinsic is actually measured) populates or deletes a large trie (e.g. > 25k validator/nominator entries). The overhead seems to come from commit_db() / trie backend operations, not from the benchmarked extrinsic itself.
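
To put the two local measurements in proportion, here is a small sketch that computes the slowdown factor from two timings in microseconds. The helper name `slowdown_factor` is hypothetical (not part of any tool mentioned here); the inputs are the ~46 µs and ~2.8 ms figures quoted above.

```shell
#!/bin/sh
# Slowdown factor between two timings, both given in microseconds.
# Usage: slowdown_factor <before_us> <after_us>
# NOTE: hypothetical helper, for illustration only.
slowdown_factor() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", after / before }'
}

# ~46 µs with trie-db 0.30.0 vs ~2.8 ms (2800 µs) with trie-db 0.31.0
slowdown_factor 46 2800
```

With these inputs the factor is roughly 60x, far more than a simple extrinsic with a handful of storage accesses should vary between two host binaries.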

Bisection summary

(results coming from my local desktop, we observed the same on CI)

  • frame-omni-bencher v0.15.0 / v0.16.0 (trie-db 0.30.0) → ~46 µs
  • frame-omni-bencher v0.17.0+ (trie-db 0.31.0, PR bump trie-db version to 0.31.0 polkadot-sdk#10573) → ~3 ms
  • Same runtime WASM, different frame-omni-bencher binary — confirms it's an issue with the binary

Root cause: trie-db bump 0.30.0 → 0.31.0 (#226), merged in paritytech/polkadot-sdk#10573 (2025-12-11).

The relationship between frame-omni-bencher and trie-db

We use frame-omni-bencher to benchmark pallet extrinsics. For the staking-async pallet, many benchmark setups involve bulk storage operations, such as deleting thousands of items from storage (e.g. 26k nominators/validators created at genesis).
Whether that approach is debatable is beside the point here.

trie-db is not a direct dependency of frame-omni-bencher.
AFAIK the dependency chain is the following:

  frame-omni-bencher                                                            
    → frame-benchmarking-cli                                                
      → sc-executor (runs WASM benchmarks)                                      
        → sp-state-machine (storage overlay + trie backend)                 
          → sp-trie                                                             
            → trie-db 
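
Since trie-db is only pulled in transitively, it can help to confirm which version a given source checkout actually locks. The following is a minimal sketch, assuming a checkout that has a `Cargo.lock` in the usual format (a `name = "..."` line followed by a `version = "..."` line per package); the function name `locked_trie_db_version` is hypothetical.

```shell
#!/bin/sh
# Print the trie-db version(s) recorded in a Cargo.lock file.
# Usage: locked_trie_db_version [path/to/Cargo.lock]
# NOTE: hypothetical helper; assumes the standard Cargo.lock layout where
# the "version" line follows the "name" line of each [[package]] entry.
locked_trie_db_version() {
    awk '
        $0 == "name = \"trie-db\"" { found = 1; next }
        found && /^version = / {
            gsub(/version = |"/, "")   # keep only the bare version number
            print
            found = 0
        }
    ' "${1:-Cargo.lock}"
}

# Example (run from the root of a checkout that contains Cargo.lock):
#   locked_trie_db_version Cargo.lock
```

If a lockfile contains more than one trie-db entry, each version is printed on its own line. In a workspace that depends on trie-db, `cargo tree -i trie-db` is another way to see which crates pull it in.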

During benchmarking, the host creates an in-memory trie backend (via sp-state-machine) to hold the genesis state.
Every storage read/write during benchmark setup (like clear_validators_and_nominators() inserting/removing tens of thousands of entries), as well as commit_db(), goes through sp-trie → trie-db on the host side.

The regression is in this host-side trie layer — the trie operations that materialize the storage overlay into the backend before the benchmark timing starts.
The benchmarked extrinsic itself is fast (1 read - 2 write in my example); it's the trie commit of the massive genesis state that got much slower with trie-db 0.31.0.
This shouldn't leak into benchmark results in any way. See paritytech/polkadot-sdk#10798, and in particular this comment from @cheme about write I/O operations still being in progress after commit returns early.

Reproduction steps

Check out the latest main from https://github.com/polkadot-fellows/runtimes.
Then from there:

# build the Polkadot Asset Hub runtime with benchmarks enabled
cargo build -p asset-hub-polkadot-runtime --profile production --features runtime-benchmarks
# install the latest frame-omni-bencher
cargo install frame-omni-bencher
# benchmark the specific extrinsic (the production profile writes its output to target/production)
frame-omni-bencher v1 benchmark pallet --runtime ./target/production/wbuild/asset-hub-polkadot-runtime/asset_hub_polkadot_runtime.compact.compressed.wasm --pallet pallet_staking_async --extrinsic "force_apply_min_commission" --steps 2 --repeat 1
## You will see an execution time for this extrinsic of around a few ms

# Now install frame-omni-bencher v0.16.0, the latest release with trie-db 0.30.0
# (if your clang/gcc are very recent, prefix the command with CXXFLAGS="-include cstdint")
cargo install frame-omni-bencher --version 0.16.0
# benchmark the same extrinsic again
frame-omni-bencher v1 benchmark pallet --runtime ./target/production/wbuild/asset-hub-polkadot-runtime/asset_hub_polkadot_runtime.compact.compressed.wasm --pallet pallet_staking_async --extrinsic "force_apply_min_commission" --steps 2 --repeat 1
## You will see an execution time for this extrinsic of around a few microseconds!
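
The two runs above can also be done side by side, without overwriting the globally installed binary, by installing each version into its own directory with `cargo install --root`. This is only a sketch under assumptions: the `./bencher-<version>` directories, the `RUNTIME_WASM` variable, and the `run` / `compare_benchers` helpers are hypothetical names, and by default the script only prints the commands it would execute (set `DRY_RUN=0` to actually run them).

```shell
#!/bin/sh
# Compare several frame-omni-bencher versions against the same runtime WASM.
# NOTE: hypothetical helper script. It prints commands unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: compare_benchers <runtime.wasm> <version>...
compare_benchers() {
    runtime="$1"
    shift
    for version in "$@"; do
        root="./bencher-$version"   # assumed per-version install directory
        run cargo install frame-omni-bencher --version "$version" --root "$root"
        run "$root/bin/frame-omni-bencher" v1 benchmark pallet \
            --runtime "$runtime" \
            --pallet pallet_staking_async \
            --extrinsic force_apply_min_commission \
            --steps 2 --repeat 1
    done
}

# RUNTIME_WASM should point at the compressed WASM built in the first step.
compare_benchers "${RUNTIME_WASM:-runtime.wasm}" 0.16.0 0.17.0
```

Keeping one binary per version makes it easy to rerun both against the exact same WASM and confirm that only the host binary differs.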

Notes

The issue happens on Polkadot and Kusama AssetHub in the fellowship runtime repo. Interestingly enough, when running the same benchmark on Westend AssetHub within the SDK using the same frame-omni-bencher, the result is in the µs ballpark and not ms.
Note that on Westend AssetHub (see paritytech/polkadot-sdk#10798) we initially observed execution times of a few ms for this benchmark (massive storage deletion in setup + 1 read - 2 write in execution). That was "fixed" by doing a dummy DB read/write in SDK PR paritytech/polkadot-sdk#10802 and then (better) in paritytech/polkadot-sdk#10974. This fix brought execution time down to µs on Westend AssetHub (where we also do bulk deletion before executing the extrinsic in the benchmark), but not on Polkadot / Kusama AssetHub.
