perf: cut verifier memory — gate storage_logs trace (Boojum-safe) + consolidate WorldDiff maps#109
Draft
0xVolosnikov wants to merge 8 commits into
In zkVM verifier guests where the in-guest heap is tight (768 MiB on
the eravm-airbender-verifier corpus), `WorldDiff::storage_logs` grew
to ~220 MiB on real-world batches and `rollback_storage_logs` added
another ~50 MiB — the largest single-Vec contribution to guest peak
memory. The accumulated trace is consumed only by
`circuit_sequencer_api::sort_storage_access_queries` to derive a
per-slot summary, and that summary can be derived directly from the
existing rollback-aware maps without the per-access trace.
This PR:
* Stops pushing per-access entries to `storage_logs` and
`rollback_storage_logs` in `read_storage_inner` and `write_storage`.
* Caches initial values on first read in `storage_initial_values`
(previously only writes populated it; reads went through
`just_read_storage` which doesn't cache). This is needed because
downstream summarizers can no longer recover the initial value from
the storage_logs trace.
* Adds `WorldDiff::committed_reads_at_depth_zero` — a
`RollbackableSet<(H160, U256)>` that materializes the dedup
function's `did_read_at_depth_zero` predicate incrementally:
a slot is added by `read_storage_inner` iff `storage_changes`
doesn't contain it at the time of read (i.e. no pending write for
that slot). Rolled back together with the other "committed"
trackers in `external_rollback`; not rolled back by internal
`rollback` (matches the storage_logs behavior the dedup observed).
* Public accessors:
  - `WorldDiff::reserve_storage_log_capacity` / `reserve_auxiliary_log_capacity`:
    reserve the inner Vecs from witness counts (avoids the
    doubling-realloc transients that double peak memory).
  - `WorldDiff::committed_reads_at_depth_zero_iter`
  - `WorldDiff::initial_storage_value(contract, key) -> Option<StorageSlot>`
  - `WorldDiff::read_storage_slots_iter`
  - `Heaps::reserve_dynamic_groups` + `VirtualMachine::reserve_dynamic_heap_capacity`
  - `RollbackableLog::reserve`
* Together with the consumer changes in
matter-labs/eravm-airbender-verifier#18 and the
zksync-protocol PR (linked from there), per-batch guest peak drops
from 1.16 GiB to ~700 MiB.
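
The `committed_reads_at_depth_zero` rule from the list above can be sketched roughly as follows. This is a toy model, not the PR's code: `ToyWorldDiff` and the `(u64, u64)` slot type are hypothetical stand-ins for `WorldDiff` and `(H160, U256)`, plain `BTreeMap`/`BTreeSet` replace the rollback-aware containers, and rollback is omitted entirely.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-in for (H160, U256); the real types live in the VM crate.
type Slot = (u64, u64);

#[derive(Default)]
struct ToyWorldDiff {
    /// Pending writes, keyed by slot.
    storage_changes: BTreeMap<Slot, u64>,
    /// Slots that were read while no write to them was pending.
    committed_reads_at_depth_zero: BTreeSet<Slot>,
}

impl ToyWorldDiff {
    /// Returns the pending value if there is one, otherwise `initial`.
    /// The slot is recorded only in the "no pending write" case.
    fn read(&mut self, slot: Slot, initial: u64) -> u64 {
        match self.storage_changes.get(&slot) {
            Some(value) => *value,
            None => {
                self.committed_reads_at_depth_zero.insert(slot);
                initial
            }
        }
    }

    fn write(&mut self, slot: Slot, value: u64) {
        self.storage_changes.insert(slot, value);
    }
}

fn main() {
    let mut diff = ToyWorldDiff::default();
    let (a, b) = ((1, 10), (1, 11));

    diff.read(a, 7); // no pending write for `a`: recorded
    diff.write(b, 5);
    diff.read(b, 0); // pending write for `b`: not recorded

    assert!(diff.committed_reads_at_depth_zero.contains(&a));
    assert!(!diff.committed_reads_at_depth_zero.contains(&b));
}
```

The set is filled in as reads happen, so a summarizer can iterate it directly instead of replaying a per-access trace.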
## Breaking changes (intentional, want feedback)
`storage_log_queries()` and `storage_logs_after()` now return empty
slices in steady state. The only in-repo consumer
(`circuit_sequencer_api::sort_storage_access_queries` via `vm_fast`
and `vm_latest`) is rewired in the linked PRs. Externally, anyone
relying on the per-access trace for witness generation will be
affected.
If we want to land this without breaking existing users, the
`storage_logs` accumulation should be gated by a `Settings` flag
(opt-out) or by a constructor variant. Happy to add that based on
review feedback.
## Status
Draft, posted for discussion. Functional on the eravm-airbender-verifier
corpus end-to-end through `verify()` — output matches the original
`sort_storage_access_queries` count exactly (10729 entries on batch
67901, 8817 on batch 67911) thanks to the
`committed_reads_at_depth_zero` predicate matching the dedup's
`did_read_at_depth_zero` semantics.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The prior commit removed the per-access `storage_logs` / `rollback_storage_logs` trace unconditionally to save ~270 MiB in the Airbender re-execution verifier. That breaks any consumer that builds an in-circuit storage argument from `storage_log_queries()` (Boojum witness generation via `sort_storage_access_queries`), and it left 5 storage-log tests failing.

Make recording configurable instead:

- Add `WorldDiff::set_record_storage_logs(record)` and a `skip_storage_logs` flag (default `false` = recording ON, preserving the pre-existing Boojum behavior). `read_storage_inner` / `write_storage` gate the trace pushes on it.
- Re-execution verifiers with no in-circuit storage argument (Airbender) call `set_record_storage_logs(false)` to derive the deduplicated set from `committed_reads_at_depth_zero` + `storage_changes` and drop the trace cost.

All 54 lib tests pass, including the storage-log trace tests (restored under the default record mode) plus a new test locking the opt-out path.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
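
A minimal sketch of the gating described in this commit message, under stated assumptions: `ToyWorldDiff`, its field types, and the log tuple layout are hypothetical. Only the names `skip_storage_logs`, `set_record_storage_logs`, `write_storage`, and `storage_log_queries` come from the PR text, and the real signatures may differ.

```rust
use std::collections::BTreeMap;

#[derive(Default)]
struct ToyWorldDiff {
    /// `false` by default, so recording is ON unless a caller opts out.
    skip_storage_logs: bool,
    /// Per-access trace: (key, value, is_write). Hypothetical layout.
    storage_logs: Vec<(u64, u64, bool)>,
    /// State that is maintained in both modes.
    storage_changes: BTreeMap<u64, u64>,
}

impl ToyWorldDiff {
    fn set_record_storage_logs(&mut self, record: bool) {
        self.skip_storage_logs = !record;
    }

    fn write_storage(&mut self, key: u64, value: u64) {
        // Only the trace push is gated; the state update always happens.
        if !self.skip_storage_logs {
            self.storage_logs.push((key, value, true));
        }
        self.storage_changes.insert(key, value);
    }

    fn storage_log_queries(&self) -> &[(u64, u64, bool)] {
        &self.storage_logs
    }
}

fn main() {
    // Default mode: the trace is recorded.
    let mut recording = ToyWorldDiff::default();
    recording.write_storage(1, 2);
    assert_eq!(recording.storage_log_queries().len(), 1);

    // Opt-out mode: the trace stays empty, the state is still tracked.
    let mut lean = ToyWorldDiff::default();
    lean.set_record_storage_logs(false);
    lean.write_storage(1, 2);
    assert!(lean.storage_log_queries().is_empty());
    assert_eq!(lean.storage_changes.get(&1), Some(&2));
}
```

Because the zero value of the flag means "record", callers that never touch the setter see the trace exactly as before.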
Doc-only + formatting, no behavior change (54 lib tests pass):

- add missing backticks around code identifiers; reflow a doc line so a leading '+' isn't parsed as a markdown list bullet (clippy doc lints)
- rustfmt: collapse a method-chain and wrap a long test assertion

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…#105)

Reduces `WorldDiff` memory on large mainnet batches (consumer: matter-labs/eravm-airbender-verifier#18) by removing duplicated `(address, key)` keys across its maps. **Stacked on #104** (`vv/memopt-on-popzxc`).

> **Scope note:** this PR now contains **both** consolidation steps. The experimental "Group A" (#106) was merged into this branch, so it's no longer separate. Both are described below.

## Changes

- **Group B: merge the three membership sets** (`read_storage_slots`, `written_storage_slots`, `committed_reads_at_depth_zero`) into one `slot_flags: RollbackableMap<(H160,U256), u8>` of bit flags. External-rollback semantics preserved; public `committed_reads_at_depth_zero_iter` kept (now filters the flag).
- **Group A: merge the two internal-rollback write maps** (`storage_changes: U256` + `paid_changes: u32`) into one `storage_writes: RollbackableMap<(H160,U256), StorageWriteEntry { value, paid }>`. `transient_storage_changes` left separate (distinct keyspace).

## ⚠️ Public API change

`WorldDiff::get_storage_state()` now returns `&BTreeMap<_, StorageWriteEntry>` (was `…, U256>`), and `StorageWriteEntry` is re-exported from the crate root. **Direct callers must project `.value`.** The downstream consumer is eravm-airbender-verifier's `vm_fast` (2 sites); its vm2 pin bump must land together with that `.value` projection.

## Correctness

- Rollback groups unchanged (`slot_flags` external; `storage_writes` internal).
- `write_storage` does a single insert per path; `prepaid` reuses the prior entry value (no redundant lookup/history).
- **55 lib tests pass**, including the `storage_changes_*` proptests, the Boojum storage-log trace tests, and a new `merged_storage_write_tracks_paid_and_rolls_back` covering non-zero `prepaid` + rollback.

## Measured impact

On the eravm guest, the worst-case production batch (67912) drops from needing ~920–952 MiB to **fitting at ~720 MiB** (~200 MiB off peak), turning a <32 MiB margin into comfortable headroom.

## Review items addressed

P2 (re-export `StorageWriteEntry`), P3 (single insert in `write_storage`; add non-zero-paid test), and stale doc-comment field-name refs: all in `1044b47`. P1 (scope) addressed by this description; the API break is called out above for the coordinated eravm change.

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
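
To make the two consolidations concrete, here is a rough sketch with plain `BTreeMap`s in place of `RollbackableMap`. The bit positions and the `SLOT_READ` / `SLOT_WRITTEN` names are assumptions (only `SLOT_COMMITTED_READ_Z0` appears in this PR's commit messages), `u64` stands in for `U256`, and the free function below only mimics what `committed_reads_at_depth_zero_iter` is described as doing.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for (H160, U256).
type Slot = (u64, u64);

// Assumed bit layout; the real constants may differ.
const SLOT_READ: u8 = 1 << 0;
const SLOT_WRITTEN: u8 = 1 << 1;
const SLOT_COMMITTED_READ_Z0: u8 = 1 << 2;

/// Merged write record: one map entry instead of one entry in each of two maps.
#[derive(Clone, Copy, Debug, PartialEq)]
struct StorageWriteEntry {
    value: u64, // U256 in the real crate
    paid: u32,
}

/// One flags map replaces three membership sets; this filters a single bit.
fn committed_reads_at_depth_zero_iter<'a>(
    slot_flags: &'a BTreeMap<Slot, u8>,
) -> impl Iterator<Item = &'a Slot> + 'a {
    slot_flags
        .iter()
        .filter(|(_, flags)| (**flags & SLOT_COMMITTED_READ_Z0) != 0)
        .map(|(slot, _)| slot)
}

fn main() {
    let mut slot_flags: BTreeMap<Slot, u8> = BTreeMap::new();
    slot_flags.insert((1, 1), SLOT_READ | SLOT_COMMITTED_READ_Z0);
    slot_flags.insert((1, 2), SLOT_READ | SLOT_WRITTEN);

    let committed: Vec<Slot> = committed_reads_at_depth_zero_iter(&slot_flags)
        .copied()
        .collect();
    assert_eq!(committed, vec![(1, 1)]);

    // Callers that previously received a bare value now project `.value`.
    let mut storage_writes: BTreeMap<Slot, StorageWriteEntry> = BTreeMap::new();
    storage_writes.insert((1, 2), StorageWriteEntry { value: 42, paid: 3 });
    let entry = storage_writes[&(1, 2)];
    assert_eq!(entry.value, 42);
    assert_eq!(entry.paid, 3);
}
```

Each slot key is now stored once per map rather than once per set, which is where the memory saving described above comes from.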
P2: gate the read-only storage_initial_values cache and the SLOT_COMMITTED_READ_Z0 predicate behind opt-out mode. In recording (Boojum) mode read_storage_inner now reads via just_read_storage exactly like the pre-optimization base, so no per-read map entries are added and memory behavior is unchanged.

P3: assert no storage access has happened when set_record_storage_logs is called, so a mid-run toggle panics instead of silently producing a partial trace / dedup state.

Add regression tests for both.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Dedup the skip_storage_logs field doc against the set_record_storage_logs method doc, and tighten the read_storage_inner branch comments. No code change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
slot_add_flag did a get + conditional insert (two BTreeMap traversals, plus a key clone and journal push on change). Add RollbackableMap::add_flags: an entry-based OR-merge that traverses once and journals only when a bit actually changes. Rollback semantics are identical (journals (key, Some(old)) on an existing entry, (key, None) on a fresh one; nothing when unchanged).

Recovers most of the ~0.24% cycle overhead the map consolidation added on the verifier's storage-heavy path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
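
The entry-based OR-merge could look roughly like the sketch below. `ToyRollbackableMap`, its journal layout, and `rollback_to` are hypothetical simplifications of `RollbackableMap`; only the journaling rule (old value for an existing entry, `None` for a fresh one, nothing when unchanged) is taken from the commit message.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

// Hypothetical stand-in for (H160, U256).
type Slot = (u64, u64);

#[derive(Default)]
struct ToyRollbackableMap {
    map: BTreeMap<Slot, u8>,
    /// (key, previous value); `None` means the key was absent before.
    journal: Vec<(Slot, Option<u8>)>,
}

impl ToyRollbackableMap {
    /// ORs `bits` into the entry for `key` with a single map traversal,
    /// journaling only when the stored value actually changes.
    fn add_flags(&mut self, key: Slot, bits: u8) {
        match self.map.entry(key) {
            Entry::Occupied(mut entry) => {
                let old = *entry.get();
                let new = old | bits;
                if new != old {
                    *entry.get_mut() = new;
                    self.journal.push((key, Some(old)));
                }
            }
            Entry::Vacant(entry) => {
                entry.insert(bits);
                self.journal.push((key, None));
            }
        }
    }

    /// Undoes journaled changes until the journal is `len` entries long.
    fn rollback_to(&mut self, len: usize) {
        while self.journal.len() > len {
            let (key, old) = self.journal.pop().expect("journal is non-empty");
            match old {
                Some(value) => {
                    self.map.insert(key, value);
                }
                None => {
                    self.map.remove(&key);
                }
            }
        }
    }
}

fn main() {
    let mut flags = ToyRollbackableMap::default();
    let slot = (1, 1);

    flags.add_flags(slot, 0b001); // fresh entry: journals (slot, None)
    flags.add_flags(slot, 0b001); // no bit changes: nothing journaled
    assert_eq!(flags.journal.len(), 1);

    flags.add_flags(slot, 0b010); // new bit: journals (slot, Some(0b001))
    assert_eq!(flags.map[&slot], 0b011);

    flags.rollback_to(1);
    assert_eq!(flags.map[&slot], 0b001);
    flags.rollback_to(0);
    assert!(flags.map.is_empty());
}
```

Skipping the journal push on a no-op keeps the journal from growing with repeated reads of the same slot.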
Merging value+paid into one storage_writes entry made write_storage read the prior entry before rewriting it (a separate get on every write, doubling the storage-map ops on the free-storage path). Remove it:

- non-free writes take `prepaid` from RollbackableMap::insert's returned old value instead of a standalone lookup;
- free writes use a new single-traversal RollbackableMap::update that journals the prior value and recomputes the entry in place.

Behavior and rollback journaling are identical; one fewer BTreeMap traversal per write. Targets the ~230M-cycle Group-A overhead measured on batch 67912.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
0d94de1 to 4270ec4
## Summary

Three independent optimizations to reduce `WorldDiff` verifier memory:

- `storage_logs` recording (dual-mode): conditional per-access trace recording, Boojum-memory-identical; opt-out saves ~270 MiB on large batches.
- `(address, key)` maps: merge duplicate keyed collections into unified entries with bit flags.
- … vector-doubling transients.

## Impact

… `.value` from the new `StorageWriteEntry` type.

## Changes

Net diff vs `master`: 7 source files (`world_diff.rs`, `rollback.rs`, `heap.rs`, `vm.rs`, `tracing.rs`, `lib.rs`, `single_instruction_test/heap.rs`). `Cargo.lock` unchanged; no dependency changes.