Skip to content

fix(trie): close prune race against concurrent commits via augment-walk#711

Merged
github-actions[bot] merged 1 commit into
mainfrom
fix/trie-prune-race-augment-walk
May 25, 2026
Merged

fix(trie): close prune race against concurrent commits via augment-walk#711
github-actions[bot] merged 1 commit into
mainfrom
fix/trie-prune-race-augment-walk

Conversation

@satyakwok
Copy link
Copy Markdown
Collaborator

@satyakwok satyakwok commented May 25, 2026

maybe_prune_trie in sentrix-core spawns the prune on a background std::thread that clones the trie. The clone's self.version and self.root are frozen at clone time. The live-set walk in prune() then iterates roots from cutoff+1..=self.version (the snapshot view).

While that walk runs — seconds on a healthy chain, minutes on a large one — the main thread keeps applying blocks. Each commit goes through flush_trie_batch and lands new entries in TABLE_TRIE_VALUES and TABLE_TRIE_NODES. Their hashes are NOT in the prune's live set because the walk built it from the snapshot.

When gc_orphaned_nodes runs at the end of prune(), it deletes every key in those two tables that isn't in live. That includes the just-committed entries from the racing block. Next boot, verify_integrity reports "orphan value reference …" or "orphan node reference …" depending on which table got hit — exactly what mainnet saw on 2026-05-24:

  • vps3 core: orphan-value 516a0dc27c… at h=2215224 (chain.db hit the race within milliseconds of finalising that block)
  • fullnode-1: orphan-node 516a0dc27c… at h=2131563
  • fullnode-2: orphan-node 28fdc024ed1d… at h=2125714

Different roles, different heights, same race class.

The fix: right before gc_orphaned_nodes, reload the highest version currently in TABLE_TRIE_ROOTS and walk every root committed AFTER self.version. Hashes reachable from those new roots get pinned in live before the delete runs. The race window shrinks from "duration of the full live-set walk" to "duration between the final root reload and the gc commit" — at most one block's worth of writes.

A truly race-free fix requires running the live walk AND the delete inside the same MDBX RW transaction. That's a bigger refactor — exposing the txn handle through TrieStorage is a surface-area change that belongs in its own PR. The B3b total_minted self-heal (separate PR) and the boot-time verify_integrity orphan check together handle the narrow remaining window as defense in depth.

Adds TrieStorage::latest_version() — one cursor walk over TABLE_TRIE_ROOTS (bounded by TRIE_KEEP_VERSIONS, ~1000 entries), milliseconds in practice.

Regression test test_prune_augment_walk_preserves_newer_versions_committed_during_race:

  • v=1..=3 committed
  • Open prune_view at v=3 (snapshot)
  • Main commits v=4 with a brand-new key whose value_hash is unique to v=4
  • prune_view.prune(0) runs
  • Assert the v=4 value is still in TABLE_TRIE_VALUES (pre-fix: gone)
  • Assert main.get(race_key) still resolves (pre-fix: trie integrity error)

Summary

Risk tier

Check ONE:

  • 🟢 Low — docs, tools/, tests/, CI configs, dependency patch bumps in dev-only crates, comments
  • 🟡 Medium — non-consensus production code (RPC handlers, network plumbing, observability, ops scripts)
  • 🟠 High — consensus-critical crates (sentrix-core, sentrix-trie, sentrix-staking, sentrix-bft), block_executor, apply_block_*, state_root path
  • 🔴 Critical — Voyager activation, fork-height changes, hard-fork rollouts, anything that flips env vars on mainnet

Required by tier

🟢 Low — minimum bar

  • CI green (tests + clippy + audit + gitleaks)

🟡 Medium — adds

  • New public function or behavioural change has at least one corresponding #[test] in same PR
  • Brief description of how this was tested (manual run, integration test, etc.)

🟠 High — adds

  • Regression test that fails on main and passes with this change — paste test name in PR body
  • Designed against documented invariant (link the audit/runbook/design doc)
  • Fresh-brain review by someone other than the author (per the consensus-change review checklist)
  • Single conceptual unit per PR (no bundling — bundling consensus changes burned us on v2.1.12 → 2026-04-25 livelock)

🔴 Critical — adds

  • Testnet rehearsal completed with success criteria + log evidence linked here
  • Bake window observed: minimum 2h on testnet at the same configuration before mainnet
  • Coordinated rollback plan documented in PR body — exact commands operator runs if it fails
  • Operator sign-off at activation moment (not just PR approval — separate moment for the actual flip)

Test plan

Rollback plan

Related

Summary by CodeRabbit

  • Bug Fixes

    • Improved data storage reliability during concurrent operations.
  • Tests

    • Added regression test for concurrent data access scenarios.

Review Change Stack

`maybe_prune_trie` in sentrix-core spawns the prune on a background
`std::thread` that clones the trie. The clone's `self.version` and
`self.root` are frozen at clone time. The live-set walk in `prune()`
then iterates roots from `cutoff+1..=self.version` (the snapshot view).

While that walk runs — seconds on a healthy chain, minutes on a large
one — the main thread keeps applying blocks. Each commit goes through
`flush_trie_batch` and lands new entries in `TABLE_TRIE_VALUES` and
`TABLE_TRIE_NODES`. Their hashes are NOT in the prune's `live` set
because the walk built it from the snapshot.

When `gc_orphaned_nodes` runs at the end of `prune()`, it deletes every
key in those two tables that isn't in `live`. That includes the
just-committed entries from the racing block. Next boot,
`verify_integrity` reports "orphan value reference …" or "orphan node
reference …" depending on which table got hit — exactly what mainnet
saw on 2026-05-24:

  * vps3 core: orphan-value `516a0dc27c…` at h=2215224 (chain.db hit
    the race within milliseconds of finalising that block)
  * fullnode-1: orphan-node `516a0dc27c…` at h=2131563
  * fullnode-2: orphan-node `28fdc024ed1d…` at h=2125714

Different roles, different heights, same race class.

The fix: right before `gc_orphaned_nodes`, reload the highest version
currently in `TABLE_TRIE_ROOTS` and walk every root committed AFTER
`self.version`. Hashes reachable from those new roots get pinned in
`live` before the delete runs. The race window shrinks from "duration
of the full live-set walk" to "duration between the final root reload
and the gc commit" — at most one block's worth of writes.

A truly race-free fix requires running the live walk AND the delete
inside the same MDBX RW transaction. That's a bigger refactor — exposing
the txn handle through `TrieStorage` is a surface-area change that
belongs in its own PR. The B3b total_minted self-heal (separate PR) and
the boot-time `verify_integrity` orphan check together handle the
narrow remaining window as defense in depth.

Adds `TrieStorage::latest_version()` — one cursor walk over
`TABLE_TRIE_ROOTS` (bounded by `TRIE_KEEP_VERSIONS`, ~1000 entries),
milliseconds in practice.

Regression test `test_prune_augment_walk_preserves_newer_versions_committed_during_race`:
- v=1..=3 committed
- Open prune_view at v=3 (snapshot)
- Main commits v=4 with a brand-new key whose value_hash is unique to v=4
- prune_view.prune(0) runs
- Assert the v=4 value is still in TABLE_TRIE_VALUES (pre-fix: gone)
- Assert main.get(race_key) still resolves (pre-fix: trie integrity error)
@github-actions github-actions Bot enabled auto-merge (squash) May 25, 2026 11:25
@satyakwok satyakwok self-assigned this May 25, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 25, 2026

Codecov Report

❌ Patch coverage is 96.82540% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/sentrix-trie/src/storage.rs 93.33% 1 Missing ⚠️
crates/sentrix-trie/src/tree.rs 97.91% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 25, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 51a2ec23-3da1-41b7-af32-071dea073fee

📥 Commits

Reviewing files that changed from the base of the PR and between 23d6df8 and 755f3de.

📒 Files selected for processing (2)
  • crates/sentrix-trie/src/storage.rs
  • crates/sentrix-trie/src/tree.rs

📝 Walkthrough

Walkthrough

This PR fixes a concurrency bug in trie garbage collection by adding a new storage utility and enhancing the prune logic. A public TrieStorage::latest_version() method is introduced to scan committed trie root versions from disk. The SentrixTrie::prune() method is then augmented to use this utility to detect roots committed after pruning begins (the race window) and include them in the live-set walk, preventing the garbage collector from deleting nodes that were committed concurrently. A regression test validates that values committed during the race remain after pruning completes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • sentrix-labs/sentrix#583: Refactors prune_old_roots and GC to stream table scans and use latest_version for prune decisions; this PR's new storage helper and race-handling logic complement that refactor.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description provides detailed context about the race condition and fix, but the formal checklist sections (Summary, Risk tier, Test plan, Rollback plan) are incomplete or missing. Complete the required template sections: fill in the Summary field (1-3 lines), check the Risk tier checkbox, and complete the Test plan with bullet points.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: fixing a race condition in trie pruning via an augment-walk approach.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/trie-prune-race-augment-walk

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot merged commit 481376c into main May 25, 2026
18 checks passed
github-actions Bot pushed a commit that referenced this pull request May 25, 2026
Ships PRs #711 (trie prune race augment-walk), #712 (B3b total_minted
self-heal), #713 (pay_one_signer sorted delegators), #714
(collect_reachable depth-aware empty check). See CHANGELOG entry for
the per-PR narrative and the 2026-05-24 incident root-cause analysis.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant