Skip to content

nuke validation#27

Open
ismaelsadeeq wants to merge 6 commits into
2140-dev:masterfrom
ismaelsadeeq:05-2026-slim-out-validations
Open

nuke validation#27
ismaelsadeeq wants to merge 6 commits into
2140-dev:masterfrom
ismaelsadeeq:05-2026-slim-out-validations

Conversation

@ismaelsadeeq
Copy link
Copy Markdown
Member

validation.cpp isn't a misnomer; it's a validation file that includes block validation, mempool validation, chainstate management, consensus helpers, and friends well over 5,000 lines across multiple concerns.

This makes it hard to navigate, and hard to reason about.

This PR breaks it apart by moving code to where it actually belongs.

  • Mempool acceptance logic moves to mempool_validation.{h,cpp}
  • Block validation to block_validation.{h,cpp}
  • CScriptCheck to script/script_check.{h,cpp}
  • Transaction consensus helpers to consensus/tx_verify.{h, cpp}
  • Coins helpers to coins.{h, cpp}.
  • Block import to node/blockimport.{h, cpp}

Once the extraction is complete, what remains in validation.{cpp,h} is exclusively Chainstate and ChainstateManager, so the files are renamed to chainstate.{h,cpp} to reflect what they actually contain. node/chainstate.{cpp,h} is renamed to node/chainstate_load.{cpp,h} beforehand, also for clarity.

The circular dependencies here actually makes the underlying tight coupling more obvious.

  • block_validation -> chainstate -> block_validation — because block_validation.cpp functions take Chainstate& and ChainstateManager& parameters, pulling in
    chainstate.h, while chainstate.h needs types declared in block_validation.h.
  • chainstate -> mempool_validation -> chainstate — for the same reason on the mempool side.

The circular dependencies are added to the lint allowlist as temporary known exceptions.

The path to resolving these is to reduce what the extracted functions need to take as parameters. Rather than passing a full ChainstateManager&, functions in block_validation and mempool_validation can be refactored to take narrower dependencies, for example, passing a BlockManager& directly, or introducing a small context struct which will be defined in a separate file, so they no longer need to include chainstate.h at all.
That work is left as a follow-up to keep this PR a pure move.

Pros of the end goal of this are:

  • Faster incremental builds e.g a change to mempool logic no longer forces recompilation of everything that only cares about block validation
  • Separation of Concern and Easier navigation, the filename tells you where to look, and that file aims to do just that, hence a clearer dependency graph — #include <block_validation.h> communicates intent; #include <validation.h> communicated nothing
  • Aiming for better kernel boundary

On this branch, there are cons because this is an intermediate state.

  • The circular dependencies (block_validation -> chainstate -> block_validation, chainstate -> mempool_validation -> chainstate) mean the physical
    separation is done but the logical separation is not the full benefit requires significant follow-up work
  • Large include churn across the codebase makes the PR mechanically large and harder to review I think

ImportBlocks, which loads block files from disk during startup (-reindex
and -loadblock), does not belong in blockstorage.cpp alongside the
BlockManager class it merely uses. Move it to node/blockimport.{cpp,h}
so that each file has a single clear responsibility.

This is a pure move: no logic is changed. blockimport.cpp is given only
the includes it directly requires; util/result.h is removed from
blockstorage.cpp where it became unused once ImportBlocks left.
CheckFinalTxAtTip is a pure transaction finality check with no
dependency on chainstate. Move it to consensus/tx_verify.{cpp,h}
where it sits alongside the other transaction verification helpers.

DisconnectResult, ApplyTxInUndo, and UpdateCoins are UTXO mutation
primitives called during block (dis)connection. They depend only on
CCoinsViewCache and CTransaction, with no chainstate coupling. Move
them to coins.{cpp,h} alongside the existing AddCoins/SpendCoin
helpers they naturally extend.

These moves eliminate the need for validation.cpp callers to reach
into validation.h solely for these low-level helpers.

Moving ApplyTxInUndo and UpdateCoins into coins.h introduces a new
expected circular dependency: coins -> undo -> coins. undo.h includes
coins.h for Coin, and coins.h now includes undo.h for BlockUndo.
Extract mempool-related validation functions from validation.cpp into
new files mempool_validation.{h,cpp}:
- MempoolAcceptResult and PackageMempoolAcceptResult result types
- ProcessTransaction, AcceptToMemoryPool, ProcessNewPackage
- LimitMempoolSize, CalculateLockPointsAtTip, CheckSequenceLocksAtTip
- MaybeUpdateMempoolForReorg

Behavioral change: MaybeUpdateMempoolForReorg now takes an explicit
CTxMemPool& mempool parameter instead of obtaining it via
chainstate.GetMempool(). This is required because the Clang Thread
Safety Analyzer cannot alias *chainstate.MempoolMutex() to mempool.cs
when they are obtained via different expressions. The four call sites in
validation.cpp are guarded with `if (m_mempool)` null checks.

Introduces circular dependency:
  mempool_validation -> validation -> mempool_validation
(validation.cpp still includes mempool_validation.h for the transition
period; to be resolved in a follow-up)
Extract block-related validation functions from validation.cpp into
new files block_validation.{h,cpp}:
- CheckBlockHeader, CheckBlock, BlockManuallyRequested
- ConnectBlock, DisconnectBlock, UpdateTip
- ActivateBestChainStep, ActivateBestChain
- InvalidateBlock, ReconsiderBlock
- GetSpendHeight, GetUTXOStats, VerifyScript helpers
- TestBlockValidity, BuildChainState

Also extract script checking infrastructure into script/script_check.{h,cpp}.

Introduces circular dependencies:
  block_validation -> validation -> block_validation
  block_validation -> validation -> mempool_validation -> block_validation
(validation.cpp still includes block_validation.h for the transition
period; to be resolved in a follow-up)
Rename node/chainstate.{cpp,h} to node/chainstate_load.{cpp,h} to avoid
a naming conflict with the upcoming rename of validation.{cpp,h} to
chainstate.{cpp,h}. The node/chainstate module hosts LoadChainstate()
and VerifyLoadedChainstate(), so chainstate_load better describes its
role.
After extracting mempool validation and block validation into their own
files, the code remaining in validation.{cpp,h} is exclusively
chainstate management: Chainstate, ChainstateManager, and their
supporting infrastructure. Rename the files to match what they
actually contain and update all include directives accordingly.

The rename also updates the names of existing circular dependencies in
the lint allowlist: node/blockstorage -> validation -> node/blockstorage
becomes chainstate -> node/blockstorage -> chainstate, and likewise for
kernel/coinstats. The previously introduced
block_validation -> validation -> block_validation and
mempool_validation -> validation -> mempool_validation are updated to
reflect the new filename.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant