JIT Costing: The Definitive PR#876

Open
mwaddip wants to merge 35 commits into ergoplatform:develop from mwaddip:jit-costing-final

mwaddip commented Jun 3, 2026

Closes #193.

Ports JIT costing from sigmastate-interpreter (direct-tree-eval): each IR node
accumulates its Scala cost into the interpreter Context during evaluation, the
per-transaction cost limit is enforced cumulatively across inputs, and per-element
equality costs (DataValueComparer) and SigmaByteWriter serialize costs are
charged. A cost_parity test replays mainnet blocks 700000–700060 and asserts every
transaction's computed JIT cost equals the node's recorded cost (78 vectors).

Supersedes #854. This is the same logical change, re-sliced from #854's tangled history into
24 dependency-ordered, individually green commits. Each commit builds and passes the
ergotree-interpreter / ergotree-ir suites plus
cargo clippy --all-features --all-targets -- -D warnings.
One conformance fix is folded in: SigmaProp == SigmaProp now charges per Scala
equalSigmaBoolean (ProveDlog 174 / ProveDHTuple 690) instead of the flat catch-all 3.

Relationship to #858 (lazy constant resolution)

Independent PR, overlapping in reduce_to_crypto / ConstPlaceholder handling — whichever
merges first, the other takes a mechanical rebase. This PR resolves segregated
ConstPlaceholder via ConstantPlaceholder::resolved (populated by
ErgoTree::proposition_for_cost_eval) and charges it JitCost(1); #858 introduces
Context.constants lazy resolution. Reconciliation either way: the ConstPlaceholder
arm ends up charging 1 JitCost and resolving via ctx.constants, dropping this
PR's resolved-field machinery as redundant. Worked reconciliation on the
ergo-node-integration branch (9094f693); full rebase recipe pinned on #858.

mwaddip and others added 24 commits June 3, 2026 22:29
Introduce the JIT cost accumulator and limit on the interpreter Context (jit_cost / jit_cost_limit fields; add_jit_cost / add_per_item_jit_cost / jit_cost_value / reset_jit_cost; CostLimitExceeded), wire the supporting cost_accum and error plumbing, and initialise the new fields in every Context constructor (arbitrary + wallet::signing::make_context). Remove the dead pre-costing costs.rs stub (zero consumers).
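
A minimal sketch of what such an accumulator could look like, assuming the limit trips when the running total becomes strictly greater than the limit. The `CostAccum` type, the field layout of `CostLimitExceeded`, and the `u64` representation are illustrative assumptions; only the method and field names come from the commit message.

```rust
/// Error raised once the accumulated cost passes the limit.
/// The fields are hypothetical; the commit only names the error.
#[derive(Debug, PartialEq, Eq)]
pub struct CostLimitExceeded {
    pub cost: u64,
    pub limit: u64,
}

/// Hypothetical stand-in for the cost fields carried on the interpreter Context.
#[derive(Debug)]
pub struct CostAccum {
    jit_cost: u64,
    jit_cost_limit: u64,
}

impl CostAccum {
    pub fn new(jit_cost_limit: u64) -> Self {
        Self { jit_cost: 0, jit_cost_limit }
    }

    /// Add a fixed cost, failing as soon as the running total exceeds the limit.
    pub fn add_jit_cost(&mut self, cost: u64) -> Result<(), CostLimitExceeded> {
        self.jit_cost = self.jit_cost.saturating_add(cost);
        if self.jit_cost > self.jit_cost_limit {
            return Err(CostLimitExceeded {
                cost: self.jit_cost,
                limit: self.jit_cost_limit,
            });
        }
        Ok(())
    }

    /// Current accumulated cost.
    pub fn jit_cost_value(&self) -> u64 {
        self.jit_cost
    }

    /// Reset the accumulator to zero, keeping the limit.
    pub fn reset_jit_cost(&mut self) {
        self.jit_cost = 0;
    }
}
```

In this sketch the accumulator keeps the over-limit total after a failed charge, so a caller can still report how far the script ran before it was stopped.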

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge each node's Scala base cost (FixedCost) at eval time: the per-Expr arms (Const=5, Global=5, Context=1), the ~40 single-cost eval ops, GlobalVars (Height=26, Self/Outputs/Inputs/GroupGenerator=10, MinerPubKey=20), method_call, and ConcreteCollection=20.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge each builtin-type method's Scala FixedCost in its EVAL_FN: Context (dataInputs/headers/preHeader/lastBlockUtxoRootHash 15, selfBoxIndex/minerPubKey 20, getVarFromInput 10), Box (value 8, getReg 50, tokens 15), Header/PreHeader fields (10 each; Header.checkPow 700), GroupElement, AvlTree (15 each; updateOperations 45, updateDigest 40), and Global generic methods (groupGenerator/xor/fromBigEndianBytes 10, some/none 5, deserialize per-item 100/32/32).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge each BinOp's Scala per-kind cost and route equality through DataValueComparer (per-element comparison costs). Includes the folded SigmaProp equality conformance fix: SigmaProp == SigmaProp now charges per Scala equalSigmaBoolean (ProveDlog 174 / ProveDHTuple 690) instead of the flat catch-all 3, matching the JVM equalDataValues on a rare type-valid path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Cost reduce_to_crypto: accumulate JIT cost as a delta from the caller's accumulator (cumulative across inputs, enabling per-tx limit enforcement), preserve ConstPlaceholders for cost eval via ErgoTree::proposition_for_cost_eval (+ Constant.resolved / Expr::resolve_placeholders), short-circuit trivial SigmaProp-constant trees (P2PK) at Scala's EvalSigmaPropConstant flat 50, and propagate the accumulated cost through the verifier (crypto_cost).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Enforce the per-transaction JIT cost limit cumulatively across all transaction inputs: derive the cost limit from chain parameters, share one Context accumulator across the per-input reduce_to_crypto calls so their costs sum, and fail verification when the cumulative cost exceeds the limit. Threaded through tx_context (validate), parameters, ergo_transaction, and the ergo-lib-wasm binding.
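
The cumulative enforcement can be sketched as below. This is an illustration under assumptions, not the PR's code: `check_tx_cost`, the `input_costs` slice (standing in for whatever each per-input reduce_to_crypto call would charge), and the error fields are all hypothetical.

```rust
/// Hypothetical error carrying the input at which the shared budget ran out.
#[derive(Debug, PartialEq, Eq)]
pub struct CostLimitExceeded {
    pub input_index: usize,
    pub cost: u64,
    pub limit: u64,
}

/// Sum per-input costs into one accumulator and stop at the first input
/// that pushes the transaction total over the limit.
pub fn check_tx_cost(input_costs: &[u64], limit: u64) -> Result<u64, CostLimitExceeded> {
    let mut total = 0u64; // one accumulator shared by every input
    for (input_index, cost) in input_costs.iter().enumerate() {
        total = total.saturating_add(*cost);
        if total > limit {
            return Err(CostLimitExceeded { input_index, cost: total, limit });
        }
    }
    Ok(total)
}
```

The point of sharing the accumulator is visible with three inputs of 400 against a limit of 1000: each input is under budget on its own, but the third one fails the transaction.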

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge Coll.indexOf its Scala cost: PerItemCost(20,10,2) over the iterations actually performed, plus the element-type equality cost per comparison via DataValueComparer::eq_with_cost. A bare == previously left both uncharged.
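
For readers unfamiliar with the notation, here is a sketch of how a PerItemCost(base, perChunk, chunkSize) triple could turn into a number. The chunk rounding `(n - 1) / chunkSize + 1` is an assumption about the Scala cost kind, not something stated in this PR; the `PerItemCost` struct and `INDEX_OF` constant are illustrative names. The per-comparison equality cost is left out.

```rust
/// Illustrative PerItemCost triple: base cost, cost per chunk, items per chunk.
#[derive(Clone, Copy, Debug)]
pub struct PerItemCost {
    pub base: i64,
    pub per_chunk: i64,
    pub chunk_size: i64,
}

impl PerItemCost {
    /// Assumed rounding: (n - 1) / chunk_size + 1, using truncating signed
    /// division so that n = 0 behaves the way JVM Int arithmetic would.
    pub fn chunks(&self, n_items: i64) -> i64 {
        (n_items - 1) / self.chunk_size + 1
    }

    /// Total cost for `n_items` processed items.
    pub fn cost(&self, n_items: i64) -> i64 {
        self.base + self.per_chunk * self.chunks(n_items)
    }
}

/// The indexOf parameters quoted in the commit message.
pub const INDEX_OF: PerItemCost = PerItemCost { base: 20, per_chunk: 10, chunk_size: 2 };
```

Under that assumption, a search that stops after five iterations covers three chunks and costs 20 + 3 * 10 = 50, before any equality costs are added.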

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Introduce eval_lambda_1arg, the shared collection-HOF lambda helper that charges ADD_TO_ENV_COST=5 per binding, and route flatMap through it (charged once per input element). Also charge flatMap's output-length PerItemCost(60,10,8) over the flattened result, matching Scala flatMap_eval.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge ADD_TO_ENV_COST per value binding in Apply (FuncValue invocation) and BlockValue (val definitions), matching Scala's per-binding AddToEnvironment.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Route the collection HOFs map/filter/fold/exists/forall through eval_lambda_1arg (ADD_TO_ENV charged per element) and charge their Scala per-iteration costs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…Many/reverse/startsWith/endsWith/get)

Charge the Scala PerItemCost for the remaining Coll methods: zip, indices, patch, updated, updateMany, reverse, startsWith, endsWith, get.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge SubstConstants its Scala cost scaled by the number of template constants substituted.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge Coll.slice its Scala cost over the requested (to - from) range.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge SigmaProp.propBytes its Scala cost scaled by the SigmaBoolean proposition node count (adds SigmaBoolean::size()).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge And and Blake2b256 their per-collection-element Scala costs, and charge boolean-constant collections 5 per element (N*5 for N elements) to match the Exprs form. Adds empty-collection regression tests (and_empty, calc_blake2b256_empty, bool_constants_coll).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge the Scala FixedCost for the v6.0 numeric and UnsignedBigInt methods.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge Global.encodeNbits=25 / decodeNbits=50 (FixedCost) and powHit's PowHitCostKind formula 500 + (k+1) * (totalLen/128 + 1) * 7, with regression tests.
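
The powHit formula transcribes directly into code. The sketch below assumes integer (truncating) division for `totalLen/128`; the function name and the `u64` types are illustrative, and what exactly `totalLen` measures is not stated here, so it is taken as an opaque byte count.

```rust
/// PowHitCostKind as quoted above: 500 + (k+1) * (totalLen/128 + 1) * 7.
/// Integer division is an assumption of this sketch.
pub fn pow_hit_cost(k: u64, total_len: u64) -> u64 {
    500 + (k + 1) * (total_len / 128 + 1) * 7
}
```

For example, k = 32 with a 200-byte total gives 500 + 33 * 2 * 7 = 962.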

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a SigmaByteWriter serialize-cost accumulator (enable_serialize_cost_tracking / serialize_cost / add_put_byte_cost / add_put_numeric_cost / add_put_chunk_cost) and charge the direct DataSerializer types: Boolean/Byte = PutByte, Short/Int/Long and all length prefixes = PutNumeric, String/Coll bytes = PutChunk, Opt tag = PutByte. Wire Global.serialize to meter the writer puts (StartWriterCost 10 + tracked put costs).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge the serialize put-cost for GroupElement (PutChunk over GROUP_SIZE), SigmaBoolean/SigmaProp, and BigInt256/UnsignedBigInt256 sigma_serialize.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge the serialize put-cost for ErgoBox: the register type-code prefix (TypeCode::sigma_serialize PutByte), creationHeight, value, and TxId, matching Scala's box serialization metering.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge the AvlTree serialize put-cost (digest + flags + key/value-length fields) during Global.serialize.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge the PutByte cost for the SType fast-path combined type-code arms (Coll/Nested/Option/Tuple-pair over primitives), with a regression test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charge the Header serialize put-cost by mirroring the JVM put sequence (HeaderWithoutPow + AutolykosSolution; Header lives in ergo-chain-types and can't reach the cost sink), reaching the blessed v6 Global.serialize[Header] = 333. Adds the cross-type serialize_charges_writer_costkinds regression test covering byte/numeric/coll (c17), GroupElement/UnsignedBigInt (c18), AvlTree (c20) and Header.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add the ergo-lib cost_parity integration test replaying mainnet blocks 700000-700060 and asserting each transaction's computed JIT cost matches the node's recorded cost (78 vectors), pinning the full costing model to consensus. Includes the headers / transactions / tx_costs test vectors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mwaddip and others added 4 commits June 4, 2026 12:34
Scala NumericCastCostKind bills both SBigInt and SUnsignedBigInt at
JitCost(30) (others 10); Upcast/Downcast were billing UnsignedBigInt the
else-branch 10. Mirror the oracle so the JIT cost of a numeric cast to
UnsignedBigInt matches the JVM.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tions

The seven verifier-backed AvlTree methods (contains, get, getMany, insert,
update, remove, insertOrUpdate) performed real proof verification but charged
no JIT cost at all — only the flat property accessors were costed. The JVM
prices every verifier op (CErgoTreeEvaluator):

- createVerifier: PerItemCost(110, 20, 64) over proof bytes — tree
  reconstruction is O(proof.length)
- contains/get/getMany: LookupAvlTree PerItemCost(40, 10, 1) x treeHeight
  per key
- insert: isInsertAllowed (15) + InsertIntoAvlTree PerItemCost(40, 10, 1)
  x max(treeHeight, 1) per entry
- update/insertOrUpdate: isUpdateAllowed (+isInsertAllowed for the latter)
  + UpdateAvlTree PerItemCost(120, 20, 1) x max(treeHeight, 1) per entry
- remove: isRemoveAllowed + RemoveAvlTree PerItemCost(100, 15, 1)
  x max(treeHeight, 1) per key + an unconditional digest read (15)
- updateDigest (40) when a mutation yields the updated tree

treeHeight is the digest's trailing height byte — the same source Scala's
BatchAVLVerifier.rootNodeHeight reads. Disallowed mutations charge only the
flag check, mirroring Scala's early None.
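
A rough sketch of how the listed charges compose, in JitCost units. It assumes the PerItemCost chunk rounding `(n - 1) / chunkSize + 1` and reads "x treeHeight" as using the height as the item count; both are assumptions of this sketch, and all function names are hypothetical.

```rust
/// PerItemCost as (base, per_chunk, chunk_size). Signed math keeps the
/// truncating division that JVM Int arithmetic would perform for n_items = 0.
fn per_item_cost(base: i64, per_chunk: i64, chunk_size: i64, n_items: i64) -> i64 {
    let chunks = (n_items - 1) / chunk_size + 1;
    base + per_chunk * chunks
}

/// createVerifier: PerItemCost(110, 20, 64) over the proof bytes.
pub fn create_verifier_cost(proof_len: i64) -> i64 {
    per_item_cost(110, 20, 64, proof_len)
}

/// contains/get/getMany: PerItemCost(40, 10, 1) over tree_height, once per key.
pub fn lookup_cost(tree_height: i64, n_keys: i64) -> i64 {
    n_keys * per_item_cost(40, 10, 1, tree_height)
}

/// insert: flag check (15), then PerItemCost(40, 10, 1) over max(height, 1)
/// per entry, plus updateDigest (40) for the updated tree.
/// A disallowed insert pays only the flag check.
pub fn insert_cost(tree_height: i64, n_entries: i64, insert_allowed: bool) -> i64 {
    let flag_check = 15;
    if !insert_allowed {
        return flag_check;
    }
    flag_check + n_entries * per_item_cost(40, 10, 1, tree_height.max(1)) + 40
}
```

With these readings, building a verifier over a 114-byte proof costs 110 + 2 * 20 = 150, and a single lookup in a height-1 tree adds 50.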

Any script touching an AVL verifier op under-counted the aggregate tx cost
(the consensus-risky direction near MaxBlockCost). Surfaced by the SANTA
transaction tier on captured testnet txs: block 2,666 (-20 block cost,
getMany over a 114-byte proof) and block 28,474 (-255, get + remove over
~3.3 KB proofs).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…er tree byte

Trees containing DeserializeContext/DeserializeRegister nodes go through a
whole-tree substitution pass that the JVM prices at ergoTree.bytes.length x
CostPerTreeByte(2) block cost (Interpreter.reductionWithDeserialize); Rust's
substitute_deserialize charged nothing.

The charge is limit-checked in every era. Since V6 activation it is also part
of the reported cost; pre-V6 the JVM passes the un-bumped context on, so the
result excludes it — mirrored by rolling the accumulator back after the limit
check when the activated script version is below V3.

Surfaced by the SANTA transaction tier on the captured testnet tx at block
111,927 (DeserializeContext spend): aggregate tx cost short by exactly 558 =
279 tree bytes x 2.
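
The era-dependent behaviour can be sketched as follows, working in block-cost units for simplicity. The function name, the plain `u64` accumulator, and the unit struct error are illustrative assumptions; the constant 2 and the "below V3" rollback rule come from the description above.

```rust
/// Block cost charged per ErgoTree byte when the tree contains deserialize nodes.
pub const COST_PER_TREE_BYTE: u64 = 2;

#[derive(Debug, PartialEq, Eq)]
pub struct CostLimitExceeded;

/// Charge the whole-tree substitution pass.
/// The bump is always limit-checked; it is kept in the reported cost only
/// when the activated script version is 3 or above (V6 activation).
pub fn charge_tree_bytes(
    acc: u64,
    limit: u64,
    tree_len: u64,
    activated_script_version: u8,
) -> Result<u64, CostLimitExceeded> {
    let bumped = acc + tree_len * COST_PER_TREE_BYTE;
    if bumped > limit {
        return Err(CostLimitExceeded);
    }
    // Pre-V6: roll the accumulator back after the limit check.
    Ok(if activated_script_version < 3 { acc } else { bumped })
}
```

A 279-byte tree adds 558 in this sketch when the version is 3, and adds nothing to the reported cost when the version is 2, although it can still trip the limit.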

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ult arm, not BigInt's

Scala's arith cost table special-cases ONLY SBigInt — every
ArithOpCompanion.costKind in trees.scala is 'case SBigInt => …;
case _ => …' — so SUnsignedBigInt arithmetic rides the default arm:
Plus/Minus 15, Multiply/Divide/Modulo 15, Max/Min 5 JitCost. eni routed
Value::UnsignedBigInt into the BigInt arm (20/25/10), overcharging every
UBI arith op.

Surfaced by the SANTA transaction tier on the captured testnet tx at block
28,474 (two UBI divisions, +10 JitCost each) and confirmed by the SANTA
decomposition spike: with the AvlTree verifier costs in place, the tx
graded +2 block over the JVM — exactly the two misclassified divisions.
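
The corrected dispatch can be pictured as a two-arm table. The enums and function below are hypothetical names for illustration; the numbers are the ones quoted in this commit message, reading 20/25/10 as Plus-Minus / Multiply-Divide-Modulo / Max-Min.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum NumKind { Int, Long, BigInt, UnsignedBigInt }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ArithOp { Plus, Minus, Multiply, Divide, Modulo, Max, Min }

/// JitCost of an arithmetic op: only BigInt is special-cased,
/// every other numeric kind (UnsignedBigInt included) takes the default arm.
pub fn arith_jit_cost(op: ArithOp, kind: NumKind) -> u32 {
    use ArithOp::*;
    match (kind, op) {
        (NumKind::BigInt, Plus | Minus) => 20,
        (NumKind::BigInt, Multiply | Divide | Modulo) => 25,
        (NumKind::BigInt, Max | Min) => 10,
        (_, Plus | Minus | Multiply | Divide | Modulo) => 15,
        (_, Max | Min) => 5,
    }
}
```

Two misrouted divisions therefore overcharge 2 * (25 - 15) = 20 JitCost, which matches the reported +2 block cost if block cost is JitCost divided by 10, as the figures in this message imply.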

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mwaddip and others added 6 commits June 6, 2026 01:40
…erialize nodes

Each DeserializeContext/DeserializeRegister node that actually substitutes
charges the JVM's deserialization complexity — bytes.length x
CostPerByteDeserialized(2) block cost (Interpreter.deserializeMeasured,
reached from substDeserialize for context vars and from
ErgoLikeInterpreter.substDeserialize for registers). Unlike the
per-tree-byte presence charge, this lands in the reported cost pre-V6 too
(the JVM threads it through the substitution context in both eras). An
absent var (node left in place) or an absent register falling back to
'default' charges nothing.

The substitution can now trip the cost limit, so SubstDeserializeError
gains a CostLimitExceeded variant.

Found by the SANTA decomposition spike (sigma-rust-tx-cost-decomposition-
followup.md): the 111,927 capture's var is absent, so the charge was
invisible in the tx-tier deltas — only the presence charge showed there.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…mit errors

reduce_to_crypto's retry arms matched a bare EvalError::CostError, but
enrich_err wraps every error — including CostError — in Spanned at each
eval node boundary, so the arm never fired and every cost-limit trip took
the diagnostic retry. The retry resets the accumulator and re-evaluates
the constant-substituted proposition() tree, which charges on a different
lattice (substituted Constants cost 5 JitCost where placeholders cost 1),
making the charged cost — and near the budget even the verdict —
path-dependent.

EvalError::is_cost_error() unwraps the span wrappers (and the
deserialize-substitution carrier) before matching; both retry sites now
guard on it and propagate the original error.
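
A simplified illustration of the unwrapping, using a cut-down error enum. The variant shapes (`Spanned` with a boxed source, a `SubstDeserialize` carrier, string payloads) are assumptions made for the sketch and do not reproduce the real EvalError definition.

```rust
/// Cut-down, hypothetical version of the evaluator's error type.
#[derive(Debug)]
pub enum EvalError {
    CostError(String),
    /// Wrapper added at each eval node boundary.
    Spanned { source: Box<EvalError>, span: (usize, usize) },
    /// Carrier for errors raised during deserialize substitution.
    SubstDeserialize(Box<EvalError>),
    Other(String),
}

impl EvalError {
    /// True if a cost error sits anywhere under the wrappers.
    pub fn is_cost_error(&self) -> bool {
        match self {
            EvalError::CostError(_) => true,
            EvalError::Spanned { source, .. } => source.is_cost_error(),
            EvalError::SubstDeserialize(inner) => inner.is_cost_error(),
            EvalError::Other(_) => false,
        }
    }
}
```

A bare pattern match on `EvalError::CostError` misses the wrapped value, which is the failure mode described above; the recursive check sees through any depth of wrapping.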

Found by the SANTA decomposition spike while instrumenting near the
per-tx budget.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…me chunk

Five writer events in the type serializer were uncharged under
Global.serialize cost tracking, where the JVM meters them (every
putUByte funnels into the costed put(Byte) = PutByteCost(1), putBytes
= PutChunkCost(3+n); TypeSerializer.scala:113/118/125-126/248):

- >4-arity tuple item-count byte (the one adversarially reachable
  site: a 5-tuple register type on a box fed to Global.serialize)
- SFunc tDom length byte (V3+)
- SFunc tpeParams length byte (V3+)
- STypeVar name-length byte
- STypeVar name bytes block (PutChunkCost over the length)

SFunc/STypeVar register types cannot carry a DataSerializer value, so
those sites are defensive parity. Pinned by a writer-level unit test
(formula-derived: 5-tuple 7 vs 4-tuple 5, STypeVar 6, SFunc 11) and an
end-to-end Global.serialize[Box] arity-boundary test (R4 = 5-tuple vs
4-tuple of bytes differs by exactly +3: count byte + extra type code +
extra data put; pre-fix delta was 2).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… serialize cost tracking

The JVM writes every expression/SigmaBoolean op code through the costed
w.put(opCode) (ValueSerializer / SigmaBoolean.serializer) and the Tuple
expression's item-count byte through TupleSerializer's putUByte
(PutByteCost each). Both reach Global.serialize via a legacy
box-register tuple expression (RegisterValue::ParsedTupleExpr) -- and
SigmaBoolean op codes via any serialized SigmaProp.

Meter the op-code byte inside OpCode::sigma_serialize (mirroring
TypeCode::sigma_serialize) and drop SigmaBoolean's now-redundant
explicit charge -- net SigmaBoolean cost is unchanged; charge the Tuple
item-count byte at its write. Pinned end-to-end: a (Byte,Byte)
tuple-expr register costs exactly +3 over the same data as a Constant
register (opcode + count + per-item type codes vs the pair's combined
code); pre-fix delta was 1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…erialize cost tracking

The JVM's SigmaBoolean.serializer writes CAND/COR child counts and
CTHRESHOLD's k + child count via putUShort, each costed
PutUnsignedNumericCost(3) under Global.serialize
(SigmaBoolean.scala:48/55/61/63); sigma-rust's put_u16 writes at these
four sites were uncharged. Adversarially reachable: any script
conjunction evaluates to a CAND/COR/CTHRESHOLD SigmaProp that can be
fed to Global.serialize.

Pinned end-to-end with dlog = opcode(1) + GroupElement chunk(36) = 37:
CAND/COR of two dlogs = +41 over a single dlog (opcode + count
putUShort(3) + the second dlog; pre-fix +38), CTHRESHOLD +3 more for k
(pre-fix +0).
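
The pinned numbers can be reproduced with a small recursive cost function. The enum below is a simplified stand-in (only ProveDlog leaves, no ProveDHTuple), and the constant names are illustrative; the values 1, 3 and 36 are the ones quoted in this message.

```rust
/// Simplified, hypothetical SigmaBoolean with dlog leaves only.
pub enum SigmaBoolean {
    ProveDlog,
    Cand(Vec<SigmaBoolean>),
    Cor(Vec<SigmaBoolean>),
    Cthreshold { k: u16, children: Vec<SigmaBoolean> },
}

const PUT_BYTE: u64 = 1; // op code
const PUT_UNSIGNED_NUMERIC: u64 = 3; // each putUShort
const GROUP_ELEMENT_CHUNK: u64 = 36; // chunk cost of one GroupElement

/// Serialize cost of a proposition under Global.serialize cost tracking.
pub fn serialize_cost(sb: &SigmaBoolean) -> u64 {
    let sum = |children: &[SigmaBoolean]| children.iter().map(serialize_cost).sum::<u64>();
    match sb {
        SigmaBoolean::ProveDlog => PUT_BYTE + GROUP_ELEMENT_CHUNK,
        // op code + child count + children
        SigmaBoolean::Cand(c) | SigmaBoolean::Cor(c) => {
            PUT_BYTE + PUT_UNSIGNED_NUMERIC + sum(c.as_slice())
        }
        // op code + k + child count + children
        SigmaBoolean::Cthreshold { children, .. } => {
            PUT_BYTE + 2 * PUT_UNSIGNED_NUMERIC + sum(children.as_slice())
        }
    }
}
```

A CAND of two dlogs comes out at 78, which is 41 over a single dlog, and the matching CTHRESHOLD at 81.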

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Under Global.serialize[Box] the JVM writes each token id via
putBytes => PutChunkCost(32) = 35 and each amount via putULong =>
PutUnsignedNumericCost(3) (ErgoBoxCandidate.scala:158/160); both
sigma-rust writes were uncharged. The indexed-digest arm mirrors the
JVM's no-info putUInt, which is genuinely unmetered (and is never
taken under Global.serialize -- sigmaSerializer passes None).

Pinned end-to-end: a one-token box costs exactly +38 over a token-less
twin (pre-fix +0); the token-count byte is charged for zero tokens too
and cancels.
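
As a worked check of the +38 figure, the token section's cost can be written out as below. The function name is hypothetical, and pricing the token-count byte at PutByteCost(1) is an assumption of this sketch.

```rust
/// Serialize cost of a box's token section under cost tracking.
pub fn tokens_serialize_cost(n_tokens: u64) -> u64 {
    let count_byte = 1; // token-count byte, charged even for zero tokens (assumed PutByte = 1)
    let token_id = 3 + 32; // PutChunkCost over a 32-byte id
    let amount = 3; // PutUnsignedNumericCost for putULong
    count_byte + n_tokens * (token_id + amount)
}
```

The difference between a one-token box and its token-less twin is 35 + 3 = 38, with the count byte present on both sides.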

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…annot start

Scala's CAvlTreeVerifier.treeHeight reads BatchAVLVerifier.rootNodeHeight,
which is assigned from the digest's trailing byte only after reconstruction's
up-front requires (keyLength > 0, digest length) pass. A non-positive
keyLength (signed Int on the JVM; wire values with the high bit set) fails
before the assignment, so the JVM charges the degenerate tree a zero-height
walk. We read the digest byte unconditionally, overcharging the height-scaled
Lookup/Insert/Update/Remove ops on those shapes. Failures during proof
parsing (malformed proof bytes, wrong value length) happen after the
assignment and correctly keep the digest-derived height.
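
The guard can be sketched like this. The function signature, the `u32` wire representation of keyLength, and the 33-byte digest length (32-byte root hash plus one height byte) are assumptions made for illustration.

```rust
/// Assumed digest layout: 32-byte root hash followed by one height byte.
pub const DIGEST_LEN: usize = 33;

/// Height used to scale Lookup/Insert/Update/Remove costs.
/// Returns 0 when the up-front checks would fail, so the walk is zero-height.
pub fn tree_height(digest: &[u8], key_length_wire: u32) -> u8 {
    // The JVM treats keyLength as a signed Int, so wire values with the
    // high bit set are negative there.
    let key_length = key_length_wire as i32;
    if key_length <= 0 || digest.len() != DIGEST_LEN {
        return 0;
    }
    digest[DIGEST_LEN - 1]
}
```

A well-formed tree keeps its digest-derived height, while a key length of 0 or 0x8000_0020 yields height 0 in this sketch.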

The degenerate path is observable end-to-end only with
ergoplatform/ergo_avltree_rust#14 (the crates.io verifier panics on these
inputs before any per-op charge is reached); pinned here at the unit level.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mwaddip commented Jun 7, 2026

Author

Pushed 592d4f9 — degenerate-tree op-cost height. Scala's CAvlTreeVerifier.treeHeight reads BatchAVLVerifier.rootNodeHeight, assigned from the digest's trailing byte only after reconstruction's up-front requires (keyLength > 0, digest length) pass; when they fail, no root is built and the JVM charges a zero-height walk. We read the digest byte unconditionally, overcharging the height-scaled AvlTree ops on trees with non-positive keyLength (a signed Int on the JVM). Note: the degenerate path is reachable end-to-end only with ergoplatform/ergo_avltree_rust#14 (the crates.io verifier panics on these inputs before costing is reached), so in-repo coverage is a unit pin on tree_height; validated against JVM-blessed conformance vectors.

