
feat: add realtime fork as new sibling crate #953

Merged
mikhailUshakoff merged 15 commits into master from feat/realtime-fork
May 4, 2026

@AnshuJalan
Collaborator

@AnshuJalan AnshuJalan commented Apr 28, 2026

Summary

Lands the realtime fork as a clean, self-contained sibling of shasta/permissionless/pacaya, ported from surge-real-time-poc + feat/l2-to-l1-to-l2-sync without dragging in the shasta-side and pacaya-side modifications those branches accumulated.

  • New crate realtime/ (~7,000 LOC): L1/L2 execution layers with bridge-callback simulation, async proposal submitter + bridge handler (UserOp status, mempool scan for return signals), realtime chain monitor, Raiko v3 client, deferred-finalization multicall builder, self-contained NodeConfig.
  • Fork::Realtime enum + FromStr + FORK env-var override (default-disabled timestamp; only activates on FORK=realtime).
  • Minimal common deltas consumed by realtime: taiko_driver::reorg_stale_block RPC, BuildPreconfBlockResponse.state_root (parsed leniently — defaults to B256::ZERO when missing so existing shasta/permissionless paths still work), monitor_new_transaction accepts optional tx-hash / tx-result notifiers (existing 2-arg callers untouched).

What's intentionally not here

Relationship to existing PRs

PR #945 (feat/l2-to-l1-to-l2-sync → surge-real-time-poc) and the surge-real-time-poc → master PR are left open for now as a fallback. Once this PR lands, they can be closed.

Test plan

  • cargo build --workspace
  • cargo clippy -p realtime -p common -p shasta -p permissionless -p node -p pacaya --all-features clean (no new warnings)
  • cargo fmt --all -- --check
  • cargo test --workspace — 122 tests pass across 16 suites
  • git diff --stat origin/master..feat/realtime-fork -- shasta/ pacaya/ is empty
  • git diff origin/master..feat/realtime-fork -- permissionless/ is exactly 1 line (state_root placeholder)
  • CI: e2e workflow (e2e_pytest.yml) passes against shasta config — proves shasta is not regressed
  • Manual: FORK=realtime end-to-end on devnet — L2→L1→L2 mempool scan picks up return signal, multicall finalizes correctly

🤖 Generated with Claude Code

Introduces the realtime fork (`realtime/`) as a clean sibling to
shasta/permissionless/pacaya, ported from the surge-real-time-poc and
feat/l2-to-l1-to-l2-sync work without dragging in the shasta-side and
pacaya-side modifications that those branches accumulated.

What's in the new fork:
- L1 + L2 execution layers with bridge-callback simulation
- Async proposal submitter, batch builder, bridge handler (UserOp status
  tracking, mempool scan for return signals)
- Realtime chain monitor for `RealTimeInbox::ProposedAndProved`
- Raiko v3 client; deferred-finalization multicall builder
- Self-contained `NodeConfig` (does not rely on pacaya::node::config,
  which was removed in #941 when pacaya became a utility crate)

Wiring:
- `Fork::Realtime` enum variant + `FromStr` impl + `FORK` env var override
  in common/src/fork_info, default-disabled timestamp (only activates
  when `FORK=realtime` is set)
- `Realtime` match arm in `Node/src/main.rs`
- Workspace `Cargo.toml`: adds realtime member + `sled` dep

Common deltas (consumed by realtime):
- `taiko_driver::reorg_stale_block` RPC + `ReorgStaleBlock{Request,Response}`
- `BuildPreconfBlockResponse.state_root` (parsed leniently — defaults to
  B256::ZERO if missing, so existing shasta/permissionless JSON paths
  remain compatible)
- `transaction_monitor::monitor_new_transaction` accepts optional
  tx-hash and tx-result oneshot notifiers; existing 2-arg callers are
  unaffected (only the new realtime async submitter passes Some)

Untouched from origin/master:
- `shasta/`, `pacaya/` — zero diff
- `permissionless/` — single line: `state_root: B256::ZERO` placeholder
  in the BuildPreconfBlockResponse construction, mirroring
  mikhailUshakoff's PR #939 approach
- `common/src/config/mod.rs` — the `TAIKO_BRIDGE_L2_ADDRESS` →
  `L2_BRIDGE_ADDRESS` rename flagged on PR #945 is intentionally
  deferred to a separate coordinated PR
- `common/src/shared/internal_server.rs` and the warp→axum migration —
  preserved as-is
- Dockerfile + x86-64 CI fixes — separate PR

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AnshuJalan and others added 3 commits April 28, 2026 11:13
…FORK env var

Drops the bespoke `FORK` env-var override + `Fork::FromStr` impl in
favour of the existing per-fork timestamp pattern. Adds
`config.realtime_timestamp_sec` (default 99999999999) which is then
threaded through `ForkInfoConfig`, matching how Shasta and
Permissionless are activated.

To run in realtime mode, set REALTIME_TIMESTAMP_SEC=0 (or any past
timestamp) at startup.
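
The per-fork timestamp pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in `common/src/fork_info`: only the `99999999999` default and the `REALTIME_TIMESTAMP_SEC` name come from the commit message. The two-variant `Fork` enum, the choice of Shasta as the fallback fork, and the helpers `parse_timestamp` and `active_fork` are assumptions made for the example.

```rust
/// Forks named in this PR; the real enum has more variants (assumption:
/// Shasta is the fork that runs when realtime is not active).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Fork {
    Shasta,
    Realtime,
}

/// Default from the commit message: far enough in the future that the
/// realtime fork stays disabled unless an operator overrides it.
pub const REALTIME_DISABLED_TIMESTAMP_SEC: u64 = 99_999_999_999;

/// Hypothetical helper: interpret the raw value of `REALTIME_TIMESTAMP_SEC`,
/// falling back to the disabled default when it is unset or malformed.
pub fn parse_timestamp(raw: Option<&str>) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(REALTIME_DISABLED_TIMESTAMP_SEC)
}

/// Hypothetical selector: realtime is active once `now_sec` reaches its
/// activation timestamp; otherwise the previous fork stays active.
pub fn active_fork(now_sec: u64, realtime_timestamp_sec: u64) -> Fork {
    if now_sec >= realtime_timestamp_sec {
        Fork::Realtime
    } else {
        Fork::Shasta
    }
}
```

With this shape, `REALTIME_TIMESTAMP_SEC=0` activates realtime immediately because any current time is at or past zero, while leaving the variable unset keeps the node on the earlier fork.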

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cargo.lock resolution after the realtime crate's deps no longer pulls
in the affected `lru` version, so cargo-deny's
`advisory-not-detected` warning fails the audit. Removing the now
unused ignore entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread realtime/src/l1/config.rs Outdated
mskrzypkows and others added 8 commits April 28, 2026 16:38
Removes all `#[allow(dead_code)]` markers in realtime/ and the items
they were guarding, rather than silencing warnings. Touches:

- ProtocolConfig: drop unused proof_verifier, signal_service fields and
  get_max_anchor_height_offset getter.
- RaikoClient: drop l2_network/l1_network fields and the corresponding
  RAIKO_L2_NETWORK / RAIKO_L1_NETWORK env reads — they were stored but
  never sent in proof requests (request body uses None).
- ContractAddresses + EthereumL1Config: drop signal_service field,
  L1_SIGNAL_SERVICE_ADDRESS env read, and the dead raiko_client chain
  (RaikoClient is constructed at lib.rs and routed through
  AsyncSubmitter, not ExecutionLayer).
- BridgeHandler: drop the l1_chain_id field and the unused
  parameter chain back to Node::new (lib.rs).
- BatchBuilder: drop unused `metrics` field/param + add_recovered_l2_block,
  add_l2_user_op_id methods.
- BatchManager: drop unused metrics, cancel_token fields and
  reanchor_block method.
- Node: drop unused metrics field/param.
- L2ExecutionLayer: drop unused TaikoConfig field, get_head_l1_origin,
  get_last_synced_block_params_from_geth, decode_block_params_from_tx_data,
  get_anchor_tx_data methods.
- Taiko: drop unused coinbase field, get_protocol_config,
  get_l2_block_by_number, fetch_l2_blocks_until_latest,
  decode_anchor_id_from_tx_data, get_anchor_tx_data wrappers.
- proposal::Proposals type alias.

Net: -249 / +5 lines, no new warnings, all 122 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sled is unmaintained, so cargo deny didn't accept it.
@mskrzypkows
Collaborator

@claude review

@claude

claude Bot commented Apr 29, 2026

Claude finished @mskrzypkows's task in 3m 53s


PR Review: feat/realtime-fork

Tasks:

  • Gather context and understand the PR
  • Review common/ changes (transaction_monitor, taiko_driver, fork_info)
  • Review realtime/ crate structure and entry point
  • Review L1 execution layer and config
  • Review L2 execution layer and taiko integration
  • Review node/proposal_manager (async_submitter, bridge_handler, batch_builder)
  • Review raiko client and utils
  • Review node/main.rs integration
  • Post comprehensive review

Overall this is a solid, well-structured port. The fork isolation goals are met — shasta/pacaya are untouched, and the common-crate changes are backwards-compatible. The bridge simulation logic and deferred-finalization multicall design are especially well-thought-out. Several issues are worth addressing before merging.


Addressing the existing review comment

realtime/src/l1/config.rs — "Can it be removed?"

The file cannot be removed as-is. ContractAddresses and EthereumL1Config are both actively used: ContractAddresses is cloned and passed into proposal_tx_builder.rs, and EthereumL1Config drives the TryFrom<RealtimeConfig> conversion in ELTrait::new. However, ContractAddresses duplicates the three address fields already stored in ExecutionLayer struct — it exists purely as a return type for contract_addresses(). If the reviewer is asking whether ContractAddresses can be collapsed into direct field access on ExecutionLayer, that's a reasonable simplification, but would require touching several call sites in proposal_tx_builder.rs and bridge_handler.rs. Not blocking, but worth a follow-up.


Bugs / Correctness

1. Signal slot orphaned after failed block advance [realtime/src/node/proposal_manager/mod.rs:425-492]

When advance_head_to_new_l2_block fails, remove_last_l2_block is called — but it only pops the L2 block from the proposal, not the signal slots that were already pushed via add_signal_slot. If pending_return_signal was injected into the anchor's fast-signals list and add_signal_slot succeeded before the error, the proposal's signal_slots vec will contain a slot for a block that was never built. On the next preconfirmation attempt, the mempool scan will re-discover the same tx, add the slot again, resulting in a duplicate that may cause _verifySignalSlots to revert.

To fix: either clear the added signal slots in the error path of add_draft_block_to_proposal, or add a remove_last_signal_slot paired with remove_last_l2_block.
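
One way to picture the first option is to stage nothing until the advance has succeeded. The sketch below is a simplified model under stated assumptions, not the code in `proposal_manager/mod.rs`: `Proposal`, `Manager`, and `add_block` are hypothetical stand-ins, the `advance` closure stands in for `advance_head_to_new_l2_block`, and signal slots are modelled as plain 32-byte arrays.

```rust
/// Simplified stand-in for the in-flight proposal (hypothetical fields).
#[derive(Debug, Default)]
pub struct Proposal {
    pub l2_blocks: Vec<u64>,
    pub signal_slots: Vec<[u8; 32]>,
}

/// Simplified stand-in for the manager state (hypothetical).
#[derive(Debug, Default)]
pub struct Manager {
    pub proposal: Proposal,
    pub pending_return_signal: Option<[u8; 32]>,
}

impl Manager {
    /// Stage the pending signal slot only after the block advance succeeds.
    /// `advance` stands in for `advance_head_to_new_l2_block`.
    pub fn add_block<F>(&mut self, block_id: u64, advance: F) -> Result<(), String>
    where
        F: FnOnce(u64) -> Result<(), String>,
    {
        // Peek at the pending signal without consuming it.
        let staged = self.pending_return_signal;

        // On failure nothing has been pushed yet, so there is nothing to roll
        // back and the pending signal is still available for the retry.
        advance(block_id)?;

        // Ok arm: commit the block and consume the pending signal together.
        self.proposal.l2_blocks.push(block_id);
        if let Some(slot) = staged {
            self.proposal.signal_slots.push(slot);
            self.pending_return_signal = None;
        }
        Ok(())
    }
}
```

Because the slot is pushed and the pending value cleared in the same success branch, a retry after a failed advance sees exactly one pending slot and cannot add a duplicate.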


2. Silent stub: transfer_eth_from_l2_to_l1 [realtime/src/l2/execution_layer.rs:143-152]

pub async fn transfer_eth_from_l2_to_l1(...) -> Result<(), Error> {
    warn!("Implement bridge transfer logic here");
    Ok(())
}

This returns Ok(()) silently. If it's reachable from any code path, callers believe the ETH was transferred when it wasn't. Either implement it, return an explicit Err, or add #[allow(dead_code)] with a doc comment making clear it's deliberately unimplemented.
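
A minimal sketch of the "return an explicit Err" option follows. It is synchronous and uses a hypothetical `BridgeError` type and a made-up `_amount_wei` parameter so that it compiles on its own; the real function is async and its signature is elided in the snippet above.

```rust
use std::fmt;

/// Hypothetical error type for the sketch; the crate's own `Error` would be
/// used in practice.
#[derive(Debug, PartialEq, Eq)]
pub enum BridgeError {
    NotImplemented(&'static str),
}

impl fmt::Display for BridgeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BridgeError::NotImplemented(what) => write!(f, "not implemented: {what}"),
        }
    }
}

impl std::error::Error for BridgeError {}

/// Fails loudly instead of pretending the transfer happened.
/// The parameter is illustrative only.
pub fn transfer_eth_from_l2_to_l1(_amount_wei: u128) -> Result<(), BridgeError> {
    Err(BridgeError::NotImplemented(
        "transfer_eth_from_l2_to_l1 (realtime fork)",
    ))
}
```

Any caller that reaches the stub now has to handle or propagate the error, so accidental wiring shows up in logs rather than as silently missing funds.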

3. assert! panic in async submission [realtime/src/node/proposal_manager/async_submitter.rs:78]

assert!(!self.is_busy(), "Cannot submit while another submission is in flight");

Unlike debug_assert!, assert! is not compiled out of release builds, so a violated invariant panics in production. The call site already guards with is_submission_in_progress(), but the invariant is still enforced only by a panic. A panic here would kill the preconfirmation loop (caught by the panic hook + cancel token, so it causes a full node restart, but it's still a crash). Should be a guarded return or Err.
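
A guarded return could look roughly like the sketch below. It assumes the busy state can be modelled as a single atomic flag; `Submitter`, `SubmitError`, and `finish` are hypothetical names, and the real `AsyncSubmitter` tracks an in-flight task rather than a bare boolean.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical error returned instead of panicking.
#[derive(Debug, PartialEq, Eq)]
pub enum SubmitError {
    Busy,
}

/// Simplified stand-in for the async submitter (hypothetical).
#[derive(Debug, Default)]
pub struct Submitter {
    in_flight: AtomicBool,
}

impl Submitter {
    pub fn is_busy(&self) -> bool {
        self.in_flight.load(Ordering::Acquire)
    }

    /// Claims the submission slot, or reports `Busy` if one is in flight.
    /// `compare_exchange` makes the check and the claim a single atomic step.
    pub fn submit(&self) -> Result<(), SubmitError> {
        self.in_flight
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .map(|_| ())
            .map_err(|_| SubmitError::Busy)
    }

    /// Releases the slot once the submission has completed or failed.
    pub fn finish(&self) {
        self.in_flight.store(false, Ordering::Release);
    }
}
```

The caller can then propagate `SubmitError::Busy` with `?` or skip the heartbeat, and the node keeps running even if the external guard is ever bypassed.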


Security

4. Unauthenticated RPC server [realtime/src/node/proposal_manager/bridge_handler.rs:166-400]

The surge_sendUserOp, surge_txStatus, and surge_simulateReturnMessage endpoints accept arbitrary input from any network caller. There is no authentication, authorization, or rate limiting. An attacker who can reach the listener address can:

  • Spam the UserOp queue (channel capacity 1024, then blocks)
  • Trigger expensive debug_traceCall RPC calls on L1/L2 nodes on every heartbeat
  • Enumerate tx status entries

At minimum, consider binding to 127.0.0.1 by default (not 0.0.0.0:4545), or adding a simple API-key header check.
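
The loopback-by-default suggestion can be sketched with the standard library alone. The helper names `resolve_bridge_rpc_addr` and `is_externally_reachable` are hypothetical; the sketch assumes the listener address arrives as an optional string (for example from a `BRIDGE_RPC_ADDR` environment variable) and that a node would log a warning when the resolved address is not loopback.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Loopback default, so the unauthenticated endpoints are local-only unless
/// an operator opts in.
pub const DEFAULT_BRIDGE_RPC_ADDR: &str = "127.0.0.1:4545";

/// Hypothetical helper: use the operator-supplied address if present,
/// otherwise the loopback default. Malformed input is an error, not a
/// silent fallback.
pub fn resolve_bridge_rpc_addr(raw: Option<&str>) -> Result<SocketAddr, AddrParseError> {
    raw.unwrap_or(DEFAULT_BRIDGE_RPC_ADDR).parse()
}

/// True when the listener would accept connections from other hosts,
/// which is a good moment to emit a startup warning.
pub fn is_externally_reachable(addr: &SocketAddr) -> bool {
    !addr.ip().is_loopback()
}
```

Binding to loopback does not add authentication; it only narrows who can reach the endpoints, so an API-key check or a reverse proxy is still worth considering for exposed deployments.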


Architecture

5. Cross-fork pacaya dependency [realtime/src/lib.rs:51, realtime/src/l2/execution_layer.rs:34, realtime/src/l1/execution_layer.rs:35, realtime/src/node/mod.rs (via pacaya::node::operator::Operator)]

realtime/ imports from pacaya/ in several places (pacaya::l2::config::TaikoConfig, pacaya::l1::operators_cache::OperatorError, pacaya::l1::traits::PreconfOperator, pacaya::node::operator::Operator). The PR description says this is a "self-contained sibling" but the Cargo dependency graph tells a different story. If pacaya/ changes break realtime/, the isolation claim is lost. These types should live in common/ (where TaikoConfig arguably belongs) or be duplicated/simplified for the realtime fork.

6. Hardcoded NodeConfig values [realtime/src/lib.rs:74-80]

let node_config = node::config::NodeConfig {
    preconf_heartbeat_ms: config.preconf_heartbeat_ms,
    handover_window_slots: 8,
    handover_start_buffer_ms: 500,
    l1_height_lag: 8,
    simulate_not_submitting_at_the_end_of_epoch: false,
};

The last four fields are hardcoded constants, not read from RealtimeConfig. The config struct NodeConfig exists but only one field is wired. If operators need to tune slot behavior this becomes a code change rather than a config change.
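
If these values were exposed later, one possible shape is a constructor that takes a lookup function and falls back to the current constants. This is only a sketch: the field types (`u64`), the environment variable names, and the helpers `parse_or` and `from_lookup` are all assumptions, and only the default values 8, 500, 8, and false come from the snippet above.

```rust
/// Mirror of the struct in the snippet above; field types are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NodeConfig {
    pub preconf_heartbeat_ms: u64,
    pub handover_window_slots: u64,
    pub handover_start_buffer_ms: u64,
    pub l1_height_lag: u64,
    pub simulate_not_submitting_at_the_end_of_epoch: bool,
}

/// Hypothetical helper: parse an optional raw value, or use the default when
/// it is missing or malformed.
fn parse_or<T: std::str::FromStr>(raw: Option<String>, default: T) -> T {
    raw.and_then(|s| s.trim().parse().ok()).unwrap_or(default)
}

impl NodeConfig {
    /// `get` abstracts the configuration source so the logic can be tested
    /// without touching the process environment. Variable names are made up.
    pub fn from_lookup(
        preconf_heartbeat_ms: u64,
        get: impl Fn(&str) -> Option<String>,
    ) -> Self {
        Self {
            preconf_heartbeat_ms,
            handover_window_slots: parse_or(get("HANDOVER_WINDOW_SLOTS"), 8),
            handover_start_buffer_ms: parse_or(get("HANDOVER_START_BUFFER_MS"), 500),
            l1_height_lag: parse_or(get("L1_HEIGHT_LAG"), 8),
            simulate_not_submitting_at_the_end_of_epoch: parse_or(
                get("SIMULATE_NOT_SUBMITTING_AT_THE_END_OF_EPOCH"),
                false,
            ),
        }
    }
}
```

In a binary, the lookup would typically be `|k| std::env::var(k).ok()`; with no variables set, the result matches today's hardcoded values.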


Fragility / Correctness Concerns

7. Bridge __ctx storage slots hardcoded [realtime/src/l1/execution_layer.rs:397-426]

The L1 callback simulation overrides storage slots 253 and 254 of the Bridge contract, assuming they correspond to __ctx.msgHash and __ctx.from + srcChainId. This is correct for the current Taiko Bridge v3 layout, but is not verified at runtime. If the Bridge contract is upgraded with a different storage layout (or if a proxy changes slot positions via delegatecall), this simulation will produce wrong results with no error signal — the caller will receive an incorrect return signal, and finalization will likely revert on-chain.

8. Hardcoded gas and fee values [realtime/src/l2/execution_layer.rs:108, 228-229]

  • Anchor tx: gas(1_000_000) — fixed
  • Bridge processMessage call: gas(3_000_000), max_fee_per_gas(1_000_000_000) (1 gwei)

The anchor gas limit is a copy of existing behavior, but the bridge call's 1 gwei max fee will cause the transaction to be stuck or fail if L2 base fee exceeds 1 gwei. This should be set dynamically from l2_slot_info.base_fee() (matching the anchor tx pattern).
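
The failure mode and the suggested fix can be shown with plain integer arithmetic. Under EIP-1559 a transaction is only includable when its max fee per gas is at least the block's base fee, so a fixed 1 gwei cap stops working as soon as the base fee rises above it. The helper names and the percentage headroom parameter below are hypothetical; the review only says the value should be derived from `l2_slot_info.base_fee()`.

```rust
pub const ONE_GWEI: u128 = 1_000_000_000;

/// EIP-1559 inclusion rule: the max fee must cover the block's base fee.
pub fn is_includable(max_fee_per_gas: u128, base_fee: u128) -> bool {
    max_fee_per_gas >= base_fee
}

/// Hypothetical helper: derive the max fee from the current base fee plus an
/// optional percentage of headroom. Saturating math avoids overflow panics.
pub fn max_fee_from_base_fee(base_fee: u128, headroom_percent: u128) -> u128 {
    base_fee.saturating_add(base_fee.saturating_mul(headroom_percent) / 100)
}
```

For a base fee of 2 gwei, the hardcoded 1 gwei cap fails the inclusion check, while a fee derived from the base fee passes it even with zero headroom.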


Performance

9. Duplicate blob encoding per proposal [realtime/src/node/proposal_manager/async_submitter.rs:155-184]

The manifest is encoded → compressed → built into a blob sidecar twice per submission: once here to extract blob hashes for the Raiko request, and again inside ProposalTxBuilder::build_propose_blob. The manifest can be several KB and compression is CPU-intensive. Consider caching the sidecar or passing pre-built blob hashes through the Proposal struct.
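
One way to cache the sidecar is a write-once cell on the proposal, so the first consumer pays for the encoding and later consumers reuse it. The sketch below assumes byte vectors as stand-ins for the manifest and the sidecar; `Proposal::sidecar`, `encode`, and the `encode_count` counter are hypothetical and exist only to make the caching behaviour observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Simplified stand-in for a proposal that lazily caches its blob sidecar.
#[derive(Debug)]
pub struct Proposal {
    manifest: Vec<u8>,
    sidecar: OnceLock<Vec<u8>>,
    encode_count: AtomicUsize,
}

impl Proposal {
    pub fn new(manifest: Vec<u8>) -> Self {
        Self {
            manifest,
            sidecar: OnceLock::new(),
            encode_count: AtomicUsize::new(0),
        }
    }

    /// Placeholder for the real encode -> compress -> build-sidecar pipeline.
    /// Reversing the bytes is arbitrary; it only marks that work happened.
    fn encode(&self) -> Vec<u8> {
        self.encode_count.fetch_add(1, Ordering::Relaxed);
        self.manifest.iter().rev().copied().collect()
    }

    /// Returns the cached sidecar, encoding it on first access only.
    pub fn sidecar(&self) -> &[u8] {
        self.sidecar.get_or_init(|| self.encode())
    }

    /// Number of times the expensive encoding actually ran.
    pub fn encode_count(&self) -> usize {
        self.encode_count.load(Ordering::Relaxed)
    }
}
```

Both the proof request and the transaction builder could then call `sidecar()` on the same proposal, and the encoding would run once regardless of call order.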

10. No HTTP timeout on Raiko client [realtime/src/raiko/mod.rs:98-107]

client: Client::new(),

The reqwest::Client is constructed with no timeout. If the Raiko service hangs on a POST, submission_task will be stuck indefinitely, blocking all future proposals (since AsyncSubmitter::is_busy() returns true). Set a per-request timeout:

Client::builder().timeout(Duration::from_secs(30)).build()?
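
If the timeout is made configurable rather than fixed, the parsing side might look like the standard-library sketch below. The function name `raiko_timeout` is hypothetical, the input is assumed to be the raw string of a variable such as `RAIKO_TIMEOUT_SEC`, and rejecting zero is a design choice of this example rather than something the PR specifies.

```rust
use std::time::Duration;

/// Default matching the 30-second value suggested above.
pub const DEFAULT_RAIKO_TIMEOUT_SEC: u64 = 30;

/// Hypothetical helper: turn an optional raw setting into a request timeout.
/// Unset means the default; malformed or zero values are reported as errors
/// so a typo does not silently change behaviour.
pub fn raiko_timeout(raw: Option<&str>) -> Result<Duration, String> {
    match raw {
        None => Ok(Duration::from_secs(DEFAULT_RAIKO_TIMEOUT_SEC)),
        Some(s) => match s.trim().parse::<u64>() {
            Ok(0) => Err("timeout must be greater than zero".to_string()),
            Ok(secs) => Ok(Duration::from_secs(secs)),
            Err(e) => Err(format!("invalid timeout {s:?}: {e}")),
        },
    }
}
```

The resulting `Duration` is what would be handed to the client builder's `timeout` call shown above.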

Minor / Nits

11. Duplicate SEND_MESSAGE_SELECTOR constant

Defined identically in both realtime/src/l1/execution_layer.rs:501 and realtime/src/l2/execution_layer.rs:378. Move to realtime/src/shared_abi/mod.rs or a shared constants module.

12. ContractAddresses not shown in RealtimeConfig Display [realtime/src/utils/config.rs:93-108]

The Display impl omits bridge, l2_signal_service, mock_mode, and raiko_max_retries. The startup log is the primary diagnostics tool, and missing these fields makes misconfiguration harder to catch.

13. Fjall DB path hardcoded [realtime/src/node/proposal_manager/bridge_handler.rs:156]

UserOpStatusStore::open("data/user_op_status")?;

Relative path, not configurable. Different deployment environments (containerized, multiple instances) will share or conflict on this path.

14. Status cleanup delay hardcoded to 60s [realtime/src/node/proposal_manager/async_submitter.rs:472]

The 60-second cleanup delay starts after L1 tx submission. If the proof and the tx monitor both took a while, the entry may already be gone by the time a client checks surge_txStatus. Consider starting the cleanup timer only after the Executed status is written, not after submission.
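
The suggested timing can be modelled with an in-memory map in which the expiry clock starts only when an entry reaches its final status. This is a sketch under heavy assumptions: the real store is backed by an on-disk database, and `StatusStore`, `Status`, and the explicit `now` parameters are hypothetical, chosen so the behaviour can be checked without sleeping.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Pending,
    Executed,
}

/// In-memory stand-in for the status store (hypothetical).
pub struct StatusStore {
    ttl: Duration,
    // Status plus the instant at which it became final, if it has.
    entries: HashMap<u64, (Status, Option<Instant>)>,
}

impl StatusStore {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    pub fn set_pending(&mut self, id: u64) {
        self.entries.insert(id, (Status::Pending, None));
    }

    /// Writing the final status is what starts the cleanup clock.
    pub fn mark_executed(&mut self, id: u64, now: Instant) {
        self.entries.insert(id, (Status::Executed, Some(now)));
    }

    pub fn get(&self, id: u64) -> Option<Status> {
        self.entries.get(&id).map(|(status, _)| *status)
    }

    /// Drops only entries whose final status is at least `ttl` old.
    /// Entries that are still pending are never purged here.
    pub fn purge_expired(&mut self, now: Instant) {
        let ttl = self.ttl;
        self.entries.retain(|_, (_, done_at)| match done_at {
            Some(t) => now.duration_since(*t) < ttl,
            None => true,
        });
    }
}
```

With this arrangement a slow proof or a slow transaction monitor cannot shorten the window in which clients can read the final status.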

15. getConfig() called twice on startup [realtime/src/l1/execution_layer.rs:83-87, realtime/src/lib.rs:60]

The RealTimeInbox.getConfig() call is made once during ExecutionLayer::new (for logging) and again via fetch_protocol_config() in create_realtime_node. The result from new could be stored in the struct.


Summary table

| Category | Count |
|---|---|
| Bug | 3 (signal orphan, silent stub, assert! panic) |
| Security | 1 (unauthenticated RPC) |
| Architecture | 2 (pacaya coupling, hardcoded config) |
| Fragility | 2 (storage slots, gas/fee hardcoding) |
| Performance | 2 (duplicate encoding, no HTTP timeout) |
| Minor/Nit | 5 |

The most important items to address before merge are #1 (signal orphan), #3 (assert! panic), #4 (unauthenticated RPC), #8 (hardcoded max fee), and #10 (no HTTP timeout). The rest are improvements or follow-up candidates.

Picks up the actionable items from the automated review on
#953 and leaves the rest documented as
follow-ups. Net change: +117 / -46 in realtime/.

Bug / correctness:
* (#1) Defer staging additions to the in-flight proposal until
  `advance_head_to_new_l2_block` succeeds. Previously a failed advance
  left orphan signal slots / user ops in the proposal that no L2 block
  corresponds to, so the next attempt would re-add the same slot from
  the mempool scan and `_verifySignalSlots` would revert on duplicate.
  `pending_return_signal` and `pending_mempool_tx_hash` are now only
  consumed in the Ok arm so retries see the same slot.
* (#3) `AsyncSubmitter::submit` no longer panics on the
  `is_busy()` invariant — returns `Err` and the caller propagates.
* (#2) `transfer_eth_from_l2_to_l1` returns an explicit
  "not implemented" `Err` instead of `Ok(())` to make accidental
  wiring loud. Realtime does not run the funds_controller flow.
* (#8) `processMessage` L2 call now uses the L2 slot's actual base
  fee for `max_fee_per_gas` instead of a hardcoded 1 gwei. Matches
  the anchor tx pattern.
* (#10) Raiko reqwest client now has a configurable timeout
  (`RAIKO_TIMEOUT_SEC`, default 30s) — a hung Raiko no longer
  deadlocks the async submitter forever.

Security:
* (#4) `BRIDGE_RPC_ADDR` defaults to `127.0.0.1:4545` instead of
  `0.0.0.0:4545` so the unauthenticated `surge_*` JSON-RPC endpoints
  are not exposed externally unless an operator explicitly opts in.

Usability:
* (#13) The fjall DB path for `UserOpStatusStore` is now configurable
  via `USER_OP_STATUS_DB_PATH` (default `data/user_op_status`),
  threaded through `BatchManager` → `BridgeHandler`.

Diagnostics / cleanup:
* (#12) `RealtimeConfig` Display now includes bridge, l2_signal_service,
  raiko_max_retries, raiko_timeout_sec, mock_mode, bridge_rpc_addr,
  user_op_status_db_path — startup log shows the full picture.
* (#11) `SEND_MESSAGE_SELECTOR` lives in `realtime/src/shared_abi/mod.rs`
  instead of being duplicated in both execution layers.

Deferred (separate PRs / issues):
* (#5) Cross-fork pacaya dependency — design decision; pacaya is a
  shared utility crate on master, deliberately reused.
* (#6) Hardcoded NodeConfig values (handover_window_slots etc.) — fine
  for the only current deployment; expose when a 2nd one shows up.
* (#7) Bridge `__ctx` storage slots hardcoded — needs runtime layout
  check; out of scope here.
* (#9) Duplicate blob encoding per submission — perf optimization.
* (#14) `surge_txStatus` cleanup timing.
* (#15) `getConfig()` called twice on startup.

Quality gate: `cargo build --workspace`, `cargo clippy --all-features`
on touched crates, `cargo fmt --check`, `cargo test --workspace`
(122 passed) all clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@AnshuJalan
Collaborator Author

Addressed in 5957dc8. Triage of the 15 findings:

Fixed (9 items):

| # | Topic | Fix |
|---|---|---|
| 1 | Signal slot orphan after failed advance_head_to_new_l2_block | Defer staging to the in-flight proposal until advance succeeds; on failure, leave pending_return_signal / pending_mempool_tx_hash intact so retries see the same slot and no duplicate slot is pushed |
| 2 | transfer_eth_from_l2_to_l1 silent stub | Now returns explicit Err("not implemented for the realtime fork"). Realtime doesn't run the funds_controller flow but Bridgeable requires it. |
| 3 | assert! panic in AsyncSubmitter::submit | Returns Result<(), Error>; caller propagates |
| 4 | Unauthenticated 0.0.0.0:4545 RPC bind | Default flipped to 127.0.0.1:4545. Operators that need external exposure set BRIDGE_RPC_ADDR explicitly |
| 8 | processMessage hardcoded max_fee_per_gas = 1 gwei | Plumbs l2_slot_info.base_fee() through construct_l2_call_tx. Matches the anchor tx pattern |
| 10 | No HTTP timeout on Raiko reqwest::Client | Configurable RAIKO_TIMEOUT_SEC, default 30s; a hung Raiko no longer deadlocks the async submitter |
| 11 | Duplicate SEND_MESSAGE_SELECTOR | Moved to realtime/src/shared_abi/mod.rs; both execution layers import |
| 12 | RealtimeConfig Display omits fields | Added bridge, l2_signal_service, raiko_max_retries, raiko_timeout_sec, mock_mode, bridge_rpc_addr, user_op_status_db_path |
| 13 | Hardcoded data/user_op_status path | New USER_OP_STATUS_DB_PATH env var, default data/user_op_status, threaded through BatchManager → BridgeHandler |

Deferred (6 items, by intent):

| # | Topic | Why deferred |
|---|---|---|
| 5 | Cross-fork pacaya dependency | Design decision: pacaya/ is now a shared utility crate on master (#941), not a runtime fork; both shasta and permissionless also depend on it. Promoting these primitives to common/ is its own PR that touches all three forks |
| 6 | Hardcoded NodeConfig values | Realtime devnet is the only deployment for now; will expose to env vars when a 2nd one shows up |
| 7 | Bridge __ctx storage slots hardcoded | Mitigation requires reading the deployed storage layout JSON at startup, which is significant scope. Filing as a follow-up; layout is correct for current Bridge v3 |
| 9 | Duplicate blob encoding per submission | Performance optimization, not correctness. Profile first before caching the sidecar on Proposal |
| 14 | surge_txStatus cleanup timer at 60s | Edge case; move trigger to Executed-status-write in a follow-up |
| 15 | getConfig() called twice on startup | Cosmetic; two RPCs at boot don't matter |

Quality gate after the fix landed on top of the rebased branch (which now includes the sled→fjall switch and tx-monitor handler refactor): cargo build --workspace, cargo clippy --all-features on touched crates, cargo fmt --check, cargo test --workspace (122 passed) all clean.

because it's expected that there could be no one to listen for the
result.
@mikhailUshakoff mikhailUshakoff merged commit 55caf87 into master May 4, 2026
6 of 8 checks passed
@mikhailUshakoff mikhailUshakoff deleted the feat/realtime-fork branch May 4, 2026 09:53

3 participants