feat: rpc fallback & metrics validation #2324
Conversation
📝 Walkthrough

This PR extends Forester's RPC resilience with fallback configuration, comprehensively refactors the metrics system using macros with centralized definitions, enriches queue data structures with batch and capacity tracking, and updates the dashboard and status reporting to expose registration and queue state information.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
- Simplified the handling of epoch registration and processing by removing redundant checks and consolidating error handling.
- Enhanced logging for better traceability of epoch processing events.
- Updated the logic to always process the current epoch regardless of registration status.
- Improved the registration phase checks to determine if registration is open or closed.
- Adjusted the handling of target epochs to streamline the flow of sending epochs for processing.

Enhance ForesterStatus to include registration state

- Added `registration_is_open` field to `ForesterStatus` to indicate if registration is currently open.
- Updated logic in `get_forester_status_with_options` to determine the current registration state and adjust the next registration epoch accordingly.
- Modified calculations for slots until next registration based on the registration state.

Update queue helpers to track total ZKP batches

- Extended `V2QueueInfo` and related structures to include `input_total_zkp_batches` and `output_total_zkp_batches`.
- Adjusted parsing functions to calculate total ZKP batches based on batch size and ZKP batch size.
- Ensured compatibility with existing queue processing logic.

Deploy script adjustments

- Updated the deploy script to focus on the `light_registry` library, removing references to other libraries.
Force-pushed from 4b96f12 to f0b8e74
- Introduced fallback RPC and indexer URLs in the environment configuration and CLI arguments to enhance resilience.
- Updated README.md to document new fallback options and their usage.
- Modified the API server and RPC pool to support automatic failover to fallback URLs on health check failures.
- Enhanced the configuration structs and tests to include fallback settings.
- Refactored related code for improved readability and maintainability.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
forester/src/epoch_manager.rs (1)

950-1011: ⚠️ Potential issue | 🟠 Major

Prevent `forester_epoch_*` gauges from moving backwards.

At lines 950 and 1011, metrics are set per processed epoch. Because older/newer epochs are queued and processed concurrently (e.g., lines 886-917), an older epoch can overwrite a newer gauge value. For "latest epoch" metrics, this produces stale/regressing values.

Use monotonic update semantics (`max(existing, incoming)`) in metric helpers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@forester/src/epoch_manager.rs` around lines 950 - 1011, The epoch metrics can regress because update_epoch_detected and update_epoch_registered set gauges directly and concurrent processing can overwrite newer values with older ones; change their implementations (the metric helper(s) used by update_epoch_detected/update_epoch_registered) to perform monotonic updates by reading the current gauge value and setting it to max(current, incoming) (i.e., only update when incoming > current) so gauges never move backwards; apply this max(existing, incoming) logic in the metric helper(s) so any callers like update_epoch_detected/update_epoch_registered benefit without changing call sites.

forester/src/config.rs (1)
150-160: ⚠️ Potential issue | 🟠 Major

`RpcPoolConfig::default()` is unsafe with derived zero-values.

At line 150, `#[derive(Default)]` now makes `failure_threshold` and `primary_probe_interval_secs` default to `0`, which is an unsafe runtime baseline for pool health behavior.

🔧 Proposed fix

```diff
-#[derive(Debug, Clone, Copy, Default)]
+#[derive(Debug, Clone, Copy)]
 pub struct RpcPoolConfig {
     pub max_size: u32,
     pub connection_timeout_secs: u64,
     pub idle_timeout_secs: u64,
     pub max_retries: u32,
     pub initial_retry_delay_ms: u64,
     pub max_retry_delay_ms: u64,
     pub failure_threshold: u64,
     pub primary_probe_interval_secs: u64,
 }
+
+impl Default for RpcPoolConfig {
+    fn default() -> Self {
+        Self {
+            max_size: 10,
+            connection_timeout_secs: 15,
+            idle_timeout_secs: 300,
+            max_retries: 10,
+            initial_retry_delay_ms: 1000,
+            max_retry_delay_ms: 16000,
+            failure_threshold: 3,
+            primary_probe_interval_secs: 30,
+        }
+    }
+}
```

Based on learnings: RPC pool configuration should use `--rpc-pool-size` (default 10), `--rpc-pool-connection-timeout-secs` (default 15), and `--rpc-pool-idle-timeout-secs` (default 300) for optimizing connection pooling.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@forester/src/config.rs` around lines 150 - 160, RpcPoolConfig's derived Default yields unsafe zero-values for fields like failure_threshold and primary_probe_interval_secs; replace the derived Default with an explicit impl Default for RpcPoolConfig that returns safe, non-zero defaults (e.g., max_size = 10, connection_timeout_secs = 15, idle_timeout_secs = 300) and set sensible values for max_retries, initial_retry_delay_ms, max_retry_delay_ms, failure_threshold, and primary_probe_interval_secs so pool health behavior is initialized correctly; update or remove #[derive(Default)] and implement RpcPoolConfig::default() to return these recommended defaults.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@forester/.env.example`:
- Line 61: The dotenv-linter warning is caused by an unquoted environment value;
update the PROCESSOR_MODE declaration (the `PROCESSOR_MODE` variable in
.env.example) to use quotes around the value (e.g., "all") so dotenv-linter no
longer flags ValueWithoutQuotes, ensuring the variable remains functionally
equivalent but lint-clean.
- Around line 25-30: The .env.example file is missing the top-level bootstrap
instruction required by our setup guidelines; add a clear one-line note at the
top of forester/.env.example instructing first-time users to copy the example
into .env (for example: "cp .env.example .env") so the Forester service is
initialized correctly before editing environment values; update the file near
the header (the commented RPC fallback block such as
FALLBACK_RPC_URL/FALLBACK_INDEXER_URL) to include this instruction as the first
visible line.
In `@forester/dashboard/tsconfig.tsbuildinfo`:
- Line 1: The PR accidentally adds the generated TypeScript build cache file
tsconfig.tsbuildinfo; remove this file from the commit/PR and update Git ignore
rules by adding *.tsbuildinfo (or forester/dashboard/*.tsbuildinfo) to
.gitignore so tsbuildinfo files are not tracked going forward, then commit the
.gitignore change and use git rm --cached tsconfig.tsbuildinfo (or equivalent)
before pushing the updated branch.
In `@forester/src/api_server.rs`:
- Around line 1177-1181: The prometheus_route currently never sees real errors
because metrics::metrics_handler swallows encoding failures and returns
Ok(empty_string); change metrics::metrics_handler to propagate failures (return
Result<String, E> or Result<impl Reply, E> instead of Ok("")) and update
prometheus_route to translate an Err into an explicit 5xx warp response (e.g.,
return a warp rejection or a warp reply with StatusCode::INTERNAL_SERVER_ERROR)
instead of treating everything as success; reference the prometheus_route and
metrics::metrics_handler symbols when making these changes so encoding errors
surface as HTTP 5xx responses to Prometheus scrapes.
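The shape of that fix can be sketched without warp: let the metrics handler return a `Result` and translate `Err` into a 5xx at the route layer. Names below (`encode_metrics`, `prometheus_route`, the `bool` parameter) are illustrative stand-ins, not the actual forester API:

```rust
// Sketch only: plain Rust, no warp. `encode_metrics` stands in for
// metrics::metrics_handler; `prometheus_route` for the warp route.
fn encode_metrics(encoding_ok: bool) -> Result<String, String> {
    if encoding_ok {
        Ok("forester_epoch_detected 42\n".to_string())
    } else {
        // Previously this path returned Ok(String::new()), hiding the failure.
        Err("metric encoding failed".to_string())
    }
}

fn prometheus_route(encoding_ok: bool) -> (u16, String) {
    match encode_metrics(encoding_ok) {
        Ok(body) => (200, body),
        // Surface the failure as an HTTP 5xx so Prometheus scrapes see it.
        Err(e) => (500, e),
    }
}

fn main() {
    assert_eq!(prometheus_route(true).0, 200);
    assert_eq!(prometheus_route(false).0, 500);
    println!("ok");
}
```

The key point is that an empty-but-200 response is indistinguishable from a healthy target with no metrics, while a 500 makes the scrape fail loudly.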
In `@forester/src/cli.rs`:
- Around line 174-180: The CLI accepts zero for failover timing which causes
pathological behavior; update the argument definitions for
rpc_pool_failure_threshold and rpc_pool_primary_probe_interval_secs to validate
non-zero values by adding a value_parser using
clap::value_parser!(u64).range(1..) so parsing rejects zero; modify the
attribute on the rpc_pool_failure_threshold field and the
rpc_pool_primary_probe_interval_secs field (the struct-level arg annotations) to
include value_parser = clap::value_parser!(u64).range(1..) so users cannot
supply 0 at parse time.
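What `clap::value_parser!(u64).range(1..)` enforces can be sketched with a stdlib-only parse function (the function name is hypothetical; in the real CLI the check lives in the `#[arg(..., value_parser = clap::value_parser!(u64).range(1..))]` attribute):

```rust
// Stdlib sketch of a parse-time lower-bound check: reject 0 (and non-numbers)
// before the value ever reaches the config, mirroring range(1..).
fn parse_nonzero_u64(raw: &str) -> Result<u64, String> {
    let v: u64 = raw
        .parse()
        .map_err(|e| format!("invalid u64 '{raw}': {e}"))?;
    if v == 0 {
        return Err(format!("value must be in range 1.., got {v}"));
    }
    Ok(v)
}

fn main() {
    assert_eq!(parse_nonzero_u64("3"), Ok(3));
    assert!(parse_nonzero_u64("0").is_err()); // rejected at parse time
    assert!(parse_nonzero_u64("abc").is_err());
    println!("ok");
}
```

Rejecting at parse time gives the user a clap-formatted error naming the flag, rather than a later runtime failure deep in pool health logic.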
In `@forester/src/config.rs`:
- Around line 351-352: The config currently assigns
args.rpc_pool_failure_threshold and args.rpc_pool_primary_probe_interval_secs
directly (failure_threshold and primary_probe_interval_secs), allowing zero
values; add a lower-bound validation where configuration is built (the code that
sets failure_threshold and primary_probe_interval_secs) to ensure both values
are >= 1 (or fail startup with a clear error), e.g., check
args.rpc_pool_failure_threshold >= 1 and
args.rpc_pool_primary_probe_interval_secs >= 1 and return/propagate a
configuration error if not, so invalid zero inputs cannot be used.
In `@forester/src/epoch_manager.rs`:
- Around line 818-827: The send failure for target epochs should be treated as
terminal: in the epoch monitor where tx.send(target_epoch).await is handled (the
block that currently logs error with event =
"epoch_monitor_send_target_epoch_failed" and then continues), change the control
flow to stop the monitor instead of continuing—mirror the fatal handling used
for current-epoch send failures by returning/propagating an error or breaking
out of the task after logging the error so the loop does not spin; update the
code surrounding tx.send(target_epoch).await and its error branch to terminate
the monitor (using the same pattern used for current-epoch send failures) while
keeping the existing log message and fields (self.run_id, target_epoch, ?e).
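The control-flow change asked for above can be sketched with a synchronous `std::sync::mpsc` channel (the real code uses a tokio channel with `.await`; `monitor_epochs` is a hypothetical stand-in for the monitor loop):

```rust
use std::sync::mpsc;

// Sketch: treat a failed send as terminal and stop the monitor,
// instead of logging and continuing to spin on a closed channel.
fn monitor_epochs(tx: mpsc::Sender<u64>, epochs: &[u64]) -> Result<usize, String> {
    let mut sent = 0;
    for &epoch in epochs {
        if let Err(e) = tx.send(epoch) {
            // Receiver is gone: the channel will never accept again,
            // so propagate the error rather than `continue`.
            return Err(format!("epoch channel closed while sending {epoch}: {e}"));
        }
        sent += 1;
    }
    Ok(sent)
}

fn main() {
    let (tx, rx) = mpsc::channel();
    assert_eq!(monitor_epochs(tx.clone(), &[1, 2, 3]), Ok(3));
    drop(rx); // simulate the processing side shutting down
    assert!(monitor_epochs(tx, &[4]).is_err());
    println!("monitor stopped on closed channel");
}
```

A send error on this kind of channel is not transient, which is why retry-by-continue only produces a busy loop.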
In `@forester/src/forester_status.rs`:
- Around line 792-822: The AddressV2 branch computes queue_len in "items"
(input_pending_batches * zkp_batch_size) but queue_cap is derived as
batches.len() * (batch_size / zkp_batch_size), causing a unit mismatch; update
TreeType::AddressV2 (in the code that calls parse_address_v2_queue_info and
computes queue_len/queue_cap) to use consistent units: either both represent
total items or both represent batch slots — e.g., to keep items, set queue_cap =
Some(v2_info.batches.len() as u64 * v2_info.zkp_batch_size as u64) (guard
zkp_batch_size>0) and ensure all casts to u64, or if you prefer batches, compute
queue_len = Some(v2_info.input_pending_batches) and queue_cap =
Some(v2_info.batches.len() as u64); make the corresponding change where
queue_len and queue_cap are returned so callers get consistent units.
- Around line 773-779: The current queue_cap computation uses ZKP batch slots;
update it to represent total item slots by multiplying batches.len() by
batch_size instead of (batch_size / zkp_batch_size). Locate where queue_cap is
computed from v2_info (variable name queue_cap inside StateV2 and the analogous
block in AddressV2) and replace the inner expression to compute i.batches.len()
as u64 * i.batch_size (preserve the existing guard for zero-sized or missing
v2_info); ensure both StateV2 and AddressV2 use the same change so
queue_capacity and queue_length are in the same item units.
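The unit mismatch is easiest to see in a small worked example. The struct and field names below mirror the review's description of `V2QueueInfo` and are assumptions, not the actual definitions:

```rust
// Illustrative unit fix: keep queue_len and queue_cap both in *items*.
struct V2QueueInfo {
    batches: usize,             // number of batch slots in the account
    batch_size: u64,            // items per batch
    zkp_batch_size: u64,        // items per ZKP batch
    input_pending_batches: u64, // pending ZKP batches
}

fn queue_len_and_cap(info: &V2QueueInfo) -> (u64, u64) {
    // Length in items: pending ZKP batches * items per ZKP batch.
    let len = info.input_pending_batches * info.zkp_batch_size;
    // Capacity in the same unit: batch slots * items per batch.
    // The mismatched form, batches * (batch_size / zkp_batch_size),
    // counts ZKP-batch slots and cannot be compared against `len`.
    let cap = info.batches as u64 * info.batch_size;
    (len, cap)
}

fn main() {
    let info = V2QueueInfo {
        batches: 2,
        batch_size: 1000,
        zkp_batch_size: 10,
        input_pending_batches: 5,
    };
    let (len, cap) = queue_len_and_cap(&info);
    assert_eq!((len, cap), (50, 2000)); // both in items, so len <= cap is meaningful
    println!("len={len} cap={cap}");
}
```

With the mismatched units the same inputs would report a "capacity" of 2 × (1000 / 10) = 200 slots against a length of 50 items, so any fill-ratio computed from them is meaningless.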
In `@forester/src/lib.rs`:
- Line 430: A background recovery probe is started via
arc_pool.spawn_primary_recovery_probe() and assigned to recovery_probe_handle
before a later fallible operation returns with ?, which can exit early and never
reach the existing abort block — leaking the spawned task. Fix by delaying
creation of recovery_probe_handle until after the fallible call succeeds or by
installing a guard that will abort recovery_probe_handle on
unwinding/early-return (e.g., wrap recovery_probe_handle in a Drop/ScopeGuard or
use a try { ... } pattern that ensures the existing abort block runs); reference
arc_pool.spawn_primary_recovery_probe, recovery_probe_handle, and the abort
block that currently runs at the end so the spawned task is always aborted on
every error path.
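The Drop/ScopeGuard option mentioned above can be sketched in plain Rust. `AbortGuard` and its API are illustrative, not from the codebase; in context the cleanup closure would call `recovery_probe_handle.abort()`:

```rust
use std::cell::Cell;

// Abort-on-drop guard: if an early `?` return leaves the scope before
// defuse() is called, the cleanup closure runs.
struct AbortGuard<F: FnMut()> {
    cleanup: F,
    armed: bool,
}

impl<F: FnMut()> AbortGuard<F> {
    fn new(cleanup: F) -> Self {
        Self { cleanup, armed: true }
    }
    fn defuse(mut self) {
        self.armed = false; // success path: Drop will skip the cleanup
    }
}

impl<F: FnMut()> Drop for AbortGuard<F> {
    fn drop(&mut self) {
        if self.armed {
            (self.cleanup)(); // error path: e.g. handle.abort()
        }
    }
}

fn main() {
    // Error path: guard dropped without defuse -> cleanup fires.
    let aborted = Cell::new(false);
    {
        let _g = AbortGuard::new(|| aborted.set(true));
        // an early `return Err(..)?` would drop the guard here
    }
    assert!(aborted.get());

    // Success path: defused guard skips cleanup.
    let aborted = Cell::new(false);
    let g = AbortGuard::new(|| aborted.set(true));
    g.defuse();
    assert!(!aborted.get());
    println!("guard ok");
}
```

Either this guard pattern or simply delaying the spawn until after the last fallible call achieves the same invariant: no error path can leak the probe task.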
---
Outside diff comments:
In `@forester/src/config.rs`:
- Around line 150-160: RpcPoolConfig's derived Default yields unsafe zero-values
for fields like failure_threshold and primary_probe_interval_secs; replace the
derived Default with an explicit impl Default for RpcPoolConfig that returns
safe, non-zero defaults (e.g., max_size = 10, connection_timeout_secs = 15,
idle_timeout_secs = 300) and set sensible values for max_retries,
initial_retry_delay_ms, max_retry_delay_ms, failure_threshold, and
primary_probe_interval_secs so pool health behavior is initialized correctly;
update or remove #[derive(Default)] and implement RpcPoolConfig::default() to
return these recommended defaults.
In `@forester/src/epoch_manager.rs`:
- Around line 950-1011: The epoch metrics can regress because
update_epoch_detected and update_epoch_registered set gauges directly and
concurrent processing can overwrite newer values with older ones; change their
implementations (the metric helper(s) used by
update_epoch_detected/update_epoch_registered) to perform monotonic updates by
reading the current gauge value and setting it to max(current, incoming) (i.e.,
only update when incoming > current) so gauges never move backwards; apply this
max(existing, incoming) logic in the metric helper(s) so any callers like
update_epoch_detected/update_epoch_registered benefit without changing call
sites.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (4)
- `.github/workflows/metrics-contract.yml` is excluded by none and included by none
- `external/photon` is excluded by none and included by none
- `forester-utils/src/rpc_pool.rs` is excluded by none and included by none
- `scripts/deploy/deploy.sh` is excluded by none and included by none
📒 Files selected for processing (26)
- forester/.env.example
- forester/.gitignore
- forester/README.md
- forester/dashboard/src/app/page.tsx
- forester/dashboard/src/components/EpochCard.tsx
- forester/dashboard/src/components/TreeTable.tsx
- forester/dashboard/src/types/forester.ts
- forester/dashboard/tsconfig.tsbuildinfo
- forester/metrics-contract.json
- forester/src/api_server.rs
- forester/src/cli.rs
- forester/src/config.rs
- forester/src/epoch_manager.rs
- forester/src/forester_status.rs
- forester/src/lib.rs
- forester/src/metrics.rs
- forester/src/processor/v1/helpers.rs
- forester/src/processor/v1/send_transaction.rs
- forester/src/processor/v2/common.rs
- forester/src/queue_helpers.rs
- forester/tests/e2e_test.rs
- forester/tests/legacy/priority_fee_test.rs
- forester/tests/legacy/test_utils.rs
- forester/tests/metrics_contract_test.rs
- forester/tests/priority_fee_test.rs
- forester/tests/test_utils.rs
- Added value parser with range validation for `rpc_pool_failure_threshold` and `rpc_pool_primary_probe_interval_secs` in CLI arguments to ensure they are greater than 0.
- Updated error handling in `EpochManager` to return an error when the epoch channel is closed instead of continuing the loop.
- Simplified queue capacity calculation in `parse_tree_status` by removing unnecessary checks for `zkp_batch_size`.
- Moved the spawning of the primary recovery probe to after the pool is built in `run_pipeline_with_run_id` for better clarity and structure.
Actionable comments posted: 2
♻️ Duplicate comments (2)
forester/src/lib.rs (1)
469-474: ⚠️ Potential issue | 🟠 Major

Abort `slot_tracker_handle` on tracker-init error path.

`slot_tracker_handle` starts at line 444, but lines 472-473 can early-return with `?`. That leaves the SlotTracker task running after pipeline init failed.

🛠️ Proposed fix

```diff
     let tracker_handles = if let Some(trackers) = preconfigured_trackers {
         trackers
     } else {
-        initialize_compressible_trackers(config.clone(), shutdown_compressible, shutdown_bootstrap)
-            .await?
+        match initialize_compressible_trackers(
+            config.clone(),
+            shutdown_compressible,
+            shutdown_bootstrap,
+        )
+        .await
+        {
+            Ok(handles) => handles,
+            Err(e) => {
+                slot_tracker_handle.abort();
+                return Err(e);
+            }
+        }
     };
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@forester/src/lib.rs` around lines 469 - 474, The SlotTracker task started in slot_tracker_handle must be aborted if tracker initialization fails; modify the branch around initialize_compressible_trackers so that on the error path you call slot_tracker_handle.abort() (or otherwise stop/join it) before returning the propagated error from initialize_compressible_trackers; locate the variables slot_tracker_handle, preconfigured_trackers, and initialize_compressible_trackers to add the abort call just prior to the early return (the assignment to tracker_handles) so the SlotTracker does not remain running after pipeline init fails.

forester/.env.example (1)

25-30: ⚠️ Potential issue | 🟡 Minor

Add the required `.env` bootstrap instruction at file top.

Please add `cp .env.example .env` as the first visible line/comment in this file so first-time setup matches the Forester setup flow.

Proposed fix

```diff
+## First-time setup: cp .env.example .env
+
 # Core RPC Configuration
 export RPC_URL="https://api.devnet.solana.com"
```

As per coding guidelines, `forester/**/.env.example`: Copy the example environment file using `cp .env.example .env` before configuring the Forester service.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@forester/.env.example` around lines 25 - 30, The .env.example file is missing the bootstrap instruction to copy the example into an active .env; add a visible top-line comment containing the command cp .env.example .env as the first visible line in the file (before any other commented variables) so first-time setup follows the Forester setup flow and users know to create their .env from .env.example.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@forester/src/epoch_manager.rs`:
- Around line 950-951: The spawned process_epoch tasks call
update_epoch_detected and update_epoch_registered which currently call
forester_epoch_detected.set(...) / forester_epoch_registered.set(...) and allow
older tasks to regress the metric; change the gauge backing to a monotonic
atomic (e.g., store an AtomicI64 or AtomicU64 per gauge) and update the setter
functions (the gauge setter helpers used by
update_epoch_detected/update_epoch_registered defined alongside
forester_epoch_detected/forester_epoch_registered) to perform a compare-and-swap
loop that only replaces the stored value when the new epoch is greater
(monotonic-max), then update the exported gauge from that atomic, ensuring
process_epoch concurrent completions cannot move the reported latest epoch
backward.
In `@forester/src/forester_status.rs`:
- Around line 172-200: The code incorrectly uses
protocol_config_pda.config.is_registration_phase(slot).is_ok() (which always
returns Ok due to disabled validation) to set registration_is_open; instead call
the epoch-phase inspector to get concrete phase boundaries and compare the slot
against the registration phase range. Replace the is_ok() check with a direct
boundary test using protocol_config_pda.config.get_epoch_phases(slot) (or
get_epoch_phases for the current/latest epoch), extract the registration phase
start/end (or registration_phase.range), set registration_is_open = slot >=
registration_start && slot < registration_end, and then compute
current_registration_epoch and slots_until_next_registration by subtracting slot
from the appropriate phase end/start using latest_register_epoch,
registration_phase_length, active_phase_length and
protocol_config_pda.config.get_latest_register_epoch_progress(slot) as before.
---
Duplicate comments:
In `@forester/.env.example`:
- Around line 25-30: The .env.example file is missing the bootstrap instruction
to copy the example into an active .env; add a visible top-line comment
containing the command cp .env.example .env as the first visible line in the
file (before any other commented variables) so first-time setup follows the
Forester setup flow and users know to create their .env from .env.example.
In `@forester/src/lib.rs`:
- Around line 469-474: The SlotTracker task started in slot_tracker_handle must
be aborted if tracker initialization fails; modify the branch around
initialize_compressible_trackers so that on the error path you call
slot_tracker_handle.abort() (or otherwise stop/join it) before returning the
propagated error from initialize_compressible_trackers; locate the variables
slot_tracker_handle, preconfigured_trackers, and
initialize_compressible_trackers to add the abort call just prior to the early
return (the assignment to tracker_handles) so the SlotTracker does not remain
running after pipeline init fails.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (6)
- forester/.env.example
- forester/dashboard/.gitignore
- forester/src/cli.rs
- forester/src/epoch_manager.rs
- forester/src/forester_status.rs
- forester/src/lib.rs
Commented code:

```rust
update_epoch_detected(epoch);
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Script 1: Show per-epoch async task spawning
echo "=== PER-EPOCH ASYNC TASK SPAWNING ==="
rg -n "epoch_queued_for_processing|tokio::spawn|process_epoch" forester/src/epoch_manager.rs -A8 -B8
echo ""
echo "=== GAUGE UPDATES IN PROCESS_EPOCH ==="
# Script 2: Show where epoch gauges are updated
rg -n "update_epoch_detected|update_epoch_registered" forester/src/epoch_manager.rs -A4 -B4
echo ""
echo "=== GAUGE SETTER IMPLEMENTATION ==="
# Script 3: Show gauge setters
rg -n "pub fn update_epoch_detected|pub fn update_epoch_registered|set" forester/src/metrics.rs -A8 -B3 | head -100
```

Repository: Lightprotocol/light-protocol
Length of output: 10425
Fix metric regressions from concurrent epoch task updates.
Lines 950 and 1011 update forester_epoch_detected and forester_epoch_registered gauges from spawned process_epoch tasks that run concurrently per epoch. The gauge setters (lines 236–244 in metrics.rs) use simple .set() semantics with no ordering protection—an older epoch's task finishing after a newer one will overwrite the metric with a lower value, causing gauges to move backward.
Replace the gauge updates with monotonic-max semantics using atomic compare-and-swap on an AtomicI64/AtomicU64, ensuring concurrent older epochs cannot regress the reported "latest" epoch value.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@forester/src/epoch_manager.rs` around lines 950 - 951, The spawned
process_epoch tasks call update_epoch_detected and update_epoch_registered which
currently call forester_epoch_detected.set(...) /
forester_epoch_registered.set(...) and allow older tasks to regress the metric;
change the gauge backing to a monotonic atomic (e.g., store an AtomicI64 or
AtomicU64 per gauge) and update the setter functions (the gauge setter helpers
used by update_epoch_detected/update_epoch_registered defined alongside
forester_epoch_detected/forester_epoch_registered) to perform a compare-and-swap
loop that only replaces the stored value when the new epoch is greater
(monotonic-max), then update the exported gauge from that atomic, ensuring
process_epoch concurrent completions cannot move the reported latest epoch
backward.
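The monotonic-max setter described above can be sketched as a compare-and-swap loop over an atomic. The helper name `update_max` and the example epoch values are illustrative; the real helpers would then mirror the atomic's value into the Prometheus gauge:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Monotonic-max update: only moves the stored value forward.
/// Returns the value that ends up stored.
fn update_max(cell: &AtomicU64, incoming: u64) -> u64 {
    let mut current = cell.load(Ordering::Relaxed);
    loop {
        if incoming <= current {
            return current; // an older epoch finished late; keep the newer value
        }
        match cell.compare_exchange_weak(current, incoming, Ordering::Relaxed, Ordering::Relaxed) {
            Ok(_) => return incoming,
            Err(actual) => current = actual, // raced with another task; retry
        }
    }
}

fn main() {
    let latest_epoch = AtomicU64::new(0);
    update_max(&latest_epoch, 42); // newer epoch completes: stored
    update_max(&latest_epoch, 7);  // stale epoch completes late: ignored
    assert_eq!(latest_epoch.load(Ordering::Relaxed), 42);
    println!("{}", latest_epoch.load(Ordering::Relaxed));
}
```

Because the loop re-reads on CAS failure, concurrent callers can interleave in any order and the stored value still converges to the maximum, which is exactly the "latest epoch" semantics the gauges need.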
Commented diff hunk:

```diff
         // Determine if registration is currently open
         let registration_is_open = protocol_config_pda
             .config
-            .registration_phase_length
-            .saturating_sub(active_epoch_progress);
+            .is_registration_phase(slot)
+            .is_ok();

         // If registration is closed, show the next epoch as the registration target
         let current_registration_epoch = if registration_is_open {
             latest_register_epoch
         } else {
             latest_register_epoch + 1
         };

         let slots_until_next_registration = if registration_is_open {
             // Slots until current registration closes
             let epoch_progress = protocol_config_pda
                 .config
                 .get_latest_register_epoch_progress(slot)
                 .unwrap_or(0);
             registration_phase_length.saturating_sub(epoch_progress)
         } else {
             // Slots until next epoch's registration opens
             active_phase_length.saturating_sub(
                 protocol_config_pda
                     .config
                     .get_latest_register_epoch_progress(slot)
                     .unwrap_or(0),
             )
         };
```
🧩 Analysis chain
🏁 Script executed:

```shell
# First, locate the registry module files
fd -t f "state.rs" programs/registry/src/protocol_config/
```

Repository: Lightprotocol/light-protocol
Length of output: 118

🏁 Script executed:

```shell
# Inspect the is_registration_phase implementation
rg -n "pub fn is_registration_phase" --type rust -A 30
```

Repository: Lightprotocol/light-protocol
Length of output: 2927

🏁 Script executed:

```shell
# Look at the protocol_config module structure
fd -t f "\.rs$" programs/registry/src/protocol_config/
```

Repository: Lightprotocol/light-protocol
Length of output: 263

🏁 Script executed:

```shell
# Check for phase boundary definitions
rg -n "registration_phase_length|registration.*start|registration.*end" --type rust -B 2 -A 2
```

Repository: Lightprotocol/light-protocol
Length of output: 37073

🏁 Script executed:

```shell
# Look for any CLAUDE.md or guidance docs
find . -name "CLAUDE.md" -o -name "claude.md" 2>/dev/null
```

Repository: Lightprotocol/light-protocol
Length of output: 718

🏁 Script executed:

```shell
# Check if get_epoch_phases is imported in forester_status.rs
head -100 forester/src/forester_status.rs | grep -E "^use|^mod"
```

Repository: Lightprotocol/light-protocol
Length of output: 734

🏁 Script executed:

```shell
# Check the full import section and verify get_epoch_phases availability
rg -n "get_epoch_phases|use.*epoch" forester/src/forester_status.rs -B 2 -A 2 | head -40
```

Repository: Lightprotocol/light-protocol
Length of output: 900

🏁 Script executed:

```shell
# Check how forester_epoch.rs is used in the forester module
rg -n "mod forester_epoch|use.*forester_epoch" forester/src/
```

Repository: Lightprotocol/light-protocol
Length of output: 929

🏁 Script executed:

```shell
# Get the exact code at lines 170-210 in forester_status.rs to verify the diff
sed -n '170,210p' forester/src/forester_status.rs
```

Repository: Lightprotocol/light-protocol
Length of output: 1588

🏁 Script executed:

```shell
# Also check what get_latest_register_epoch_progress returns and how it's used
rg -n "get_latest_register_epoch_progress" --type rust -B 2 -A 5 | head -50
```

Repository: Lightprotocol/light-protocol
Length of output: 2698

🏁 Script executed:

```shell
# Verify the exact structure of EpochPhases and get_epoch_phases signature
rg -n "pub struct EpochPhases|pub fn get_epoch_phases" forester-utils/src/forester_epoch.rs -A 15 | head -60
```

Repository: Lightprotocol/light-protocol
Length of output: 1489

🏁 Script executed:

```shell
# Check the latest_register_epoch to see if it's the current epoch to register for
rg -n "get_latest_register_epoch" programs/registry/src/protocol_config/state.rs -A 10 -B 2
```

Repository: Lightprotocol/light-protocol
Length of output: 1885
Fix registration phase detection to use explicit slot boundaries instead of disabled API check.
Lines 175–176 call .is_ok() on is_registration_phase(slot), which always returns Ok(...) because the phase boundary validation is commented out (disabled intentionally per the code comment). This causes registration_is_open to be stuck true, which distorts both current_registration_epoch and slots_until_next_registration calculations.
The codebase already has the correct infrastructure (get_epoch_phases() imported at line 15) and uses explicit slot boundary checks throughout (e.g., epoch_manager.rs, forester_epoch.rs). Replace the .is_ok() check with direct boundary comparison:
Suggested fix
- // Determine if registration is currently open
- let registration_is_open = protocol_config_pda
- .config
- .is_registration_phase(slot)
- .is_ok();
+ let current_reg_phases = get_epoch_phases(&protocol_config_pda.config, latest_register_epoch);
+ let registration_is_open = slot >= current_reg_phases.registration.start
+ && slot < current_reg_phases.registration.end;
// If registration is closed, show the next epoch as the registration target
let current_registration_epoch = if registration_is_open {
latest_register_epoch
} else {
latest_register_epoch + 1
};
- let slots_until_next_registration = if registration_is_open {
- // Slots until current registration closes
- let epoch_progress = protocol_config_pda
- .config
- .get_latest_register_epoch_progress(slot)
- .unwrap_or(0);
- registration_phase_length.saturating_sub(epoch_progress)
- } else {
- // Slots until next epoch's registration opens
- active_phase_length.saturating_sub(
- protocol_config_pda
- .config
- .get_latest_register_epoch_progress(slot)
- .unwrap_or(0),
- )
- };
+ let slots_until_next_registration = if registration_is_open {
+ current_reg_phases.registration.end.saturating_sub(slot)
+ } else {
+ let next_reg_phases =
+ get_epoch_phases(&protocol_config_pda.config, latest_register_epoch + 1);
+ next_reg_phases.registration.start.saturating_sub(slot)
+ };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@forester/src/forester_status.rs` around lines 172 - 200, The code incorrectly
uses protocol_config_pda.config.is_registration_phase(slot).is_ok() (which
always returns Ok due to disabled validation) to set registration_is_open;
instead call the epoch-phase inspector to get concrete phase boundaries and
compare the slot against the registration phase range. Replace the is_ok() check
with a direct boundary test using
protocol_config_pda.config.get_epoch_phases(slot) (or get_epoch_phases for the
current/latest epoch), extract the registration phase start/end (or
registration_phase.range), set registration_is_open = slot >= registration_start
&& slot < registration_end, and then compute current_registration_epoch and
slots_until_next_registration by subtracting slot from the appropriate phase
end/start using latest_register_epoch, registration_phase_length,
active_phase_length and
protocol_config_pda.config.get_latest_register_epoch_progress(slot) as before.