perf(runtime): reduce timing bias by reordering timestamps #112
Merged
rocketman-code merged 2 commits into main, Feb 25, 2026
This was referenced Feb 25, 2026
Shrink Guard from 56 bytes to 16 bytes (two registers) by replacing Instant + ThreadId + alloc snapshot with a raw TSC tick and packed thread-cookie/depth. Hot path (enter/drop) is now inlined with a single rdtsc/cntvct_el0 instruction; all bookkeeping is split into cold out-of-line functions. New tsc module handles hardware counter reads, one-time calibration (~2ms spin), and tick-to-nanosecond conversion via simplified ratio. MSRV bumped 1.56 -> 1.59 for core::arch::asm! (inline assembly stabilized in 1.59). Updated Cargo.toml, CI, docs, and MSRV integration test accordingly. Adds _test_internals feature to expose collect_invocations() for external integration tests (calibration harness).
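The tick-to-nanosecond conversion "via simplified ratio" mentioned above might look roughly like the sketch below. It is not the actual tsc module: `TickRatio`, `from_sample`, and `ticks_to_ns` are hypothetical names, and the sketch assumes calibration yields one (elapsed nanoseconds, elapsed ticks) pair that is reduced by its greatest common divisor once, so later conversions need only one multiply and one divide.

```rust
/// Greatest common divisor (Euclid), used to reduce the ratio once at calibration time.
fn gcd(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let r = a % b;
        a = b;
        b = r;
    }
    a
}

/// Hypothetical calibration result: nanoseconds = ticks * numer / denom.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TickRatio {
    pub numer: u64,
    pub denom: u64,
}

impl TickRatio {
    /// Build a reduced ratio from one calibration sample, where `elapsed_ns`
    /// nanoseconds of wall time spanned `elapsed_ticks` counter ticks.
    /// Returns `None` for a degenerate sample.
    pub fn from_sample(elapsed_ns: u64, elapsed_ticks: u64) -> Option<Self> {
        if elapsed_ns == 0 || elapsed_ticks == 0 {
            return None;
        }
        let g = gcd(elapsed_ns, elapsed_ticks);
        Some(Self {
            numer: elapsed_ns / g,
            denom: elapsed_ticks / g,
        })
    }

    /// Convert a tick delta to nanoseconds. The product is widened to u128 so
    /// it cannot overflow; the result saturates at u64::MAX.
    pub fn ticks_to_ns(self, ticks: u64) -> u64 {
        let ns = (ticks as u128 * self.numer as u128) / self.denom as u128;
        ns.min(u64::MAX as u128) as u64
    }
}
```

For example, a 2 ms calibration window that spans 6,000,000 ticks (a 3 GHz counter) reduces to the ratio 1/3, so 3,000 ticks convert to 1,000 ns.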
Three #[ignore] benchmarks for development use:
- reference_function_accuracy: validates busy_wait ground truth
- amortized_overhead: measures enter/drop cost per call over 1M iterations
- bias_empty_fn: measures reported time for empty function (pure bias)

Run with: cargo test -p piano-runtime --features _test_internals --test calibration -- --ignored --nocapture
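To give a feel for what a "pure bias" measurement involves, here is a minimal sketch that times back-to-back clock reads and reports the median. It is not the calibration harness from the repository: `median_ns` and `clock_pair_bias_ns` are hypothetical helpers, and the sketch uses `std::time::Instant` rather than Piano's own instrumentation.

```rust
use std::time::Instant;

/// Median of the samples in nanoseconds; sorts the slice in place.
/// For an even count, returns the lower-rounded midpoint of the two middle values.
pub fn median_ns(samples: &mut [u64]) -> Option<u64> {
    if samples.is_empty() {
        return None;
    }
    samples.sort_unstable();
    let mid = samples.len() / 2;
    if samples.len() % 2 == 1 {
        Some(samples[mid])
    } else {
        let (lo, hi) = (samples[mid - 1], samples[mid]);
        // Written this way so the midpoint cannot overflow.
        Some(lo + (hi - lo) / 2)
    }
}

/// Time `iterations` pairs of back-to-back clock reads with nothing in
/// between, and return the median gap. Whatever this reports is cost that
/// any start/stop measurement built on the same clock also pays.
pub fn clock_pair_bias_ns(iterations: usize) -> Option<u64> {
    let mut samples = Vec::with_capacity(iterations);
    for _ in 0..iterations {
        let start = Instant::now();
        let end = Instant::now();
        samples.push(end.duration_since(start).as_nanos() as u64);
    }
    median_ns(&mut samples)
}
```

The median is used rather than the mean because a single preemption or cache miss can inflate one sample by orders of magnitude.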
Summary
Reorders the timestamp captures of the `Instant` clock Piano uses:
- `enter()`: `Instant::now()` is captured after all bookkeeping (epoch, thread ID, alloc save, stack push)
- `Guard::drop()`: `Instant::now()` is captured before all bookkeeping (thread ID check, alloc read, stack pop, alloc restore)

Median bias drops from ~166ns to ~42ns per call, a 75% reduction. The residual ~42ns is the irreducible cost of two clock reads. Existing ratio accuracy tests still pass.
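The ordering described in the summary can be sketched as follows. This is a simplified illustration, not the piano-runtime code: the thread-local `STACK` and `RECORDS` and the `take_records` helper are hypothetical stand-ins for the real bookkeeping (epoch, thread ID, allocation counters). The point is only where `Instant::now()` sits relative to that work.

```rust
use std::cell::RefCell;
use std::time::{Duration, Instant};

thread_local! {
    // Hypothetical per-thread state standing in for the real bookkeeping.
    static STACK: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
    static RECORDS: RefCell<Vec<(&'static str, Duration)>> = RefCell::new(Vec::new());
}

pub struct Guard {
    name: &'static str,
    start: Instant,
}

pub fn enter(name: &'static str) -> Guard {
    // Bookkeeping first, so its cost falls outside the measured interval.
    STACK.with(|s| s.borrow_mut().push(name));
    // Timestamp last: the clock starts as close to the user's code as possible.
    let start = Instant::now();
    Guard { name, start }
}

impl Drop for Guard {
    fn drop(&mut self) {
        // Timestamp first: the clock stops before any cleanup runs.
        let end = Instant::now();
        let elapsed = end.duration_since(self.start);
        // Bookkeeping afterwards, again outside the measured interval.
        STACK.with(|s| {
            s.borrow_mut().pop();
        });
        RECORDS.with(|r| r.borrow_mut().push((self.name, elapsed)));
    }
}

/// Drain the records collected on the current thread.
pub fn take_records() -> Vec<(&'static str, Duration)> {
    RECORDS.with(|r| std::mem::take(&mut *r.borrow_mut()))
}
```

With this ordering, the only instrumentation cost left inside the measured interval is the pair of clock reads themselves, which matches the residual described above.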
Test plan
- `cargo test --workspace` passes
- `cargo clippy --workspace --all-targets -- -D warnings` clean