
perf(runtime): reduce timing bias by reordering timestamps#112

Merged
rocketman-code merged 2 commits into main from
perf/calibration-bias
Feb 25, 2026


@rocketman-code

Summary

  • Add calibration harness with busy-wait reference function that provides ground-truth durations from the same Instant clock Piano uses
  • Add bias measurement test that quantifies Piano's per-call timing error at multiple durations (100us, 10us, 1us, 100ns)
  • Reorder enter() so Instant::now() is captured after all bookkeeping (epoch, thread ID, alloc save, stack push)
  • Reorder Guard::drop() so Instant::now() is captured before all bookkeeping (thread ID check, alloc read, stack pop, alloc restore)

Median bias drops from ~166ns to ~42ns per call -- a 75% reduction. The residual ~42ns is the irreducible cost of two clock reads. Existing ratio accuracy tests still pass.
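
The reordering described above can be illustrated with a minimal sketch. This is not Piano's actual code: the `STACK` and `RECORDS` thread-locals, the `Guard` fields, and `take_records` are hypothetical stand-ins for the real bookkeeping (epoch, thread ID, alloc save, stack push). The only point is the placement of the clock reads: last in `enter()`, first in `drop()`.

```rust
use std::cell::RefCell;
use std::time::{Duration, Instant};

thread_local! {
    // Hypothetical per-thread bookkeeping, standing in for the real runtime state.
    static STACK: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
    static RECORDS: RefCell<Vec<(&'static str, Duration)>> = RefCell::new(Vec::new());
}

pub struct Guard {
    name: &'static str,
    start: Instant,
}

pub fn enter(name: &'static str) -> Guard {
    // Bookkeeping runs first, so its cost falls outside the measured window.
    STACK.with(|s| s.borrow_mut().push(name));
    // The clock is read last, immediately before control returns to the caller.
    let start = Instant::now();
    Guard { name, start }
}

impl Drop for Guard {
    fn drop(&mut self) {
        // The clock is read first, before any bookkeeping can inflate the result.
        let elapsed = self.start.elapsed();
        STACK.with(|s| {
            s.borrow_mut().pop();
        });
        RECORDS.with(|r| r.borrow_mut().push((self.name, elapsed)));
    }
}

/// Drains the records collected on the current thread (hypothetical helper).
pub fn take_records() -> Vec<(&'static str, Duration)> {
    RECORDS.with(|r| std::mem::take(&mut *r.borrow_mut()))
}
```

With this ordering, the measured window contains only the caller's work plus the two clock reads, which is consistent with the residual bias reported above.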

Test plan

  • cargo test --workspace passes
  • cargo clippy --workspace --all-targets -- -D warnings clean
  • Calibration harness self-validates reference function (inner/outer agree within 200ns)
  • Bias measurement confirms reduction: ~166ns baseline -> ~42ns after reorder
  • Existing accuracy suite (ratio tests) still passes
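
A busy-wait reference function and its inner/outer self-check might look roughly like the sketch below. The names `busy_wait` and `measure_reference` and their signatures are assumptions for illustration; the harness in this PR may be structured differently. The idea is that the function reports the duration it observed on the same `Instant` clock, and an outer measurement around the call should agree with it to within a small tolerance.

```rust
use std::time::{Duration, Instant};

/// Spins until at least `target` has elapsed and returns the duration
/// actually observed (the "inner" ground-truth measurement).
pub fn busy_wait(target: Duration) -> Duration {
    let start = Instant::now();
    loop {
        let elapsed = start.elapsed();
        if elapsed >= target {
            return elapsed;
        }
        // Hint to the CPU that this is a spin loop; does not yield to the OS.
        std::hint::spin_loop();
    }
}

/// Wraps `busy_wait` in an outer measurement and returns `(inner, outer)`.
/// A self-validation step can compare the two against a tolerance.
pub fn measure_reference(target: Duration) -> (Duration, Duration) {
    let outer_start = Instant::now();
    let inner = busy_wait(target);
    let outer = outer_start.elapsed();
    (inner, outer)
}
```

A check in the spirit of the test plan would assert that `outer - inner` stays below some bound such as 200ns, keeping in mind that scheduler preemption can produce occasional outliers, which is one reason to compare medians rather than single samples.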

Shrink Guard from 56 bytes to 16 bytes (two registers) by replacing
Instant + ThreadId + alloc snapshot with a raw TSC tick and packed
thread-cookie/depth. Hot path (enter/drop) is now inlined with a
single rdtsc/cntvct_el0 instruction; all bookkeeping is split into
cold out-of-line functions.
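
As a rough illustration of how a guard can fit in two registers, the sketch below packs a thread cookie and a stack depth into one u64 next to a raw tick value. The 48/16 bit split, the field names, and the `pack`/`unpack` helpers are assumptions made for this example; the actual layout is not stated here.

```rust
/// Two 8-byte fields, 16 bytes total (hypothetical layout).
pub struct Guard {
    pub start_tick: u64,
    pub packed: u64,
}

// Assumed split: low 16 bits hold the depth, high 48 bits hold the cookie.
const DEPTH_BITS: u32 = 16;
const DEPTH_MASK: u64 = (1 << DEPTH_BITS) - 1;

/// Packs a thread cookie (must fit in 48 bits) and a depth into one u64.
pub fn pack(cookie: u64, depth: u16) -> u64 {
    (cookie << DEPTH_BITS) | depth as u64
}

/// Recovers `(cookie, depth)` from a packed value.
pub fn unpack(packed: u64) -> (u64, u16) {
    (packed >> DEPTH_BITS, (packed & DEPTH_MASK) as u16)
}
```

Keeping the guard this small means it can be returned and dropped without touching memory on common 64-bit calling conventions, which fits the goal of an inlined hot path.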

New tsc module handles hardware counter reads, one-time calibration
(~2ms spin), and tick-to-nanosecond conversion via simplified ratio.
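
One way to read "tick-to-nanosecond conversion via simplified ratio" is sketched below: calibration yields an elapsed-nanoseconds and elapsed-ticks pair, the pair is reduced by its greatest common divisor, and later conversions multiply and divide by the reduced terms. The `TickRatio` type and its method names are hypothetical, and the real tsc module may differ in how it calibrates or rounds.

```rust
/// Euclid's algorithm for the greatest common divisor.
fn gcd(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let t = a % b;
        a = b;
        b = t;
    }
    a
}

/// Nanoseconds-per-tick stored as a reduced fraction `numer / denom`.
pub struct TickRatio {
    pub numer: u64,
    pub denom: u64,
}

impl TickRatio {
    /// Builds the ratio from one calibration window: `elapsed_ns` measured
    /// with a wall clock while the counter advanced by `elapsed_ticks`.
    pub fn from_calibration(elapsed_ns: u64, elapsed_ticks: u64) -> Self {
        assert!(elapsed_ticks > 0, "counter did not advance during calibration");
        let g = gcd(elapsed_ns, elapsed_ticks); // g >= 1 because elapsed_ticks > 0
        Self {
            numer: elapsed_ns / g,
            denom: elapsed_ticks / g,
        }
    }

    /// Converts a tick delta to nanoseconds, widening to u128 so the
    /// intermediate product cannot overflow.
    pub fn ticks_to_ns(&self, ticks: u64) -> u64 {
        ((ticks as u128 * self.numer as u128) / self.denom as u128) as u64
    }
}
```

For example, a 2ms window on a 24 MHz counter gives 2,000,000 ns over 48,000 ticks, which reduces to 125/3, so 24 ticks convert to 1,000 ns.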

MSRV bumped 1.56 -> 1.59 for core::arch::asm! (inline assembly
stabilized in 1.59). Updated Cargo.toml, CI, docs, and MSRV
integration test accordingly.

Adds _test_internals feature to expose collect_invocations() for
external integration tests (calibration harness).

Three #[ignore] benchmarks for development use:
- reference_function_accuracy: validates busy_wait ground truth
- amortized_overhead: measures enter/drop cost per call over 1M iterations
- bias_empty_fn: measures reported time for empty function (pure bias)

Run with: cargo test -p piano-runtime --features _test_internals --test calibration -- --ignored --nocapture

rocketman-code merged commit 32351c6 into main Feb 25, 2026
5 checks passed
rocketman-code deleted the perf/calibration-bias branch February 25, 2026 10:16