Fix flaky Enzyme test_forward/test_reverse tolerance (RNG-dependent vs fastpower approximation) by ChrisRackauckas-Claude · Pull Request #58 · SciML/FastPower.jl

ChrisRackauckas-Claude · 2026-06-23T20:38:37Z

Please ignore until reviewed by @ChrisRackauckas.

Problem

The tests / Enzyme (julia 1) lane on main went red on the v1.3.2 run; it had been green 8 days earlier with identical test source. The failure:

test_forward: fastpower with return activity Duplicated on (::Float64, Duplicated), (::Float64, Const): Test Failed
  Expression: isapprox(x, y; kwargs...)
   Evaluated: isapprox(0.155, 0.15524589105604497; atol = 0.0001, rtol = 0.001)

Root cause

FastPower's Enzyme @easy_rule returns the exact ^ derivative (y*fastpower(x,y-1), Ω*log(x)). EnzymeTestUtils' test_forward/test_reverse compare that rule against finite differences of the deliberately-approximate fastpower primal. So the measured gap is exactly fastpower's own primal approximation error (~1e-3 relative — the same envelope asserted in test/fast_pow_tests.jl), which sat right on top of the old atol=1e-4, rtol=1e-3.
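For reference, the two partials quoted in the rule are those of the exact power function. Reading Ω as the primal output $x^y$ (an assumption about the rule's notation), a short derivation is:

$$
\frac{\partial}{\partial x}\,x^{y} = y\,x^{y-1},
\qquad
\frac{\partial}{\partial y}\,x^{y} = x^{y}\log x = \Omega \log x .
$$

At the test point $x = 1.0$, $y = 0.5$ these evaluate to $0.5$ and $0$, so the exact directional derivative along a tangent $(dx, dy)$ is $0.5\,dx$.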

Whether the lane passed depended on the random perturbation test_forward drew from the global RNG. An analytic sweep over the tangent grid the test samples (tangents in -9:0.01:9, central FD-5, at x=1.0, y=0.5) shows:

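| config | worst abs gap | worst rel gap | failures at old tolerance (atol=1e-4, rtol=1e-3) |
|---|---|---|---|
| Tx=Dup, Ty=Const | 1.17e-3 | 2.4% | 144080/3241800 (4.44%) |
| Tx=Dup, Ty=Dup | 2.04e-3 | not reported | 111058/3243600 (3.42%) |
| Tx=Const, Ty=Dup | 6e-14 | not reported | 0 |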
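
The failure percentages and the combined draw count can be re-derived from the raw counts in the table. The following is only an arithmetic check on the quoted numbers, not a rerun of the sweep; the dictionary keys are labels chosen here for illustration.

```python
# Failure counts and draw totals copied from the sweep table.
rows = {
    "Tx=Dup, Ty=Const": (144080, 3241800),
    "Tx=Dup, Ty=Dup": (111058, 3243600),
}

# Percentage of tangent draws that exceed the old tolerance, per configuration.
rates = {name: 100.0 * fails / total for name, (fails, total) in rows.items()}

# Combined number of draws across the two configurations with a reported total.
total_draws = sum(total for _, total in rows.values())

for name, pct in rates.items():
    print(f"{name}: {pct:.2f}% of draws fail")
print(f"total draws: {total_draws}")
```

The combined total of 6485400 draws is the figure rounded to about 6.5M elsewhere in this description.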

The CI failing value 0.15524589105604497 is reproduced exactly at tangent dx=0.3105 (= the exact-^ derivative 0.5·dx), with the FD-of-fastpower reference at 0.1555 — i.e. it is fastpower's primal error, not a wrong rule.
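
The mechanism can be reproduced with a toy model. This sketch is not FastPower's algorithm: `approx_pow` is a hypothetical stand-in that multiplies the exact power by a smooth error of about 1e-3 relative, and the finite-difference helper is a plain five-point central stencil rather than the FiniteDifferences.jl method the test suite uses. It only illustrates why an exact-derivative rule can disagree with finite differences of an approximate primal.

```python
import math


def exact_pow(x, y):
    return x ** y


def approx_pow(x, y):
    # Hypothetical approximate primal: exact power with a smooth ~1e-3
    # relative error. An assumption for illustration, NOT FastPower's method.
    return x ** y * (1.0 + 1e-3 * math.sin(x))


def central_fd5(f, x, h=1e-3):
    # Five-point central difference approximation of f'(x).
    return (-f(x + 2 * h) + 8 * f(x + h) - 8 * f(x - h) + f(x - 2 * h)) / (12 * h)


def isapprox(a, b, atol, rtol):
    # Scalar form of the comparison rule documented for Julia's isapprox.
    return abs(a - b) <= max(atol, rtol * max(abs(a), abs(b)))


x, y, dx = 1.0, 0.5, 0.31

# Exact-derivative rule applied to the tangent dx.
rule = y * exact_pow(x, y - 1) * dx
# Finite differences of the exact primal and of the approximate primal.
fd_exact = central_fd5(lambda t: exact_pow(t, y), x) * dx
fd_approx = central_fd5(lambda t: approx_pow(t, y), x) * dx

print("rule vs FD of exact primal, old tol: ", isapprox(rule, fd_exact, 1e-4, 1e-3))
print("rule vs FD of approx primal, old tol:", isapprox(rule, fd_approx, 1e-4, 1e-3))
print("rule vs FD of approx primal, new tol:", isapprox(rule, fd_approx, 1e-3, 1e-2))
```

In this toy setup the rule agrees with finite differences of the exact primal, and the mismatch appears only once the primal itself carries an approximation error.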

Fix

Seed the RNG (Random.Xoshiro(0)) so the randomized test is reproducible.
Raise the tolerance to atol=1e-3, rtol=1e-2, consistent with fastpower's documented accuracy. This has zero failures across all ~6.5M tangent draws in the grid, while a genuinely wrong rule would still be off by O(1) relative and is not masked.
Add Random to the test [extras]/[targets] and a Random = "1" [compat] entry.
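
To make the effect of the tolerance change concrete, here is a small sketch that applies the scalar comparison rule documented for Julia's `isapprox`, `|x - y| <= max(atol, rtol * max(|x|, |y|))`, to the two values from the CI failure message. The helper is a Python reimplementation written for illustration, not the code EnzymeTestUtils runs.

```python
def isapprox(x: float, y: float, atol: float, rtol: float) -> bool:
    # Scalar form of the rule documented for Julia's isapprox.
    return abs(x - y) <= max(atol, rtol * max(abs(x), abs(y)))


# Values copied from the CI failure message.
a, b = 0.155, 0.15524589105604497

# Old tolerance: gap of about 2.46e-4 against a bound of about 1.55e-4.
old_ok = isapprox(a, b, atol=1e-4, rtol=1e-3)
# New tolerance: the same gap against a bound of about 1.55e-3.
new_ok = isapprox(a, b, atol=1e-3, rtol=1e-2)

print(old_ok, new_ok)  # prints: False True
```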

This is the principled fix, not a blanket tolerance loosening: the rule's true error is zero; the only thing being measured is the approximation built into fastpower itself.

Verification (run locally)

Deps resolved match CI: Enzyme 0.13.164, EnzymeTestUtils 0.2.8, FiniteDifferences 0.12.34.

Reproduced the failure through the real test_forward with rng=Xoshiro(16) (first tangent dx=0.31): old tolerance → 6 pass / 1 fail (matches CI); new tolerance → 7 pass / 0 fail.
Fixed Enzyme group via Pkg.test, julia 1.11: enzyme_forward_tests 52/52, enzyme_reverse_tests 36/36, tests passed.
Fixed Enzyme group via Pkg.test, julia lts (1.10): 52/52, 36/36, tests passed.
Seeded forward test is deterministic: 52/52 across 3 repeats.
Runic: clean (no diff) on both edited files.

Note on the other red lanes in the same run

tests / Core (julia 1) and tests / Core (julia lts) were red in the same run but are not code failures: both ran on self-hosted-4vcpu-8gb (smcsd) runners squatting on the ubuntu-latest label; the "Run tests" step emitted zero log output and never recorded a conclusion (runner OOM/lost-communication while precompiling the Mooncake+Enzyme+ReverseDiff stack in 8 GB). Locally the Core group passes cleanly (fast_log2 1200/1200, fast_pow 5/5, other_ad_engines 4/4, all AD-engine derivative comparisons rel=0.0) on both julia 1.11 and lts. That is a runner-capacity infra issue, out of scope for this PR.

🤖 Generated with Claude Code

…roximation

The Enzyme `@easy_rule` returns the exact `^` derivative, but EnzymeTestUtils `test_forward`/`test_reverse` compare it against finite differences of the *approximate* `fastpower` primal. The measured gap is therefore `fastpower`'s own primal approximation error (~1e-3 relative, the same envelope asserted in test/fast_pow_tests.jl), which sat right on top of the previous atol=1e-4, rtol=1e-3.

Whether the lane passed depended on the random perturbation drawn from the global RNG: an analytic sweep over the tangent grid the test samples shows ~4.4% of draws exceed the old tolerance, so the lane went red intermittently (green 8 days ago, red on the v1.3.2 run, green locally).

Seed the RNG (Random.Xoshiro(0)) for reproducibility and raise the tolerance to atol=1e-3, rtol=1e-2, consistent with fastpower's documented accuracy. The new tolerance has zero failures across all ~6.5M tangent draws in the grid, while a genuinely wrong rule would still be off by O(1) relative and is not masked. Add Random to the test extras/targets and [compat].

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ChrisRackauckas-Claude wants to merge 1 commit into SciML:main from ChrisRackauckas-Claude:fix-enzyme-tolerance-rng-flake

2 participants
