Parent issue
Sub-issue of #111 (M7-3 — FeatureMatcher port).
Context
The BHC (BinaryHierarchicalClustering) implementation in Rust was ported in PR #114 (#110 / M7-2). During that work, we noted that the C++ k-medoids initialization uses vision::ArrayShuffle (from math/rand.h), while the Rust port uses a simple linear congruential generator (LCG) for shuffling.
The problem
The BHC tree is built by recursively running k-medoids on subsets of features. K-medoids picks initial centers by shuffling the indices and taking the first k. Because Rust and C++ use different shuffle algorithms, different initial centers are picked even with the same seed → different cluster assignments → different tree topology → different leaves visited at query time → different match pairs.
This means:
- The BHC index built by Rust is not equivalent to the one built by C++ on the same data.
- Match counts diverge between Rust
match_indexed and C++ match() with index, even with identical descriptors and a fixed seed.
Where this is currently visible
The dual-mode test dual_mode_match_indexed_within_tolerance in crates/core/src/kpm/freak/matcher.rs (added in PR #115) is weaker than the brute-force and guided variants:
| Variant |
Dual-mode check |
match_all (brute force) |
✅ Sorted pair equality + global-best invariant |
match_guided (homography) |
✅ Sorted pair equality |
match_indexed (BHC) |
⚠️ Count-only + maxima invariant (relaxed) |
The relaxed indexed test is documented in the test file:
// BHC RNG differs between C++ and Rust (documented in PR #114), so
// pair-level equality is NOT expected. ... To tighten this to
// pair-equality, port `ArrayShuffle` from math/rand.h.
Proposed fix
Port vision::ArrayShuffle from crates/core/third_party/WebARKitLib/lib/SRC/KPM/FreakMatcher/math/rand.h to Rust, replacing the LCG used in crates/core/src/kpm/freak/clustering.rs.
Steps
- Read
math/rand.h and document the algorithm (likely a Fisher–Yates variant with a specific PRNG).
- Port the PRNG (or use the existing C
rand()-equivalent if that's what the C++ uses).
- Replace the shuffle in
BinaryHierarchicalClustering and KMedoids initialization.
- Verify that with the same seed, Rust and C++ produce the same shuffled order on the same input.
- Strengthen
dual_mode_match_indexed_within_tolerance: replace the count-only check with sorted pair equality (matching the brute-force and guided tests).
Acceptance criteria
C++ source to port
crates/core/third_party/WebARKitLib/lib/SRC/KPM/FreakMatcher/math/rand.h
- Used in
kmedoids.h line 169: ArrayShuffle(&mRandIndices[0], (int)mRandIndices.size(), mK, mRandSeed);
Related
Parent issue
Sub-issue of #111 (M7-3 — FeatureMatcher port).
Context
The BHC (
BinaryHierarchicalClustering) implementation in Rust was ported in PR #114 (#110 / M7-2). During that work, we noted that the C++ k-medoids initialization usesvision::ArrayShuffle(frommath/rand.h), while the Rust port uses a simple linear congruential generator (LCG) for shuffling.The problem
The BHC tree is built by recursively running k-medoids on subsets of features. K-medoids picks initial centers by shuffling the indices and taking the first
k. Because Rust and C++ use different shuffle algorithms, different initial centers are picked even with the same seed → different cluster assignments → different tree topology → different leaves visited at query time → different match pairs.This means:
match_indexedand C++match()with index, even with identical descriptors and a fixed seed.Where this is currently visible
The dual-mode test
dual_mode_match_indexed_within_tolerancein crates/core/src/kpm/freak/matcher.rs (added in PR #115) is weaker than the brute-force and guided variants:match_all(brute force)match_guided(homography)match_indexed(BHC)The relaxed indexed test is documented in the test file:
Proposed fix
Port
vision::ArrayShufflefromcrates/core/third_party/WebARKitLib/lib/SRC/KPM/FreakMatcher/math/rand.hto Rust, replacing the LCG used incrates/core/src/kpm/freak/clustering.rs.Steps
math/rand.hand document the algorithm (likely a Fisher–Yates variant with a specific PRNG).rand()-equivalent if that's what the C++ uses).BinaryHierarchicalClusteringandKMedoidsinitialization.dual_mode_match_indexed_within_tolerance: replace the count-only check with sorted pair equality (matching the brute-force and guided tests).Acceptance criteria
ArrayShuffleon a fixed seed (unit test + dual-mode parity test)match_indexeddual-mode test asserts sorted pair equality with C++ baseline (assert_eq!(rust_pairs, cpp_pairs))match_all/match_guideddual-mode testsC++ source to port
crates/core/third_party/WebARKitLib/lib/SRC/KPM/FreakMatcher/math/rand.hkmedoids.hline 169:ArrayShuffle(&mRandIndices[0], (int)mRandIndices.size(), mK, mRandSeed);Related