Skip to content

feat(kpm): port ArrayShuffle for BHC parity in dual-mode tests #116

@kalwalt

Description

@kalwalt

Parent issue

Sub-issue of #111 (M7-3 — FeatureMatcher port).

Context

The BHC (BinaryHierarchicalClustering) implementation in Rust was ported in PR #114 (#110 / M7-2). During that work, we noted that the C++ k-medoids initialization uses vision::ArrayShuffle (from math/rand.h), while the Rust port uses a simple linear congruential generator (LCG) for shuffling.

The problem

The BHC tree is built by recursively running k-medoids on subsets of features. K-medoids picks initial centers by shuffling the indices and taking the first k. Because Rust and C++ use different shuffle algorithms, different initial centers are picked even with the same seed → different cluster assignmentsdifferent tree topologydifferent leaves visited at query timedifferent match pairs.

This means:

  • The BHC index built by Rust is not equivalent to the one built by C++ on the same data.
  • Match counts diverge between Rust match_indexed and C++ match() with index, even with identical descriptors and a fixed seed.

Where this is currently visible

The dual-mode test dual_mode_match_indexed_within_tolerance in crates/core/src/kpm/freak/matcher.rs (added in PR #115) is weaker than the brute-force and guided variants:

Variant Dual-mode check
match_all (brute force) ✅ Sorted pair equality + global-best invariant
match_guided (homography) ✅ Sorted pair equality
match_indexed (BHC) ⚠️ Count-only + maxima invariant (relaxed)

The relaxed indexed test is documented in the test file:

// BHC RNG differs between C++ and Rust (documented in PR #114), so
// pair-level equality is NOT expected. ... To tighten this to
// pair-equality, port `ArrayShuffle` from math/rand.h.

Proposed fix

Port vision::ArrayShuffle from crates/core/third_party/WebARKitLib/lib/SRC/KPM/FreakMatcher/math/rand.h to Rust, replacing the LCG used in crates/core/src/kpm/freak/clustering.rs.

Steps

  1. Read math/rand.h and document the algorithm (likely a Fisher–Yates variant with a specific PRNG).
  2. Port the PRNG (or use the existing C rand()-equivalent if that's what the C++ uses).
  3. Replace the shuffle in BinaryHierarchicalClustering and KMedoids initialization.
  4. Verify that with the same seed, Rust and C++ produce the same shuffled order on the same input.
  5. Strengthen dual_mode_match_indexed_within_tolerance: replace the count-only check with sorted pair equality (matching the brute-force and guided tests).

Acceptance criteria

  • Rust shuffle produces byte-identical output to C++ ArrayShuffle on a fixed seed (unit test + dual-mode parity test)
  • match_indexed dual-mode test asserts sorted pair equality with C++ baseline (assert_eq!(rust_pairs, cpp_pairs))
  • All M7 unit tests still pass
  • No regression in match_all / match_guided dual-mode tests

C++ source to port

  • crates/core/third_party/WebARKitLib/lib/SRC/KPM/FreakMatcher/math/rand.h
  • Used in kmedoids.h line 169: ArrayShuffle(&mRandIndices[0], (int)mRandIndices.size(), mK, mRandSeed);

Related

Metadata

Metadata

Assignees

Type

Projects

Status

Done

Relationships

None yet

Development

No branches or pull requests

Issue actions