Improve create-path performance backend #10
Pull request overview
This PR brings ParPar-style fused hashing and new XOR-JIT Reed–Solomon create backends into par2rs’ create path, along with profiling/benchmark harnesses and expanded integration checks to compare performance and compatibility against par2cmdline-turbo/ParPar.
Changes:
- Add ParPar-style fused MD5x2 + CRC32 hashing infrastructure (including runtime backend dispatch and ARM64 scaffolding) plus optional ParPar C++ FFI for comparisons.
- Add XOR-JIT bitplane/exec-mem infrastructure and extend SIMD kernels (PSHUFB x4) to improve create-path throughput.
- Add create profiling hooks, scripts, benches, and additional integration tests (including duplicate-content + large block-size scenarios).
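The "runtime backend dispatch" mentioned in the first bullet can be illustrated with a minimal sketch. It is not the PR's `HasherInputDyn` code, which is not shown here: the `Backend` enum, `selected_backend`, and the XOR kernel are hypothetical names chosen for illustration. The sketch probes CPU features once, caches the result, and falls back to a portable path on any other architecture.

```rust
use std::sync::OnceLock;

/// Hypothetical backend list for illustration; the PR's real backends are not shown in the source.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Backend {
    Scalar,
    Avx2,
}

/// Probe the CPU once and cache the answer for all later calls.
pub fn selected_backend() -> Backend {
    static CHOICE: OnceLock<Backend> = OnceLock::new();
    *CHOICE.get_or_init(detect)
}

fn detect() -> Backend {
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("avx2") {
            return Backend::Avx2;
        }
    }
    Backend::Scalar
}

/// Portable fallback: dst[i] ^= src[i] over the common prefix.
fn xor_scalar(dst: &mut [u8], src: &[u8]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d ^= *s;
    }
}

/// AVX2 kernel: 32 bytes per iteration, scalar tail.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn xor_avx2(dst: &mut [u8], src: &[u8]) {
    use std::arch::x86_64::*;
    let n = dst.len().min(src.len());
    let mut i = 0;
    while i + 32 <= n {
        unsafe {
            let d = _mm256_loadu_si256(dst.as_ptr().add(i) as *const __m256i);
            let s = _mm256_loadu_si256(src.as_ptr().add(i) as *const __m256i);
            _mm256_storeu_si256(
                dst.as_mut_ptr().add(i) as *mut __m256i,
                _mm256_xor_si256(d, s),
            );
        }
        i += 32;
    }
    xor_scalar(&mut dst[i..n], &src[i..n]);
}

/// Public entry point: callers never see which backend ran.
pub fn xor_into(dst: &mut [u8], src: &[u8]) {
    match selected_backend() {
        #[cfg(target_arch = "x86_64")]
        // SAFETY: Avx2 is only selected after the runtime feature check succeeded.
        Backend::Avx2 => unsafe { xor_avx2(dst, src) },
        _ => xor_scalar(dst, src),
    }
}
```

Callers use only `xor_into(&mut dst, &src)`; the cached choice keeps the feature probe out of the hot loop.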
Reviewed changes
Copilot reviewed 57 out of 60 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/test_create_integration.rs | Adds integration tests for turbo-created duplicate-content sets and large explicit block size verification. |
| tests/md5x2_neon_placeholder.rs | Adds aarch64-only placeholder test for NEON MD5x2 state wiring. |
| tests/compare_with_parpar.rs | Adds feature-gated tests comparing Rust hashing results vs embedded ParPar via FFI. |
| src/verify/types.rs | Adds duplicate-block ambiguity detection to relax alignment checks when full-file hash matches. |
| src/verify/global_table.rs | Fixes get_file_blocks to include duplicates via iter_duplicates() and expands tests. |
| src/verify/global_engine.rs | Adjusts file-status logic to account for duplicate-content ambiguity; adds targeted unit test. |
| src/reed_solomon/simd/xor_jit/exec_mem.rs | Adds executable/mutable executable buffer management for XOR-JIT codegen. |
| src/reed_solomon/simd/xor_jit/bitplane.rs | Adds AVX2 bitplane prepare/finish and GF16 multiply-add helpers. |
| src/reed_solomon/simd/pshufb.rs | Refactors AVX2 loop for aligned/unaligned fast paths; adds x4 kernel + tests. |
| src/reed_solomon/simd/mod.rs | Exposes xor_jit module and re-exports xor-jit APIs on x86_64. |
| src/parpar_hasher/mod.rs | Introduces ParPar-style fused hasher module structure and backend layout. |
| src/parpar_hasher/md5x2_neon.rs | Adds aarch64 NEON MD5x2 backend implementation (with tests). |
| src/parpar_hasher/md5x2.rs | Adds shared MD5x2 backend trait contract. |
| src/parpar_hasher/hasher_input_dyn.rs | Adds runtime-dispatched HasherInput wrapper for selecting best backend. |
| src/parpar_hasher/hasher_input_arm64.rs | Adds aarch64 HasherInput driver using scalar MD5x2 + crc32fast CRC. |
| src/parpar_hasher/crc_clmul_avx512.rs | Adds AVX-512VL CRC folding variant for the fused hashing driver. |
| src/parpar_hasher/crc_armcrc.rs | Adds placeholder ARM CRC32 backend module. |
| src/parpar_hasher/ATTRIBUTION.md | Documents upstream ParPar/turbo sources and porting/decision rationale. |
| src/lib.rs | Exposes parpar_hasher and feature-gated ffi module at crate root. |
| src/ffi/wrapper.cpp | Adds C++ wrapper exposing ParPar MD5 and HasherInput over a C ABI. |
| src/ffi/mod.rs | Adds Rust-side safe wrappers for the C ABI (feature + x86_64 gated). |
| src/create/profile.rs | Adds create-path profiling phases/counters and env-gated CSV-ish emission. |
| src/create/mod.rs | Wires the create profiler module into create subsystem. |
| src/create/error.rs | Adds XOR-JIT checksum validation error variant. |
| src/checksum.rs | Adds chunked “fused” MD5+CRC update helpers and uses them in existing APIs. |
| scripts/turbo_dump_xorjit_body_avx2.c | Adds helper to dump turbo XOR-JIT code bodies for analysis. |
| scripts/turbo_dump_xor_prepare_packed_avx2.c | Adds helper to dump turbo packed prepare output for analysis. |
| scripts/turbo_dump_xor_finish_packed_avx2.c | Adds helper to dump turbo packed finish output for analysis. |
| scripts/profile_create_slow_paths.sh | Adds a perf+profiling harness for representative create workloads. |
| flake.nix | Adjusts dev-shell tools (valgrind moved under Linux-only tools list). |
| build.rs | Adds feature-gated ParPar C/C++ build steps for FFI comparisons. |
| benches/parpar_hasher_input.rs | Adds Criterion benches comparing naive/tier1/HasherInput variants. |
| benches/par2verify_compare.rs | Adds Criterion bench comparing par2rs vs turbo verify. |
| benches/md5x2_crc_fused.rs | Adds end-to-end fused hashing comparison bench variants. |
| benches/iai_simd.rs | Extends iai-callgrind benches with xor-jit bitplane kernels. |
| benches/iai_parpar_comparison.rs | Adds instruction-count comparison vs ParPar FFI (feature-gated). |
| benches/iai_par2verify_compare.rs | Adds iai-callgrind binary benches for par2rs vs turbo verify. |
| benches/iai_hasher_input.rs | Adds iai-callgrind benches for HasherInput vs helper vs naive. |
| benches/fused_hashing.rs | Adds microbench for tier1 fused hashing helpers. |
| benches/create_benchmark.rs | Fixes Criterion benchmark ID formatting (avoid % for gnuplot). |
| benches/crc_compare.rs | Adds benches comparing crc32fast vs crc-fast in relevant regimes. |
| benches/compare_with_parpar.rs | Adds wall-clock benches comparing Rust backends vs ParPar FFI. |
| benches/common/par2verify_fixture.rs | Adds reusable fixture/command plumbing for verify benches. |
| README.md | Updates benchmarking guidance and aligns license text to GPL-2.0-or-later. |
| Makefile | Updates benchmark-create-perf help text. |
| Cargo.toml | Switches license to GPL-2.0-or-later; adds parpar-compare feature, libc, cc build-dep, and new benches/dev-deps. |
| Cargo.lock | Updates lockfile for new dependencies (cc, libc, crc-fast, etc.). |
| .github/workflows/rust.yml | Adds cross-compile (aarch64) job and messaging about arch separation. |
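For orientation on what the "GF16 multiply-add helpers" in the table compute, here is a scalar reference sketch. It is not the AVX2 bitplane code from `bitplane.rs`: the function names are hypothetical, and the only facts assumed are that PAR2 works in GF(2^16) with the reduction polynomial 0x1100B and stores 16-bit words little-endian. A SIMD or JIT kernel would be expected to produce the same bytes as this loop.

```rust
/// PAR2's GF(2^16) reduction polynomial: x^16 + x^12 + x^3 + x + 1.
const GF16_POLY: u32 = 0x1100B;

/// Carry-less shift-and-add multiply in GF(2^16), reducing as we go.
/// Hypothetical reference helper, not the PR's implementation.
pub fn gf16_mul(a: u16, b: u16) -> u16 {
    let mut acc: u32 = 0;
    let mut a = a as u32;
    let mut b = b;
    while b != 0 {
        if b & 1 != 0 {
            acc ^= a; // addition in GF(2^n) is XOR
        }
        a <<= 1;
        if a & 0x1_0000 != 0 {
            a ^= GF16_POLY; // fold the overflow bit back into 16 bits
        }
        b >>= 1;
    }
    acc as u16
}

/// dst[i] ^= src[i] * coeff over little-endian 16-bit words.
/// A trailing odd byte, if any, is left untouched.
pub fn gf16_mul_add(dst: &mut [u8], src: &[u8], coeff: u16) {
    for (d, s) in dst.chunks_exact_mut(2).zip(src.chunks_exact(2)) {
        let x = u16::from_le_bytes([s[0], s[1]]);
        let y = u16::from_le_bytes([d[0], d[1]]) ^ gf16_mul(x, coeff);
        d.copy_from_slice(&y.to_le_bytes());
    }
}
```

A reference like this is mainly useful as a test oracle: run the optimized kernel and the scalar loop on the same buffers and compare the output.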
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a55ee3421b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 24b0682fe5
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #10 +/- ##
=======================================
Coverage ? 90.01%
=======================================
Files ? 86
Lines ? 28312
Branches ? 0
=======================================
Hits ? 25486
Misses ? 2826
Partials ? 0
Flags with carried forward coverage won't be shown.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f57a8c1a9e
Force-pushed from 301168d to a57fb30.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a57fb3023d
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0612dc45d1
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cc1b413eb2
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7fc6313da9
Pull request overview
Copilot reviewed 61 out of 64 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/checksum.rs:176
- The docstring for `compute_md5_crc32_simultaneous` still claims this is a “single pass” / “reads data only once”, but the new implementation updates two independent hashers (MD5 then CRC32) over each sub-slice. That’s still two reads per chunk (just with better cache locality).
Please update the documentation to match the actual behavior (cache-resident chunking) so callers don’t assume it is instruction-level fused hashing.
/// Compute MD5 and CRC32 simultaneously in a single pass (par2cmdline style)
///
/// This is the most efficient way to compute both checksums as it:
/// - Reads data only once (50% less memory bandwidth)
/// - Processes data while still in CPU cache
/// - Updates both hash states in the same loop
///
/// Based on par2cmdline-turbo's MD5CRC_Calc implementation which showed
/// ~40-60% performance improvement over separate computation.
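The distinction drawn in this comment, cache-resident chunking rather than instruction-level fused hashing, can be sketched as follows. This is an illustration under stated assumptions, not the code in `src/checksum.rs`: `CHUNK`, `hash_chunked`, and both hasher types are hypothetical, and FNV-1a stands in for MD5 only to keep the example free of external crates. Each chunk is read twice (once per hasher), but the second read hits cache.

```rust
/// Hypothetical chunk size, chosen to fit comfortably in L1/L2 cache.
const CHUNK: usize = 16 * 1024;

/// Bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320).
struct Crc32(u32);

impl Crc32 {
    fn new() -> Self {
        Crc32(0xFFFF_FFFF)
    }
    fn update(&mut self, data: &[u8]) {
        for &b in data {
            self.0 ^= b as u32;
            for _ in 0..8 {
                // mask is all ones when the low bit is set, else zero
                let mask = (self.0 & 1).wrapping_neg();
                self.0 = (self.0 >> 1) ^ (0xEDB8_8320 & mask);
            }
        }
    }
    fn finish(&self) -> u32 {
        !self.0
    }
}

/// 64-bit FNV-1a, used here ONLY as a stand-in for MD5.
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325)
    }
    fn update(&mut self, data: &[u8]) {
        for &b in data {
            self.0 = (self.0 ^ b as u64).wrapping_mul(0x0000_0100_0000_01b3);
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

/// Two independent hashers updated chunk by chunk: two reads per chunk,
/// but the second read is served from cache rather than main memory.
pub fn hash_chunked(data: &[u8]) -> (u64, u32) {
    let mut first = Fnv1a::new();
    let mut crc = Crc32::new();
    for chunk in data.chunks(CHUNK) {
        first.update(chunk);
        crc.update(chunk);
    }
    (first.finish(), crc.finish())
}
```

The result is identical to running each hasher over the whole buffer separately; only the memory access pattern changes, which is why a docstring for such a helper should describe chunking rather than a single read.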
Pull request overview
Copilot reviewed 61 out of 64 changed files in this pull request and generated 4 comments.
Force-pushed from 95258b2 to 2537d40.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 982fa3456e
Pull request overview
Copilot reviewed 61 out of 64 changed files in this pull request and generated 1 comment.
Pull request overview
Copilot reviewed 61 out of 64 changed files in this pull request and generated 2 comments.
Summary
- `HasherInputDyn` and portable SIMD support
Notes
- `par2-turbo-verify` follow-up PR
Testing