
Improve create-path performance backend#10

Open
mjc wants to merge 16 commits into main from create-performance-backend

Conversation

mjc (Owner) commented Apr 29, 2026

Summary

  • port the ParPar-style fused hashing backends into the create path, including HasherInputDyn and portable SIMD support
  • add the XOR JIT create backend work, profiling hooks, and benchmark harnesses used to analyze create hot paths
  • extend FFI, benches, and integration coverage so create-path performance can be compared against ParPar and turbo-style references

Notes

  • this is the base branch for the stacked par2-turbo-verify follow-up PR

Testing

  • not rerun in this session

Copilot AI review requested due to automatic review settings April 29, 2026 04:11
Copilot AI left a comment

Pull request overview

This PR brings ParPar-style fused hashing and new XOR-JIT Reed–Solomon create backends into par2rs’ create path, along with profiling/benchmark harnesses and expanded integration checks to compare performance and compatibility against par2cmdline-turbo/ParPar.

Changes:

  • Add ParPar-style fused MD5x2 + CRC32 hashing infrastructure (including runtime backend dispatch and ARM64 scaffolding) plus optional ParPar C++ FFI for comparisons.
  • Add XOR-JIT bitplane/exec-mem infrastructure and extend SIMD kernels (PSHUFB x4) to improve create-path throughput.
  • Add create profiling hooks, scripts, benches, and additional integration tests (including duplicate-content + large block-size scenarios).
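As a rough illustration of the runtime backend dispatch mentioned in the first bullet, the sketch below probes CPU features once and falls back to a scalar path. The `Backend` enum, `select_backend`, and `dispatch_sum` are hypothetical names for this example and are not taken from the PR; every arm reuses a scalar kernel so the sketch stays runnable on any target.

```rust
/// Hypothetical backend identifiers; the PR's real backend set may differ.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Scalar,
    Sse2,
    Avx2,
    Neon,
}

/// Pick the widest backend the running CPU reports, falling back to scalar.
pub fn select_backend() -> Backend {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            return Backend::Avx2;
        }
        if std::arch::is_x86_feature_detected!("sse2") {
            return Backend::Sse2;
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return Backend::Neon;
        }
    }
    Backend::Scalar
}

/// Scalar reference kernel: wrapping sum of all bytes.
fn scalar_sum(data: &[u8]) -> u32 {
    data.iter().fold(0u32, |acc, &b| acc.wrapping_add(b as u32))
}

/// Route a call to the selected backend. A real implementation would call a
/// `#[target_feature]` kernel per arm; this sketch reuses the scalar kernel.
pub fn dispatch_sum(backend: Backend, data: &[u8]) -> u32 {
    match backend {
        Backend::Avx2 | Backend::Sse2 | Backend::Neon | Backend::Scalar => scalar_sum(data),
    }
}
```

Selecting the backend once and storing it keeps feature detection out of the hot loop, which is the usual reason for a dispatch wrapper of this shape.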

Reviewed changes

Copilot reviewed 57 out of 60 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| tests/test_create_integration.rs | Adds integration tests for turbo-created duplicate-content sets and large explicit block size verification. |
| tests/md5x2_neon_placeholder.rs | Adds aarch64-only placeholder test for NEON MD5x2 state wiring. |
| tests/compare_with_parpar.rs | Adds feature-gated tests comparing Rust hashing results vs embedded ParPar via FFI. |
| src/verify/types.rs | Adds duplicate-block ambiguity detection to relax alignment checks when full-file hash matches. |
| src/verify/global_table.rs | Fixes get_file_blocks to include duplicates via iter_duplicates() and expands tests. |
| src/verify/global_engine.rs | Adjusts file-status logic to account for duplicate-content ambiguity; adds targeted unit test. |
| src/reed_solomon/simd/xor_jit/exec_mem.rs | Adds executable/mutable executable buffer management for XOR-JIT codegen. |
| src/reed_solomon/simd/xor_jit/bitplane.rs | Adds AVX2 bitplane prepare/finish and GF16 multiply-add helpers. |
| src/reed_solomon/simd/pshufb.rs | Refactors AVX2 loop for aligned/unaligned fast paths; adds x4 kernel + tests. |
| src/reed_solomon/simd/mod.rs | Exposes xor_jit module and re-exports xor-jit APIs on x86_64. |
| src/parpar_hasher/mod.rs | Introduces ParPar-style fused hasher module structure and backend layout. |
| src/parpar_hasher/md5x2_neon.rs | Adds aarch64 NEON MD5x2 backend implementation (with tests). |
| src/parpar_hasher/md5x2.rs | Adds shared MD5x2 backend trait contract. |
| src/parpar_hasher/hasher_input_dyn.rs | Adds runtime-dispatched HasherInput wrapper for selecting best backend. |
| src/parpar_hasher/hasher_input_arm64.rs | Adds aarch64 HasherInput driver using scalar MD5x2 + crc32fast CRC. |
| src/parpar_hasher/crc_clmul_avx512.rs | Adds AVX-512VL CRC folding variant for the fused hashing driver. |
| src/parpar_hasher/crc_armcrc.rs | Adds placeholder ARM CRC32 backend module. |
| src/parpar_hasher/ATTRIBUTION.md | Documents upstream ParPar/turbo sources and porting/decision rationale. |
| src/lib.rs | Exposes parpar_hasher and feature-gated ffi module at crate root. |
| src/ffi/wrapper.cpp | Adds C++ wrapper exposing ParPar MD5 and HasherInput over a C ABI. |
| src/ffi/mod.rs | Adds Rust-side safe wrappers for the C ABI (feature + x86_64 gated). |
| src/create/profile.rs | Adds create-path profiling phases/counters and env-gated CSV-ish emission. |
| src/create/mod.rs | Wires the create profiler module into create subsystem. |
| src/create/error.rs | Adds XOR-JIT checksum validation error variant. |
| src/checksum.rs | Adds chunked “fused” MD5+CRC update helpers and uses them in existing APIs. |
| scripts/turbo_dump_xorjit_body_avx2.c | Adds helper to dump turbo XOR-JIT code bodies for analysis. |
| scripts/turbo_dump_xor_prepare_packed_avx2.c | Adds helper to dump turbo packed prepare output for analysis. |
| scripts/turbo_dump_xor_finish_packed_avx2.c | Adds helper to dump turbo packed finish output for analysis. |
| scripts/profile_create_slow_paths.sh | Adds a perf+profiling harness for representative create workloads. |
| flake.nix | Adjusts dev-shell tools (valgrind moved under Linux-only tools list). |
| build.rs | Adds feature-gated ParPar C/C++ build steps for FFI comparisons. |
| benches/parpar_hasher_input.rs | Adds Criterion benches comparing naive/tier1/HasherInput variants. |
| benches/par2verify_compare.rs | Adds Criterion bench comparing par2rs vs turbo verify. |
| benches/md5x2_crc_fused.rs | Adds end-to-end fused hashing comparison bench variants. |
| benches/iai_simd.rs | Extends iai-callgrind benches with xor-jit bitplane kernels. |
| benches/iai_parpar_comparison.rs | Adds instruction-count comparison vs ParPar FFI (feature-gated). |
| benches/iai_par2verify_compare.rs | Adds iai-callgrind binary benches for par2rs vs turbo verify. |
| benches/iai_hasher_input.rs | Adds iai-callgrind benches for HasherInput vs helper vs naive. |
| benches/fused_hashing.rs | Adds microbench for tier1 fused hashing helpers. |
| benches/create_benchmark.rs | Fixes Criterion benchmark ID formatting (avoid % for gnuplot). |
| benches/crc_compare.rs | Adds benches comparing crc32fast vs crc-fast in relevant regimes. |
| benches/compare_with_parpar.rs | Adds wall-clock benches comparing Rust backends vs ParPar FFI. |
| benches/common/par2verify_fixture.rs | Adds reusable fixture/command plumbing for verify benches. |
| README.md | Updates benchmarking guidance and aligns license text to GPL-2.0-or-later. |
| Makefile | Updates benchmark-create-perf help text. |
| Cargo.toml | Switches license to GPL-2.0-or-later; adds parpar-compare feature, libc, cc build-dep, and new benches/dev-deps. |
| Cargo.lock | Updates lockfile for new dependencies (cc, libc, crc-fast, etc.). |
| .github/workflows/rust.yml | Adds cross-compile (aarch64) job and messaging about arch separation. |
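
For readers unfamiliar with the "GF16 multiply-add" operation named in the bitplane.rs row, here is a minimal scalar sketch of what such a kernel computes: `dst[i] ^= coef * src[i]` over GF(2^16) with the PAR2 reduction polynomial 0x1100B. The function names are hypothetical, and this bit-by-bit loop is only a reference model; the PR's AVX2 bitplane and PSHUFB kernels are vectorized and structured very differently.

```rust
/// PAR2's GF(2^16) reduction polynomial: x^16 + x^12 + x^3 + x + 1.
const POLY: u32 = 0x1100B;

/// Multiply two field elements with shift-and-xor, reducing modulo POLY.
pub fn gf16_mul(a: u16, b: u16) -> u16 {
    let mut acc: u32 = 0;
    let mut a = a as u32;
    let mut b = b;
    while b != 0 {
        if b & 1 != 0 {
            acc ^= a; // addition in GF(2^16) is XOR
        }
        b >>= 1;
        a <<= 1;
        if a & 0x1_0000 != 0 {
            a ^= POLY; // fold bit 16 back into the low 16 bits
        }
    }
    acc as u16
}

/// Reference multiply-add: dst[i] ^= coef * src[i] for every 16-bit word.
pub fn gf16_mul_add(dst: &mut [u16], src: &[u16], coef: u16) {
    assert_eq!(dst.len(), src.len());
    for (d, &s) in dst.iter_mut().zip(src) {
        *d ^= gf16_mul(s, coef);
    }
}
```

Because addition in this field is XOR, applying the same multiply-add twice restores the destination, which makes a convenient sanity check when comparing a fast kernel against a scalar reference.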


Comment thread src/parpar_hasher/md5x2_neon.rs
Comment thread src/parpar_hasher/hasher_input_dyn.rs
Comment thread src/parpar_hasher/hasher_input_dyn.rs
Comment thread build.rs
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a55ee3421b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread build.rs
Repository owner deleted a comment from codecov Bot Apr 29, 2026
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 24b0682fe5

Comment thread src/create/context.rs Outdated

codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 86.81145% with 668 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@6813f3c). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/parpar_hasher/hasher_input_arm64.rs | 0.00% | 246 Missing ⚠️ |
| src/reed_solomon/simd/xor_jit/encoder.rs | 91.05% | 92 Missing ⚠️ |
| src/reed_solomon/simd/xor_jit/exec_mem.rs | 81.18% | 73 Missing ⚠️ |
| src/parpar_hasher/crc_clmul_avx512.rs | 70.90% | 64 Missing ⚠️ |
| src/parpar_hasher/md5x2_avx512.rs | 67.52% | 63 Missing ⚠️ |
| src/reed_solomon/simd/xor_jit/bitplane.rs | 83.88% | 54 Missing ⚠️ |
| src/parpar_hasher/hasher_input_dyn.rs | 88.29% | 24 Missing ⚠️ |
| src/create/context.rs | 97.85% | 14 Missing ⚠️ |
| src/reed_solomon/simd/pshufb.rs | 93.20% | 14 Missing ⚠️ |
| src/checksum.rs | 50.00% | 12 Missing ⚠️ |

... and 3 more

Additional details and impacted files


@@           Coverage Diff           @@
##             main      #10   +/-   ##
=======================================
  Coverage        ?   90.01%           
=======================================
  Files           ?       86           
  Lines           ?    28312           
  Branches        ?        0           
=======================================
  Hits            ?    25486           
  Misses          ?     2826           
  Partials        ?        0           

| Flag | Coverage Δ |
| --- | --- |
| unittests | 90.01% <86.81%> (?) |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/create/backend.rs | 90.03% <ø> (ø) |
| src/create/error.rs | 100.00% <100.00%> (ø) |
| src/create/mod.rs | 100.00% <100.00%> (ø) |
| src/create/profile.rs | 100.00% <100.00%> (ø) |
| src/create/progress.rs | 100.00% <100.00%> (ø) |
| src/domain.rs | 94.21% <100.00%> (ø) |
| src/lib.rs | 100.00% <ø> (ø) |
| src/parpar_hasher/md5x2_scalar.rs | 100.00% <100.00%> (ø) |
| src/parpar_hasher/md5x2_sse2.rs | 100.00% <100.00%> (ø) |
| src/reed_solomon/simd/mod.rs | 85.71% <ø> (ø) |

... and 19 more


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f57a8c1a9e

Comment thread build.rs Outdated
mjc force-pushed the create-performance-backend branch from 301168d to a57fb30 on April 29, 2026 07:57
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a57fb3023d

Comment thread src/ffi/wrapper.cpp Outdated
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0612dc45d1

Comment thread src/create/backend.rs Outdated
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cc1b413eb2

Comment thread src/create/context.rs
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7fc6313da9

Comment thread src/parpar_hasher/hasher_input_dyn.rs Outdated
Copilot AI left a comment

Pull request overview

Copilot reviewed 61 out of 64 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/checksum.rs:176

  • The docstring for compute_md5_crc32_simultaneous still claims this is a “single pass” / “reads data only once”, but the new implementation updates two independent hashers (MD5 then CRC32) over each sub-slice. That’s still two reads per chunk (just with better cache locality).

Please update the documentation to match the actual behavior (cache-resident chunking) so callers don’t assume it is instruction-level fused hashing.

/// Compute MD5 and CRC32 simultaneously in a single pass (par2cmdline style)
///
/// This is the most efficient way to compute both checksums as it:
/// - Reads data only once (50% less memory bandwidth)
/// - Processes data while still in CPU cache
/// - Updates both hash states in the same loop
///
/// Based on par2cmdline-turbo's MD5CRC_Calc implementation which showed
/// ~40-60% performance improvement over separate computation.
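
To make the distinction in this comment concrete, the sketch below shows cache-resident chunking: two independent checksum states are each updated over the same sub-slice, so every chunk is read twice, with the second read likely served from cache. All names and the 16 KiB chunk size are hypothetical. To stay dependency-free, FNV-1a stands in for MD5 and the CRC-32 is a plain bitwise implementation; neither is the crate's actual code.

```rust
/// Bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320) state update.
fn crc32_update(mut state: u32, data: &[u8]) -> u32 {
    for &byte in data {
        state ^= byte as u32;
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (state & 1).wrapping_neg();
            state = (state >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    state
}

/// FNV-1a 64-bit update, used here only as a stand-in for MD5.
fn fnv1a_update(mut state: u64, data: &[u8]) -> u64 {
    for &byte in data {
        state ^= byte as u64;
        state = state.wrapping_mul(0x0000_0100_0000_01B3);
    }
    state
}

/// Hypothetical chunk size, chosen to fit comfortably in L1/L2 cache.
const CHUNK: usize = 16 * 1024;

/// Update both checksums chunk by chunk. Each chunk is traversed twice:
/// once per hasher. This is chunked hashing, not instruction-level fusion.
pub fn chunked_dual_checksum(data: &[u8]) -> (u64, u32) {
    let mut fnv = 0xCBF2_9CE4_8422_2325u64; // FNV-1a offset basis
    let mut crc = 0xFFFF_FFFFu32; // CRC-32 initial state
    for chunk in data.chunks(CHUNK) {
        fnv = fnv1a_update(fnv, chunk); // first pass over the chunk
        crc = crc32_update(crc, chunk); // second pass, likely cache-resident
    }
    (fnv, !crc)
}
```

The results are identical to hashing the whole buffer with each function separately; only the memory access pattern changes, which is why a docstring claiming a single read would overstate what this style of loop does.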


Comment thread src/reed_solomon/simd/xor_jit/exec_mem.rs
Copilot AI left a comment

Pull request overview

Copilot reviewed 61 out of 64 changed files in this pull request and generated 4 comments.



Comment thread src/reed_solomon/simd/xor_jit/exec_mem.rs
Comment thread src/reed_solomon/simd/xor_jit/exec_mem.rs Outdated
Comment thread src/verify/types.rs
Comment thread src/parpar_hasher/hasher_input_dyn.rs
mjc force-pushed the create-performance-backend branch from 95258b2 to 2537d40 on April 29, 2026 17:24
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 982fa3456e

Comment thread src/ffi/mod.rs Outdated
Copilot AI left a comment

Pull request overview

Copilot reviewed 61 out of 64 changed files in this pull request and generated 1 comment.



Comment thread tests/compare_with_parpar.rs
Copilot AI left a comment

Pull request overview

Copilot reviewed 61 out of 64 changed files in this pull request and generated 2 comments.



Comment thread src/parpar_hasher/hasher_input_dyn.rs
Comment thread src/reed_solomon/simd/pshufb.rs


2 participants