
feat(simd_soa): iter_i32x16 / iter_i64x8 typed lane iterators on MultiLaneColumn #228

Merged
AdaWorldAPI merged 1 commit into master from claude/v3-substrate-migration-review-o0yoxv
Jul 2, 2026
@AdaWorldAPI

Owner

Adds the signed-integer lane views the gridlake batch SoA needs, unblocking the lance-graph lane-J wiring flagged in that probe's COMMENTARY.

Why

MultiLaneColumn (the gridlake carrier — see examples/onebrc_cascade_probe.rs) exposed only f32x16 / f64x8 / u64x8 / u8x64 lane views. lance-graph's gridlake batch SoA carries i32 min/max and i64 sum columns; #227's onebrc probe only got away without these because it used f32 min/max columns. Without signed integer lanes, a consumer batch SoA can't be viewed through the carrier directly.
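The description does not say why an f32 or f64 view is a poor stand-in for these columns. One plausible concern, offered here as an illustration and not as the author's stated rationale, is that integer values do not all survive a round trip through a float of the same width. The helper names below are hypothetical.

```rust
/// Round-trip an i32 through f32, as a recast-based workaround would.
/// Hypothetical helper, not part of the PR.
pub fn i32_via_f32(v: i32) -> i32 {
    // f32 has a 24-bit significand, so integers above 2^24 may be rounded.
    (v as f32) as i32
}

/// Round-trip an i64 through f64.
/// Hypothetical helper, not part of the PR.
pub fn i64_via_f64(v: i64) -> i64 {
    // f64 has a 53-bit significand, so integers above 2^53 may be rounded.
    (v as f64) as i64
}
```

For example, `i32_via_f32((1 << 24) + 1)` returns `1 << 24`, so a min/max column holding that value would silently change if it were carried as f32.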

Change

  • i32x16_from_chunk / i64x8_from_chunk — little-endian decoders mirroring the existing f32x16_from_chunk / u64x8_from_chunk (scalar from_le_bytes loop, lowered to a single register-width load on LE targets; no pointer-cast of the u8-aligned Arc<[u8]>).
  • iter_i32x16 / iter_i64x8 methods + len_i32x16 / len_i64x8, routed through crate::simd::{I32x16, I64x8} per the W1a layering rule (never dipping into simd_avx512/simd_neon/scalar directly).
  • Parity tests: iter_i32x16_le_round_trip (includes negative values, showing that the sign survives the LE decode) and iter_i64x8_le_round_trip; also extended the empty-count, 3-lane-count, and len asserts.
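The decoders described in the list above could look roughly like the following. This is a minimal sketch, not the code from the PR: `I32x16` and `I64x8` here are plain-array stand-ins for the `crate::simd` lane types, whose real definitions are not shown, and the `&[u8; 64]` parameter type is an assumption.

```rust
/// Stand-in for `crate::simd::I32x16` (assumption: the real type is not shown).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct I32x16(pub [i32; 16]);

/// Stand-in for `crate::simd::I64x8` (assumption: the real type is not shown).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct I64x8(pub [i64; 8]);

/// Decode one 64-byte chunk as 16 little-endian i32 lanes.
/// Works on byte-aligned input because no pointer is cast to `*const i32`.
pub fn i32x16_from_chunk(chunk: &[u8; 64]) -> I32x16 {
    let mut lanes = [0i32; 16];
    for (lane, src) in lanes.iter_mut().zip(chunk.chunks_exact(4)) {
        let mut buf = [0u8; 4];
        buf.copy_from_slice(src);
        // from_le_bytes keeps the two's-complement bit pattern, sign bit included.
        *lane = i32::from_le_bytes(buf);
    }
    I32x16(lanes)
}

/// Decode one 64-byte chunk as 8 little-endian i64 lanes.
pub fn i64x8_from_chunk(chunk: &[u8; 64]) -> I64x8 {
    let mut lanes = [0i64; 8];
    for (lane, src) in lanes.iter_mut().zip(chunk.chunks_exact(8)) {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(src);
        *lane = i64::from_le_bytes(buf);
    }
    I64x8(lanes)
}
```

A round-trip check in the spirit of the parity tests would write values such as `-1`, `i32::MIN`, and `i32::MAX` with `to_le_bytes`, decode the chunk, and compare the lanes against the originals.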

These are layout-only zero-copy reinterpretations of the backing store (the same category as the existing typed iterators), not new compute kernels — no per-arch AVX/NEON/scalar backend needed beyond the lane types crate::simd already provides.
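To illustrate what a typed lane view over the shared byte store might look like, here is a sketch under stated assumptions. `ByteColumn` is a hypothetical stand-in for `MultiLaneColumn`, lanes are returned as plain `[i64; 8]` arrays instead of `crate::simd::I64x8`, and ignoring a trailing partial chunk is an assumption, since the PR text does not describe how a remainder is handled.

```rust
use std::sync::Arc;

/// Bytes per 512-bit lane group (8 lanes x 8 bytes).
const CHUNK_BYTES: usize = 64;

/// Hypothetical stand-in for the carrier: a shared, byte-aligned backing store.
pub struct ByteColumn {
    bytes: Arc<[u8]>,
}

impl ByteColumn {
    pub fn new(bytes: Vec<u8>) -> Self {
        Self { bytes: Arc::from(bytes) }
    }

    /// Number of complete 8 x i64 groups.
    /// Assumption: trailing bytes that do not fill a group are ignored.
    pub fn len_i64x8(&self) -> usize {
        self.bytes.len() / CHUNK_BYTES
    }

    /// Iterate the store as groups of 8 little-endian i64 lanes.
    /// The store itself is borrowed, not cloned; each group is decoded on the fly.
    pub fn iter_i64x8(&self) -> impl Iterator<Item = [i64; 8]> + '_ {
        self.bytes.chunks_exact(CHUNK_BYTES).map(|chunk| {
            let mut lanes = [0i64; 8];
            for (lane, src) in lanes.iter_mut().zip(chunk.chunks_exact(8)) {
                let mut buf = [0u8; 8];
                buf.copy_from_slice(src);
                *lane = i64::from_le_bytes(buf);
            }
            lanes
        })
    }
}
```

In this sketch the iterator length always matches `len_i64x8()`, which is the kind of invariant the extended count and len asserts would pin down.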

Verification

  • simd_soa module: 13/13 tests pass under the v3 default; library builds clean under --config .cargo/config-native.toml (real AVX-512 backend on an avx512f host) — so the iterators light up the actual 512-bit zmm path, not just the AVX2-halves fallback.
  • clippy -D warnings clean; fmt clean.

🤖 Generated with Claude Code

https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM



…iLaneColumn

Follow-up unblocking the gridlake wiring (lance-graph #635 COMMENTARY):
lane J's GridBatch carries i32 min/max and i64 sum columns, but
MultiLaneColumn only exposed f32/f64/u64/u8 lane views — #227's onebrc
gridlake probe got away with f32 min/max columns. Add the signed integer
lane widths so a batch SoA can be viewed through the gridlake carrier
directly, no f32 recast.

- `i32x16_from_chunk` / `i64x8_from_chunk` — LE decoders mirroring the
  existing `f32x16_from_chunk` / `u64x8_from_chunk` (scalar `from_le_bytes`
  loop, lowered to a single register-width load on LE targets; no pointer
  cast of the u8-aligned Arc<[u8]>).
- `iter_i32x16` / `iter_i64x8` methods + `len_i32x16` / `len_i64x8`,
  routed through `crate::simd::{I32x16, I64x8}` per the W1a layering rule
  (never dipping into simd_avx512/simd_neon/scalar directly).
- Parity tests: `iter_i32x16_le_round_trip` (incl. negatives, shows the
  sign survives the decode) + `iter_i64x8_le_round_trip`;
  extended the empty-count, 3-lane-count, and len asserts.

These are layout-only zero-copy reinterpretations of the backing store
(the same category as the existing typed iterators), not new compute
kernels — no per-arch AVX/NEON/scalar backend needed beyond the lane
types crate::simd already provides.

simd_soa: 13/13 tests pass; clippy -D warnings clean; fmt clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
@coderabbitai

coderabbitai Bot commented Jul 2, 2026

Contributor

Warning

Review limit reached

@AdaWorldAPI, you've reached your PR review limit, so we couldn't start this review.


Review details
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 4ec4f27c-1d6c-44c1-b743-abf8282162e5

📥 Commits

Reviewing files that changed from the base of the PR and between de72d15 and ac59b1d.

📒 Files selected for processing (1)
  • src/simd_soa.rs


AdaWorldAPI merged commit ffb12fd into master on Jul 2, 2026
17 checks passed

2 participants