feat(simd_soa): iter_i32x16 / iter_i64x8 typed lane iterators on MultiLaneColumn #228
…iLaneColumn

Follow-up unblocking the gridlake wiring (lance-graph #635 COMMENTARY): lane J's GridBatch carries i32 min/max and i64 sum columns, but MultiLaneColumn only exposed f32/f64/u64/u8 lane views; #227's onebrc gridlake probe got away with f32 min/max columns. Add the signed integer lane widths so a batch SoA can be viewed through the gridlake carrier directly, no f32 recast.

- `i32x16_from_chunk` / `i64x8_from_chunk`: LE decoders mirroring the existing `f32x16_from_chunk` / `u64x8_from_chunk` (scalar `from_le_bytes` loop, lowered to a single register-width load on LE targets; no pointer cast of the u8-aligned Arc<[u8]>).
- `iter_i32x16` / `iter_i64x8` methods + `len_i32x16` / `len_i64x8`, routed through `crate::simd::{I32x16, I64x8}` per the W1a layering rule (never dipping into simd_avx512/simd_neon/scalar directly).
- Parity tests: `iter_i32x16_le_round_trip` (incl. negatives, proves sign-extension survives the decode) + `iter_i64x8_le_round_trip`; extended the empty-count, 3-lane-count, and len asserts.

These are layout-only zero-copy reinterpretations of the backing store (the same category as the existing typed iterators), not new compute kernels: no per-arch AVX/NEON/scalar backend needed beyond the lane types crate::simd already provides.

simd_soa: 13/13 tests pass; clippy -D warnings clean; fmt clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
Adds the signed-integer lane views the gridlake batch SoA needs, unblocking the lance-graph lane-J wiring flagged in that probe's COMMENTARY.
Why
`MultiLaneColumn` (the gridlake carrier; see `examples/onebrc_cascade_probe.rs`) exposed only `f32x16` / `f64x8` / `u64x8` / `u8x64` lane views. lance-graph's gridlake batch SoA carries i32 min/max and i64 sum columns; #227's onebrc probe only got away without these because it used f32 min/max columns. Without signed integer lanes, a consumer batch SoA can't be viewed through the carrier directly.

Change

- `i32x16_from_chunk` / `i64x8_from_chunk`: little-endian decoders mirroring the existing `f32x16_from_chunk` / `u64x8_from_chunk` (scalar `from_le_bytes` loop, lowered to a single register-width load on LE targets; no pointer-cast of the `u8`-aligned `Arc<[u8]>`).
- `iter_i32x16` / `iter_i64x8` methods + `len_i32x16` / `len_i64x8`, routed through `crate::simd::{I32x16, I64x8}` per the W1a layering rule (never dipping into `simd_avx512` / `simd_neon` / `scalar` directly).
- Parity tests: `iter_i32x16_le_round_trip` (includes negatives, which proves sign-extension survives the LE decode) + `iter_i64x8_le_round_trip`; extended the empty-count, 3-lane-count, and `len` asserts.

These are layout-only zero-copy reinterpretations of the backing store (the same category as the existing typed iterators), not new compute kernels: no per-arch AVX/NEON/scalar backend is needed beyond the lane types `crate::simd` already provides.

Verification

- `simd_soa` module: 13/13 tests pass under the v3 default; the library builds clean under `--config .cargo/config-native.toml` (real AVX-512 backend on an `avx512f` host), so the iterators light up the actual 512-bit `zmm` path, not just the AVX2-halves fallback.
- clippy `-D warnings` clean; fmt clean.

🤖 Generated with Claude Code
https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
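
For readers without the diff open, the decode path described under Change can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the `*_sketch` function names, the `Lanes16` alias, and `encode_i32_le` are hypothetical, and a plain `[i32; 16]` stands in for `crate::simd::I32x16`, whose API is not shown here. The `i64x8` variant would presumably follow the same shape with 8-byte lanes.

```rust
// Hypothetical stand-in for crate::simd::I32x16 (its real API is not shown
// in this PR); a plain array keeps the sketch self-contained.
type Lanes16 = [i32; 16];

// 16 lanes * 4 bytes per i32 = one 512-bit register's worth of data.
const LANE_BYTES: usize = 64;

/// Decode one 64-byte chunk into 16 little-endian i32 values.
/// Works byte by byte, so the chunk needs no alignment beyond u8
/// and no pointer cast is involved.
fn i32x16_from_chunk_sketch(chunk: &[u8]) -> Lanes16 {
    assert_eq!(chunk.len(), LANE_BYTES, "chunk must be exactly 64 bytes");
    let mut out = [0i32; 16];
    for (lane, b) in out.iter_mut().zip(chunk.chunks_exact(4)) {
        // from_le_bytes reassembles the two's-complement value, so
        // negative inputs keep their sign.
        *lane = i32::from_le_bytes([b[0], b[1], b[2], b[3]]);
    }
    out
}

/// Number of full 16-lane groups in the backing bytes.
/// Trailing bytes that do not fill a group are not counted.
fn len_i32x16_sketch(bytes: &[u8]) -> usize {
    bytes.len() / LANE_BYTES
}

/// Iterate the backing bytes as 16-lane i32 groups, skipping any
/// trailing partial group.
fn iter_i32x16_sketch(bytes: &[u8]) -> impl Iterator<Item = Lanes16> + '_ {
    bytes.chunks_exact(LANE_BYTES).map(i32x16_from_chunk_sketch)
}

/// Test helper (hypothetical): serialize i32 values as little-endian bytes.
fn encode_i32_le(values: &[i32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}
```

A round trip in the spirit of `iter_i32x16_le_round_trip` would encode a mix of positive and negative values with `encode_i32_le`, feed the bytes to `iter_i32x16_sketch`, and compare the flattened lanes against the input. Decoding through `from_le_bytes` rather than casting the byte pointer is what lets the backing store stay `u8`-aligned.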