Create OFFLINE-VECTORIZED NOTES/LOG/TODO files under agents/ to track the new offline vectorized-index porting work.

The notes capture scope and constraints for the first phase:

- Start with Mulshrolate1RX.
- Implement AVX2 Index32x8 and AVX-512 Index32x16.
- Preserve scalar fallback and correctness across varying TABLE_DATA widths.

The TODO defines concrete implementation, RawCString synchronization, validation, and commit checkpoints. The log records initial repository audit findings and the selected technical approach.
Introduce offline generated Index32x8 and Index32x16 entry points for
Mulshrolate1RX using x86 intrinsics in the non-inline C path:

- Add routine naming macros and capability defines so generated test paths can
  conditionally compile vector checks.
- Add AVX2 x8 and AVX-512 x16 arithmetic helpers (mul/rotate/shift), with
  scalar lane-wise table lookups retained for TABLE_DATA type safety.
- Add scalar fallback helpers and runtime CPU gating for GCC/Clang target
  attribute builds.
- Keep CPH_INLINE_ROUTINES behavior intact by emitting these routines only in
  the non-inline section.

Expand generated-test coverage:

- Add optional extern declarations for Index32x8/Index32x16 in
  CompiledPerfectHashTableTest.c.
- Validate first 8/16 keys against scalar INDEX_ROUTINE when the routines are
  available.

Fixes folded into this commit:

- Remove redundant 'static' from FORCEINLINE helpers to avoid duplicate
  storage-class errors in generated GCC builds.
- Correct IACA variant for Mulshrolate1RX by applying the missing
  Vertex1 >>= SEED3_BYTE1 stage before table lookup.

Sync RawCString payload headers with template updates so codegen output uses
the new routines and test validation.
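The shape of the x8 path described above (vector mul/rotate/shift, scalar lane-wise table lookups, runtime CPU gating, and a check against the scalar routine) can be sketched roughly as follows. This is an illustration under stated assumptions, not the generated code: every `sketch_*` name, the parameter struct, the exact order of the multiply, rotate, and shift stages, and the final add-and-mask step are hypothetical stand-ins, and a 16-bit table is used as just one possible TABLE_DATA width.

```c
#include <assert.h>
#include <stdint.h>

/* Vector path is only compiled for x86 with GCC/Clang; otherwise scalar only. */
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define SKETCH_HAS_AVX2 1
#else
#define SKETCH_HAS_AVX2 0
#endif

/* Hypothetical parameters; real generated code bakes per-table seeds in. */
typedef struct {
    uint32_t seed1, seed2;
    uint32_t rotate;       /* assumed to play the role of SEED3_BYTE2 */
    uint32_t shift;        /* assumed to play the role of SEED3_BYTE1 */
    uint32_t index_mask;
    const uint16_t *table; /* needs 2^(32 - shift) entries; width may differ */
} sketch_params;

static uint32_t sketch_rotr32(uint32_t v, uint32_t r)
{
    r &= 31;
    return (v >> r) | (v << ((32 - r) & 31));
}

/* Scalar reference: multiply, rotate, shift, then two table lookups. */
static uint32_t sketch_index_scalar(const sketch_params *p, uint32_t key)
{
    uint32_t v1 = sketch_rotr32(key * p->seed1, p->rotate) >> p->shift;
    uint32_t v2 = (key * p->seed2) >> p->shift;
    return ((uint32_t)p->table[v1] + (uint32_t)p->table[v2]) & p->index_mask;
}

#if SKETCH_HAS_AVX2
/* target("avx2") lets this compile without -mavx2 on the command line. */
__attribute__((target("avx2")))
static void sketch_index_x8_avx2(const sketch_params *p,
                                 const uint32_t keys[8], uint32_t out[8])
{
    uint32_t r = p->rotate & 31, a[8], b[8];
    __m128i sh = _mm_cvtsi32_si128((int)p->shift);
    __m128i rr = _mm_cvtsi32_si128((int)r);
    __m128i rl = _mm_cvtsi32_si128((int)(32 - r)); /* count 32 yields zero */
    __m256i k  = _mm256_loadu_si256((const __m256i *)keys);
    __m256i v1 = _mm256_mullo_epi32(k, _mm256_set1_epi32((int)p->seed1));
    __m256i v2 = _mm256_mullo_epi32(k, _mm256_set1_epi32((int)p->seed2));

    v1 = _mm256_or_si256(_mm256_srl_epi32(v1, rr), _mm256_sll_epi32(v1, rl));
    v1 = _mm256_srl_epi32(v1, sh);
    v2 = _mm256_srl_epi32(v2, sh);

    /* Spill lanes and index the table in scalar code, so the element type
       of the table never has to match a particular gather width. */
    _mm256_storeu_si256((__m256i *)a, v1);
    _mm256_storeu_si256((__m256i *)b, v2);
    for (int i = 0; i < 8; i++)
        out[i] = ((uint32_t)p->table[a[i]] + (uint32_t)p->table[b[i]])
                 & p->index_mask;
}
#endif

/* Entry point: use AVX2 when the CPU reports it, else the scalar fallback. */
static void sketch_index_x8(const sketch_params *p,
                            const uint32_t keys[8], uint32_t out[8])
{
    assert(p && keys && out);
#if SKETCH_HAS_AVX2
    if (__builtin_cpu_supports("avx2")) {
        sketch_index_x8_avx2(p, keys, out);
        return;
    }
#endif
    for (int i = 0; i < 8; i++)
        out[i] = sketch_index_scalar(p, keys[i]);
}

/* Mirrors the test idea: the first 8 keys must agree with the scalar path. */
static int sketch_validate_x8(const sketch_params *p, const uint32_t keys[8])
{
    uint32_t out[8];
    sketch_index_x8(p, keys, out);
    for (int i = 0; i < 8; i++)
        if (out[i] != sketch_index_scalar(p, keys[i]))
            return 0;
    return 1;
}
```

A caller would fill a `sketch_params` with a table of 2^(32 - shift) entries and call `sketch_validate_x8` on its first eight keys. Keeping the lookups scalar trades some throughput for independence from the table element width, which is the type-safety concern the commit message names.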
Update offline vectorization tracking docs after implementation and
end-to-end validation.

- LOG: append implementation details, codegen test failure/fix notes, rebuild
  status, GCC/Clang generated-project validation, and scalar/vector benchmark
  outputs.
- TODO: mark completed work items for Mulshrolate1RX vector routines, test
  integration, RawCString synchronization, and validation steps; keep remaining
  benchmark-template work open.
- NOTES: capture current-state summary and first benchmark observations for
  follow-on optimization/tuning.