Normalize Unicode dashes in slug generation #613
Open
marcodejongh wants to merge 1 commit into main
Non-ASCII dash characters (en dash, em dash, non-breaking hyphen, minus sign) in database content were being stripped by the slug generation regex instead of being converted to ASCII hyphens. This caused slug mismatches like "rows-kb1-kb2-118-columns-ak" instead of the expected "rows-kb1-kb2-1-18-columns-a-k" when the database contained Unicode dashes in size descriptions.

Added a normalizeDashes() helper function that converts various Unicode dash characters to standard ASCII hyphens before slug processing. This is applied to all slug generation functions: generateClimbSlug, generateSlugFromText, generateDescriptionSlug, and generateLayoutSlug.

Also refactored getLayoutBySlug in slug-utils.ts to use the shared generateLayoutSlug helper for consistency.

Resolves BOARDSESH-2
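The mismatch described above can be reproduced with a small sketch. The `naiveSlug` function and the input string are hypothetical stand-ins, not the project's actual code or data; the sketch only assumes that the old regex dropped every character outside ASCII letters, digits, whitespace, and `-`.

```typescript
// Hypothetical stand-in for the pre-fix slug logic; the real regex may differ.
function naiveSlug(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // a Unicode dash is not "-", so it is removed outright
    .trim()
    .replace(/\s+/g, "-")
    .replace(/-+/g, "-");
}

// Assumed example of a size description; \u2013 is an en dash.
const withEnDash = "Rows KB1 KB2 1\u201318 Columns A\u2013K";
const withHyphen = "Rows KB1 KB2 1-18 Columns A-K";

console.log(naiveSlug(withEnDash)); // rows-kb1-kb2-118-columns-ak
console.log(naiveSlug(withHyphen)); // rows-kb1-kb2-1-18-columns-a-k
```

The two inputs look identical on screen, yet they produce different slugs, which is why a lookup by slug could miss the stored row.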
Claude Review
✅ Ready to merge. Minor issues noted below, but nothing blocking.
Issues
Suggested Test Addition

    describe('Unicode dash normalization', () => {
      it('should normalize en dash to hyphen in climb slug', () => {
        expect(generateClimbSlug('Climb – Name')).toBe('climb-name');
      });

      it('should normalize em dash to hyphen', () => {
        expect(generateSlugFromText('Test — Value')).toBe('test-value');
      });

      it('should normalize minus sign to hyphen', () => {
        expect(generateLayoutSlug('Layout − Name')).toBe('layout-name');
      });
    });
Summary
This PR extracts and centralizes Unicode dash normalization logic across slug generation functions to ensure consistent handling of various dash-like characters (en dash, em dash, minus sign, etc.) that may appear in user input.
Key Changes
- Added a normalizeDashes() helper function in url-utils.ts that converts Unicode dash variants (U+2010–U+2015, U+2212, U+FE58, U+FE63, U+FF0D) to ASCII hyphens
- Applied the normalization in the slug generation functions:
  - generateClimbSlug()
  - generateSlugFromText()
  - generateDescriptionSlug()
  - generateLayoutSlug()
- Refactored getLayoutBySlug() in slug-utils.ts to use the new generateLayoutSlug() helper, removing ~20 lines of duplicated slug generation logic
- Exported generateLayoutSlug() from url-utils.ts for reuse across the codebase

Implementation Details
The normalizeDashes() function uses a regex pattern to replace all common Unicode dash/hyphen variants with a standard ASCII hyphen (-). This ensures that whether users paste content with en dashes, em dashes, or other Unicode variants, the resulting slugs are consistent and predictable.

This change improves maintainability by centralizing slug generation logic and prevents potential slug mismatches caused by different dash character encodings.
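A minimal sketch of what such a helper might look like, built only from the code points listed in the summary. The actual implementation in url-utils.ts is not shown on this page, so the regex layout and the `slugify` wrapper here are assumptions for illustration.

```typescript
// Code points from the PR summary: U+2010–U+2015, U+2212, U+FE58, U+FE63, U+FF0D.
const UNICODE_DASHES = /[\u2010-\u2015\u2212\uFE58\uFE63\uFF0D]/g;

// Replace each Unicode dash variant with an ASCII hyphen.
function normalizeDashes(text: string): string {
  return text.replace(UNICODE_DASHES, "-");
}

// Hypothetical slug helper (not the project's code) showing where the
// normalization step would sit: before any character stripping.
function slugify(text: string): string {
  return normalizeDashes(text)
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // drop anything that is not a letter, digit, space, or "-"
    .trim()
    .replace(/\s+/g, "-")         // whitespace runs become a single hyphen
    .replace(/-+/g, "-")          // collapse repeated hyphens
    .replace(/^-|-$/g, "");       // trim a leading or trailing hyphen
}

console.log(slugify("Rows KB1 KB2 1\u201318 Columns A\u2013K")); // rows-kb1-kb2-1-18-columns-a-k
console.log(slugify("Test \u2014 Value"));                       // test-value
```

Running the normalization first means the stripping regex only ever sees ASCII hyphens, so the dash survives as a word separator instead of disappearing.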