Skip to content

Conversation

@moritz-gross
Copy link
Contributor

@moritz-gross moritz-gross commented Jan 11, 2026

while working with LazyLock, I saw that UPPER_ROMAN_NUMERAL and LOWER_ROMAN_NUMERAL have two matching operators ^ at the start, making them not match leading whitespace. Is that the correct behaviour?
The comment in the code sounds like they aren't really needed generally though.

in the added test cases, the ones with leading whitespace are failing (cargo test test_roman_numeral_regex)

Copy link
Collaborator

@NSoiffer NSoiffer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When they get used, they have been trimmed. If they appear in some text string, which might include a space, then they shouldn't match.

I thought some braille codes I started with did something special for roman numerals, but they didn't so I think that is what the comment is about. However, I see at least Finnish braille cares. Looking at the code, I see several places where intended roman numerals get cleaned up (three variables V I I which should be a single token VII). Also, they are used in chemistry.

@moritz-gross moritz-gross requested a review from NSoiffer January 12, 2026 14:14
@moritz-gross
Copy link
Contributor Author

ok sounds good, do we still want to add this test case, also as a way to document the expected behaviour?

@NSoiffer
Copy link
Collaborator

There are higher level tests "roman_numeral" and "not_roman_numeral", so I'm not sure these are needed. It doesn't hurt have more tests. However, they do need to be adjusted.

@moritz-gross
Copy link
Contributor Author

let's work on other things for now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants