fix(lexer): support size_t literal, separator chars, 'module'#241
fix(lexer): support size_t literal, separator chars, 'module'#241APokorny wants to merge 1 commit intoboostorg:developfrom
Conversation
The integer parsing grammars still have to be updated.
|
A few days ago I added user defined and standard library literal tokens. re2c works fine, but I ran into issues with xpressive. I hope I can get back to it soon. Should changes for that go into the same .. or a different branch? |
|
I withdraw my previous comment :) I think we should put all related functional changes into the same PR and handle all the various lexers in parallel. I do have a mild preference for separating features (keywords, number separators, literals), into separate commits/PRs, but if they are logically related it's fine. Any thoughts on the CI failures? |
|
I see you have also upgraded re2c from 1.0.2 to 4.1. Actually I'm very interested in trying newer versions of re2c, but I think we should make that a separate PR (and also record what we expect to get in terms of improvements etc.). Can you build with 1.0.2 for now? |
| #define HEXDIGIT "[0-9a-fA-F]" | ||
| #define OPTSIGN "[-+]?" | ||
| #define BINARYDIGIT "[01]" | ||
| #define SIGN "[-+]?" |
There was a problem hiding this comment.
Minor, but can we leave this as OPTSIGN? More exact, and also this SIGN doesn't line up with the other strings
|
We should also think a bit about language version support here also. Digit separators were introduced in C++14, which we don't have a separate flag for, but you could argue "0x" would be appropriate at least. size_t literals are C++23 and we don't have a flag for that yet. |
- Fix RE2C code for numbers (binary and digit separators) - Revert to RE2C version 1.0.2, for now - Revamp token ids to minimize changes - Restore existing and more accurate name OPTSIGN in slex - Add binary literal support to lexertl - Fix xlex support for size_t literals - Add test tokens for octal, binary, and hex literals
|
Thats great! - I was pulled off into other projects and had no more time working on this in the last few weeks. |
|
Closed in favor of the related #242 which is a superset |
- Fix RE2C code for numbers (binary and digit separators) - Revert to RE2C version 1.0.2, for now - Revamp token ids to minimize changes - Restore existing and more accurate name OPTSIGN in slex - Add binary literal support to lexertl - Fix xlex support for size_t literals - Add test tokens for octal, binary, and hex literals
This extends the lexers with some of the things that seemed to be missing.
For the separator chars in number literals ', the spirit parsers would need to be adjusted.
I do not remember, whether that could be done by wrapping a scanner or by replacing the parsers..
Additionally I was not sure, at what point of wave these spirit grammars play a role.
After that I am curious about user defined literals...