Conversation
…o support optional semicolons and the `GO` keyword.
- Resolve grammar conflicts for statement vs _tsql_function_body_statement - Enable CREATE PROCEDURE parsing with T-SQL parameter support (@param) - Add SET NOCOUNT ON/OFF support - Fix DECLARE TABLE with T-SQL variables (@tablevar) - Allow INSERT into table variables (@user) - Fix UPDATE...FROM with table variables and JOIN precedence - Add relation to accept $.parameter for T-SQL table vars - Create queries/structure.scm for code navigation (procedures, tables, variables)
There was a problem hiding this comment.
Pull request overview
This PR updates the SQL tree-sitter grammar to improve T-SQL support (procedures, identifiers/parameters, and additional statements) and adds new corpus/queries to validate and enable code navigation.
Changes:
- Add/adjust T-SQL grammar support (procedures,
GO,SET NOCOUNT, variables/parameters, bracket identifiers, statement forms). - Add a new procedure corpus (
test/corpus/procedures.txt) and a structure query file for code navigation. - Refactor/remove prior transaction parsing and tweak several statement grammars (e.g.,
SET,SELECT,TRUNCATE,DROP).
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 15 comments.
Show a summary per file
| File | Description |
|---|---|
| test_tsql_select.sql | Adds a basic T-SQL select sample using bracket identifiers and @ parameters. |
| test_simple.sql | Adds a procedure sample, currently as a quoted string literal. |
| test_semi.sql | Adds a procedure sample with semicolon, currently as a quoted string literal. |
| test_ref.sql | Adds a reference sample, currently as a quoted string literal. |
| test_proc_final.sql | Adds a T-SQL procedure sample, currently as a quoted string literal. |
| test/corpus/procedures.txt | New corpus coverage for procedure parsing across dialects (Postgres + T-SQL). |
| queries/structure.scm | Adds structure queries for procedures/functions/tables/table variables. |
| grammar/transactions.js | Removes the transaction rule set (now empty). |
| grammar/statements/truncate.js | Requires at least one truncate target. |
| grammar/statements/set.js | Expands T-SQL-related SET support (e.g., NOCOUNT) and adjusts assignment parsing. |
| grammar/statements/select.js | Updates term parsing (including T-SQL alias = expr) and join associativity tweaks. |
| grammar/statements/insert.js | Allows insert target to be an object reference or parameter. |
| grammar/statements/index.js | Adds multiple T-SQL statement forms (IF/RETURN/PRINT/RAISERROR/EXEC/GOTO/labels/transaction statements) and semicolon flexibility. |
| grammar/statements/drop.js | Adds DROP PROCEDURE/PROC support. |
| grammar/statements/create.js | Wires in procedure rules but also adds a separate local T-SQL create_procedure implementation. |
| grammar/statements/create-procedure.js | New procedure grammar supporting multiple procedure body styles including T-SQL. |
| grammar/statements/create-function.js | Adjusts precedence to reduce conflicts with T-SQL function bodies/returns. |
| grammar/statements/comment.js | Adds COMMENT ON PROCEDURE ... target support. |
| grammar/keywords.js | Adds T-SQL keywords (e.g., proc, go, nocount, print, raiserror, goto, exec). |
| grammar/expressions.js | Expands object references for T-SQL multi-part names, adds bracket identifiers and @ parameters. |
| grammar.js | Updates top-level program rule (adds GO, optional semicolons) and conflict declarations. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ), | ||
| ), | ||
| _tsql_bracket_identifier: _ => /\[[^\]]*\]/, | ||
| _tsql_parameter: _ => /@[A-Za-z_0-9]*/, |
There was a problem hiding this comment.
_tsql_parameter currently matches @ by itself (because of *) and also allows names starting with digits. That will accept invalid T-SQL and can create ambiguous parses. Tighten the regex to require at least one identifier character after @ and (ideally) enforce a valid starting character, e.g. @[A-Za-z_][A-Za-z_0-9]*.
| _tsql_parameter: _ => /@[A-Za-z_0-9]*/, | |
| _tsql_parameter: _ => /@[_A-Za-z\u00C0-\u017F][0-9A-Za-z_\u00C0-\u017F]*/, |
| @@ -0,0 +1 @@ | |||
| "SELECT * FROM TestProc" | |||
There was a problem hiding this comment.
This .sql fixture contains a quoted string literal rather than SQL text. If it’s intended to be parsed as SQL, drop the surrounding quotes; otherwise, consider relocating/documenting the fixture to avoid confusion.
| "SELECT * FROM TestProc" | |
| SELECT * FROM TestProc; |
| @@ -0,0 +1 @@ | |||
| "CREATE PROCEDURE [dbo].[TestProc] @ID INT AS BEGIN SELECT 1 END;" | |||
There was a problem hiding this comment.
This .sql fixture contains a quoted string literal rather than SQL text. If it’s intended to exercise the SQL grammar directly, remove the surrounding quotes; otherwise, consider relocating/documenting the fixture.
| "CREATE PROCEDURE [dbo].[TestProc] @ID INT AS BEGIN SELECT 1 END;" | |
| CREATE PROCEDURE [dbo].[TestProc] @ID INT AS BEGIN SELECT 1 END; |
| (create_procedure | ||
| (object_reference) @name) @definition.procedure | ||
|
|
||
| ; CREATE FUNCTION definitions |
There was a problem hiding this comment.
There are trailing spaces at the end of this comment line. Please trim them to avoid unnecessary diffs and keep formatting consistent.
| ; CREATE FUNCTION definitions | |
| ; CREATE FUNCTION definitions |
| $.create_extension, | ||
| $.create_trigger, | ||
| $.create_policy, | ||
| $.create_procedure, |
There was a problem hiding this comment.
_create_statement lists $.create_procedure twice, which is redundant and can mask issues when adjusting create parsing. Remove the duplicate entry so each create variant appears once.
| $.create_procedure, |
There was a problem hiding this comment.
that is correct, remove one entry of create_procedure
| seq( | ||
| choice( | ||
| $.statement, | ||
| $.block, |
There was a problem hiding this comment.
program no longer includes the transaction rule (it now only allows statement, block, or GO). Since the repository still has test/corpus/transaction.txt asserting a (transaction ...) node, this change will break existing corpus tests unless the transaction corpus (and any dependent queries) are updated accordingly or the transaction rule is reintroduced.
| $.block, | |
| $.block, | |
| $.transaction, |
| choice( | ||
| seq( | ||
| $.object_reference, | ||
| choice($.identifier, $.parameter), |
There was a problem hiding this comment.
set_statement previously used $.object_reference for the setting name, which matches the existing corpus expectations in test/corpus/set.txt. Switching to choice($.identifier, $.parameter) changes the parse tree shape (it will no longer produce an object_reference node) and will break those tests. If you need to support T-SQL variables here, consider using choice($.object_reference, $.parameter) so existing object_reference captures remain stable.
| choice($.identifier, $.parameter), | |
| choice($.object_reference, $.parameter), |
| keyword_raiserror: _ => make_keyword("raiserror"), | ||
| keyword_goto: _ => make_keyword("goto"), | ||
| keyword_exec: _ => make_keyword("exec"), | ||
| keyword_out: _ => make_keyword("out"), |
There was a problem hiding this comment.
keyword_out is defined twice in this file (once earlier and again here). Even though the definitions are identical, the duplicate key is easy to miss and can cause confusion during future edits; remove one of them to keep the keyword list canonical.
| keyword_out: _ => make_keyword("out"), |
| $.array, | ||
| $.interval, | ||
| $.between_expression, | ||
| $.parenthesized_expression, | ||
| $.object_id, | ||
| ) |
There was a problem hiding this comment.
The _expression choice list no longer includes an object_id expression, but the repo’s corpus still contains a "TSQL OBJECT_ID" test (test/corpus/expressions.txt) that expects an (object_id ...) node. Restoring object_id parsing (and keyword_object_id) or updating the corpus is needed to keep CI passing.
| keyword_security: _ => make_keyword("security"), | ||
| keyword_version: _ => make_keyword("version"), | ||
| keyword_extension: _ => make_keyword("extension"), | ||
| keyword_out: _ => make_keyword("out"), |
There was a problem hiding this comment.
This property is duplicated in a later property.
| keyword_out: _ => make_keyword("out"), |
matthias-Q
left a comment
There was a problem hiding this comment.
Sorry, but this PR mixes so many different changes. There are cross cutting changes that affect other parts, added unrelated files, formatting changes in places where no change is required.
Please go over the PR again and clean it up. Your copilot review also provided many helpful hints.
| @@ -0,0 +1 @@ | |||
| "CREATE PROCEDURE [dbo].[TestProc] @ID INT AS BEGIN SELECT 1 END;" | |||
There was a problem hiding this comment.
why are these SQL files part of the PR?
| @@ -0,0 +1 @@ | |||
| "SELECT * FROM TestProc" | |||
There was a problem hiding this comment.
why are these SQL files part of the PR?
| @@ -0,0 +1 @@ | |||
| "CREATE PROCEDURE TestProc AS SELECT 1;" | |||
There was a problem hiding this comment.
why are these SQL files part of the PR?
| @@ -0,0 +1 @@ | |||
| "CREATE PROCEDURE TestProc AS SELECT 1" | |||
There was a problem hiding this comment.
why are these SQL files part of the PR?
| $.create_extension, | ||
| $.create_trigger, | ||
| $.create_policy, | ||
| $.create_procedure, |
There was a problem hiding this comment.
that is correct, remove one entry of create_procedure
|
|
||
| create_procedure: $ => seq( | ||
| $.keyword_create, | ||
| optional($._or_replace), | ||
| choice($.keyword_procedure, $.keyword_proc), | ||
| $.object_reference, | ||
| optional( | ||
| repeat( | ||
| seq( | ||
| optional(','), | ||
| $.procedure_argument | ||
| ) | ||
| ) | ||
| ), | ||
| optional($.keyword_as), | ||
| choice( | ||
| $.statement, | ||
| $.block | ||
| ) | ||
| ), | ||
|
|
||
| procedure_argument: $ => seq( | ||
| field('name', $.parameter), | ||
| $._type, | ||
| optional(seq('=', choice($.literal, $.keyword_null))), | ||
| optional(choice($.keyword_out, $.keyword_output)) | ||
| ), |
There was a problem hiding this comment.
This is correct, create_procedure is defined twice
| repeat( | ||
| seq( | ||
| $.statement, | ||
| ';' |
There was a problem hiding this comment.
@DerekStride what do you think? should the ; be optional in general? It is a breaking change
There was a problem hiding this comment.
I think it's pretty load-bearing for other dialects, and optionality is deprecated even in SQL Server. An optional semicolon after the last statement of the seq might be nice.
| keyword_security: _ => make_keyword("security"), | ||
| keyword_version: _ => make_keyword("version"), | ||
| keyword_extension: _ => make_keyword("extension"), | ||
| keyword_out: _ => make_keyword("out"), |
| @@ -0,0 +1,19 @@ | |||
| ; Structure queries for T-SQL code navigation | |||
There was a problem hiding this comment.
Whats the purpose of having an extra file with "structure" queries?
I don't think this is salvageable as-is; it'll be all but impossible to keep discussion straight on dozens of material changes in one massive block. @JPMarichal please factor your changes out into individual, more focused pull requests. |
No description provided.