-
Notifications
You must be signed in to change notification settings - Fork 605
Add support for parsing with semicolons optional #1843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add support for parsing with semicolons optional #1843
Conversation
- formerly, a semicolon after the last statement in a procedure was non-canonical (because they were added via `join`); a `BeginEndStatements` statements list will always write them out - `BeginEndStatements` begin/end tokens won't be written when empty - EOF now concludes parsing a statement list
…tokens - this further consolidates with existing patterns
- plus, a corresponding `supports_statements_without_semicolon_delimiter` Dialect trait function - this is optional for SQL Server, so it's set to true - for the implementation, `RETURN` parsing needs to be tightened up to avoid ambiguity & tests that formerly asserted "end of statement" now maybe need to assert "an SQL statement" - a new `assert_err_parse_statements` splits the dialects based on semicolon requirements & asserts the expected error message accordingly
Keeping this in draft at the moment to work out the "compile-no-std" ci job, and possibly some more test cases. However, anyone feel free to post a review/thoughts |
- at least all of select/insert/update/delete (plus exec) can be added
@@ -666,6 +666,23 @@ fn parse_select_with_table_alias() { | |||
); | |||
} | |||
|
|||
#[test] | |||
fn parse_consecutive_queries() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what the best approach here is, or if there's even a better solution apart from disallowing any unquoted keyword as an alias
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out doing that seems reasonable and reduces ambiguity. Dialects that require semicolon delimiters remain unaffected, but dialects where it's optional would need to quote the keyword as an identifier. That seems completely reasonable from a parser perspective.
I'm willing to drop the commit that added keywords to RESERVED_FOR_TABLE_ALIAS since this approach addresses that concern.
6eeebfe
to
2585c5b
Compare
Note: this PR is temporarily rebased on #1834 before that is merged
This PR introduces support for parsing SQL without requiring semicolon statement delimiters. The implementation is as follows:
require_semicolon_statement_delimiter
that defaults to a new Dialect functionsupports_statements_without_semicolon_delimiter
To make this work, the following supplementary changes were required:
RETURN
statement parsing was tightened up. I think I suggested in the prior PR for that function that it might need attention in the future (turns out the future is now). The reason this needed work is that without a semicolon to terminate the statement, it's hard to know if the thing following RETURN is part of RETURN or it's own thing. That ambiguity is addressed by having parse_return only allow a strict list of "returnable expr's" instead of any expr. This could have been it's own stand-alone PR but seems reasonable to review together instead of having a deeper stack of branches.Expected: end of statement, found: asdf
. When semicolons are optional, the parse error becomesExpected: an SQL statement, found: asdf
. Since this is otherwise an acceptable error for that kind of test, a new helper was introducedassert_err_parse_statements
that splits the dialects based on semicolon optional/required and asserts the appropriate error. The helper also has a pleasant side effect of reducing the volume of extremely similar error assertions.statements_without_semicolons_parse_to
helper was added. It's the same asstatements_parse_to
but it will strip semicolons from the SQL text. This makes it easier to add supplementary assertions for both SQL variants.test_supports_statements_without_semicolon_delimiter
test case to the SQL Server tests, to assert the proper AST for several examples not using semicolons. This is a belt and suspenders test to validate that parsing still produces the expected tree even without semicolons.I understand this is a big change, and possibly controversial. However like
GO
batch delimiters in #1809, this enhancement takes a BIG step towards having this library be able to successfully parse real-world SQL files.Fixes #1800