Allow stored procedures to be defined without BEGIN/END
#1834
src/parser/mod.rs
Outdated
let begin_token: AttachedToken = self
    .expect_keyword(Keyword::BEGIN)
    .map(AttachedToken)
    .unwrap_or_else(|_| AttachedToken::empty());
let statements = self.parse_statement_list(&[Keyword::END])?;
let end_token = match &begin_token.0.token {
    Token::Word(w) if w.keyword == Keyword::BEGIN => {
        AttachedToken(self.expect_keyword(Keyword::END)?)
    }
    _ => AttachedToken::empty(),
};
Could also do it like this: https://github.com/apache/datafusion-sqlparser-rs/pull/1810/files#diff-6f6a082c3ddfc1f16cc4a455e3e2e2d2508f77b682ab18cabee69471bbb3edb3R244-R260, not sure which is best
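As a rough illustration of the optional BEGIN/END handling discussed here, the following is a minimal, self-contained sketch. It does not use the sqlparser crate: `Cursor`, `Body`, and `parse_body` are hypothetical names, and statements are modelled as plain strings split on whitespace and `;`. The idea it shows is to record whether BEGIN was consumed, read statements until END or end of input, and require END only when BEGIN was present.

```rust
/// Hypothetical stand-in for a token stream: a cursor over whitespace-split words.
struct Cursor<'a> {
    words: Vec<&'a str>,
    pos: usize,
}

impl<'a> Cursor<'a> {
    fn new(sql: &'a str) -> Self {
        Cursor { words: sql.split_whitespace().collect(), pos: 0 }
    }

    /// Consume `kw` (case-insensitive) if it is the next word; report whether it was consumed.
    fn parse_keyword(&mut self, kw: &str) -> bool {
        match self.words.get(self.pos) {
            Some(w) if w.eq_ignore_ascii_case(kw) => {
                self.pos += 1;
                true
            }
            _ => false,
        }
    }

    /// Like `parse_keyword`, but a missing keyword is an error.
    fn expect_keyword(&mut self, kw: &str) -> Result<(), String> {
        if self.parse_keyword(kw) {
            Ok(())
        } else {
            Err(format!("expected {kw}, found {:?}", self.words.get(self.pos)))
        }
    }
}

/// Hypothetical parsed procedure body (assumption: statements kept as raw text).
#[derive(Debug, PartialEq)]
struct Body {
    has_begin_end: bool,
    statements: Vec<String>,
}

fn parse_body(sql: &str) -> Result<Body, String> {
    let mut c = Cursor::new(sql);
    // BEGIN is optional: remember whether it was there.
    let has_begin_end = c.parse_keyword("BEGIN");

    let mut statements = Vec::new();
    let mut current: Vec<&str> = Vec::new();
    // Read words until END or end of input; a trailing ';' closes a statement.
    while let Some(&w) = c.words.get(c.pos) {
        if w.eq_ignore_ascii_case("END") {
            break;
        }
        c.pos += 1;
        let (text, ends) = match w.strip_suffix(';') {
            Some(t) => (t, true),
            None => (w, false),
        };
        if !text.is_empty() {
            current.push(text);
        }
        if ends && !current.is_empty() {
            statements.push(current.join(" "));
            current.clear();
        }
    }
    // End of input (or END) also concludes an unterminated last statement.
    if !current.is_empty() {
        statements.push(current.join(" "));
    }
    // END is required only when BEGIN was present.
    if has_begin_end {
        c.expect_keyword("END")?;
    }
    Ok(Body { has_begin_end, statements })
}
```

With this sketch, `parse_body("SELECT 1; SELECT 2")` succeeds without BEGIN/END, while `parse_body("BEGIN SELECT 1")` fails because the closing END is missing.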
src/ast/mod.rs
Outdated
-    body: Vec<Statement>,
+    body: BeginEndStatements,
Is it an abuse of BeginEndStatements to potentially have the begin/end tokens be empty? This could alternatively be an enum of BeginEndStatements and Sequence, similar to elsewhere.
@iffyio based on the recent discussion on the other PR, guessing you will prefer this also to be an enum, to avoid the empty begin/end tokens?
dd8382e to ffd3b6a
I have temporarily rebased this branch on #1810 to pick up the new helper function and similarly use the enum pattern to distinguish between Sequence & BeginEndStatements.
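To make the enum pattern concrete, here is a small sketch under stated assumptions. `ProcedureBody` and its variants are hypothetical names chosen for illustration (they are not taken from the crate), and statements are plain strings rather than AST nodes. The point is that each variant knows how to print itself, so no "empty" begin/end tokens are needed to represent a body without BEGIN/END.

```rust
use std::fmt;

/// Hypothetical body representation: either wrapped in BEGIN/END or a bare sequence.
#[derive(Debug, Clone, PartialEq)]
enum ProcedureBody {
    BeginEnd(Vec<String>),
    Sequence(Vec<String>),
}

impl fmt::Display for ProcedureBody {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // BEGIN and END are written only for this variant.
            ProcedureBody::BeginEnd(stmts) => {
                write!(f, "BEGIN")?;
                for s in stmts {
                    write!(f, " {s};")?;
                }
                write!(f, " END")
            }
            // A bare sequence prints its statements and nothing else.
            ProcedureBody::Sequence(stmts) => {
                for (i, s) in stmts.iter().enumerate() {
                    if i > 0 {
                        write!(f, " ")?;
                    }
                    write!(f, "{s};")?;
                }
                Ok(())
            }
        }
    }
}
```

Compared with a single struct whose begin/end tokens may be empty, the enum makes the two shapes explicit in the type, at the cost of one more `match` wherever the body is inspected.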
ffd3b6a to 038d3c2
@aharpervc could you take a look at this PR? It seems to have commits from other PRs in it, like the CREATE TRIGGER one, so it's hard to tell what's being introduced in this one.
- formerly, a semicolon after the last statement in a procedure was non-canonical (because they were added via `join`); a `BeginEndStatements` statements list will always write them out
- `BeginEndStatements` begin/end tokens won't be written when empty
- EOF now concludes parsing a statement list
…tokens - this further consolidates with existing patterns
038d3c2 to fc95b8f
Yes, there is similarity between the branches, which is why I had rebased this on the other. Now that it's merged I rebased on main so this branch only includes the distinct commits.
@@ -100,48 +100,52 @@ fn parse_mssql_delimited_identifiers() {

 #[test]
 fn parse_create_procedure() {
-    let sql = "CREATE OR ALTER PROCEDURE test (@foo INT, @bar VARCHAR(256)) AS BEGIN SELECT 1 END";
+    let sql = "CREATE OR ALTER PROCEDURE test (@foo INT, @bar VARCHAR(256)) AS BEGIN SELECT 1; END";
I think the semicolon is coming from format_statement_list here:
datafusion-sqlparser-rs/src/ast/mod.rs, lines 163 to 165 in ac1c339:
// We manually insert semicolon for the last statement,
// since display_separated doesn't handle that case.
write!(f, ";")
Not sure if we need to change any of that in this PR. It seems odd that semicolons weren't part of the canonical SQL previously (it was written out as body = display_separated(body, "; ") without the supplementary semicolon) 🤔
I think this behavior is OK. We try to preserve roundtrip behavior where possible, but I suspect that in this case the effort required to work around it would not be worth the return.
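For readers following the semicolon question, the difference between the two formatting strategies can be sketched in a few lines. This is an illustration only: `joined` and `terminated` are hypothetical helpers operating on plain strings, not the crate's `display_separated` or `format_statement_list`.

```rust
/// Separator style: ';' appears only between statements (like a plain join).
fn joined(statements: &[&str]) -> String {
    statements.join("; ")
}

/// Terminator style: every statement, including the last, is followed by ';'.
fn terminated(statements: &[&str]) -> String {
    let mut out = String::new();
    for (i, s) in statements.iter().enumerate() {
        if i > 0 {
            out.push(' ');
        }
        out.push_str(s);
        out.push(';');
    }
    out
}
```

With the separator style, a body written as `SELECT 1` round-trips unchanged; with the terminator style it is written back as `SELECT 1;`, which is why the test input in the diff above gained a trailing semicolon.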
LGTM! Thanks @aharpervc!
cc @alamb
For SQL Server, you can make a stored procedure without begin/end (docs ref). Otherwise, it parses the same way.
To differentiate with/without in the parser, the stored procedure struct's `statements` field was changed from `Vec<Statement>` (where begin/end are required & implicit) to a `BeginEndStatements`, where the begin/end tokens are explicit. They're empty when missing & written when present.
This PR also includes the fix to allow EOF to end a statement list from #1831 (so whichever merges first, I'll rebase accordingly).
The diff is perhaps larger than expected due to the question of canonical semicolons for procedure statement bodies. Formerly, a semicolon after the last statement in a procedure was non-canonical (because they were added via `join`, so perhaps it was not particularly intentional for it to have been that way); a `BeginEndStatements` statements list will always write them out.
An additional test case example without begin/end has been added as well.