Commit e2a651a

Merge pull request #28 from PeerDB-io/merge-upstream
Merge 0.40 + apache/datafusion-sqlparser-rs#1040

Upstream implemented END parsing as COMMIT on PostgreSQL (apache/datafusion-sqlparser-rs#1035), allowing some convergence.
2 parents d7c5248 + 5ee679f commit e2a651a

27 files changed

+1236
-188
lines changed

.tool-versions

+1
Original file line number | Diff line number | Diff line change
@@ -0,0 +1 @@
1+
rust 1.73.0

CHANGELOG.md

+23
@@ -8,6 +8,29 @@ Given that the parser produces a typed AST, any changes to the AST will technica
88
## [Unreleased]
99
Check https://github.com/sqlparser-rs/sqlparser-rs/commits/main for undocumented changes.
1010

11+
12+
## [0.40.0] 2023-11-27
13+
14+
### Added
15+
* Add `{pre,post}_visit_query` to `Visitor` (#1044) - Thanks @jmhain
16+
* Support generated virtual columns with expression (#1051) - Thanks @takluyver
17+
* Support PostgreSQL `END` (#1035) - Thanks @tobyhede
18+
* Support `INSERT INTO ... DEFAULT VALUES ...` (#1036) - Thanks @CDThomas
19+
* Support `RELEASE` and `ROLLBACK TO SAVEPOINT` (#1045) - Thanks @CDThomas
20+
* Support `CONVERT` expressions (#1048) - Thanks @lovasoa
21+
* Support `GLOBAL` and `SESSION` parts in `SHOW VARIABLES` for mysql and generic - Thanks @emin100
22+
* Support snowflake `PIVOT` on derived table factors (#1027) - Thanks @lustefaniak
23+
* Support mssql json and xml extensions (#1043) - Thanks @lovasoa
24+
* Support for `MAX` as a character length (#1038) - Thanks @lovasoa
25+
* Support `IN ()` syntax of SQLite (#1028) - Thanks @alamb
26+
27+
### Fixed
28+
* Fix extra whitespace printed before `ON CONFLICT` (#1037) - Thanks @CDThomas
29+
30+
### Changed
31+
* Document round trip ability (#1052) - Thanks @alamb
32+
* Add PRQL to list of users (#1031) - Thanks @vanillajonathan
33+
1134
## [0.39.0] 2023-10-27
1235

1336
### Added

Cargo.toml

+2-2
@@ -1,7 +1,7 @@
11
[package]
22
name = "sqlparser"
33
description = "Extensible SQL Lexer and Parser with support for ANSI SQL:2011"
4-
version = "0.39.0"
4+
version = "0.40.0"
55
authors = ["Andy Grove <[email protected]>"]
66
homepage = "https://github.com/sqlparser-rs/sqlparser-rs"
77
documentation = "https://docs.rs/sqlparser/"
@@ -34,7 +34,7 @@ serde = { version = "1.0", features = ["derive"], optional = true }
3434
# of dev-dependencies because of
3535
# https://github.com/rust-lang/cargo/issues/1596
3636
serde_json = { version = "1.0", optional = true }
37-
sqlparser_derive = { version = "0.1.1", path = "derive", optional = true }
37+
sqlparser_derive = { version = "0.2.0", path = "derive", optional = true }
3838

3939
[dev-dependencies]
4040
simple_logger = "4.0"

README.md

+24-1
@@ -59,6 +59,28 @@ This crate avoids semantic analysis because it varies drastically
5959
between dialects and implementations. If you want to do semantic
6060
analysis, feel free to use this project as a base.
6161

62+
## Preserves Syntax Round Trip
63+
64+
This crate allows users to recover the original SQL text (with normalized
65+
whitespace and keyword capitalization), which is useful for tools that
66+
analyze and manipulate SQL.
67+
68+
This means that other than whitespace and the capitalization of keywords, the
69+
following should hold true for all SQL:
70+
71+
```rust
72+
// Parse SQL
73+
let ast = Parser::parse_sql(&GenericDialect, sql).unwrap();
74+
75+
// The original SQL text can be generated from the AST
76+
assert_eq!(ast[0].to_string(), sql);
77+
```
78+
79+
There are still some cases in this crate where different SQL with seemingly
80+
similar semantics are represented with the same AST. We welcome PRs to fix such
81+
issues and distinguish different syntaxes in the AST.
82+
83+
6284
## SQL compliance
6385

6486
SQL was first standardized in 1987, and revisions of the standard have been
@@ -93,7 +115,7 @@ $ cargo run --features json_example --example cli FILENAME.sql [--dialectname]
93115
## Users
94116

95117
This parser is currently being used by the [DataFusion] query engine,
96-
[LocustDB], [Ballista], [GlueSQL], [Opteryx], and [JumpWire].
118+
[LocustDB], [Ballista], [GlueSQL], [Opteryx], [PRQL], and [JumpWire].
97119

98120
If your project is using sqlparser-rs feel free to make a PR to add it
99121
to this list.
@@ -188,6 +210,7 @@ licensed as above, without any additional terms or conditions.
188210
[Ballista]: https://github.com/apache/arrow-ballista
189211
[GlueSQL]: https://github.com/gluesql/gluesql
190212
[Opteryx]: https://github.com/mabel-dev/opteryx
213+
[PRQL]: https://github.com/PRQL/prql
191214
[JumpWire]: https://github.com/extragoodlabs/jumpwire
192215
[Pratt Parser]: https://tdop.github.io/
193216
[sql-2016-grammar]: https://jakewheat.github.io/sql-overview/sql-2016-foundation-grammar.html

derive/Cargo.toml

+2-2
@@ -1,7 +1,7 @@
11
[package]
22
name = "sqlparser_derive"
33
description = "proc macro for sqlparser"
4-
version = "0.1.1"
4+
version = "0.2.1"
55
authors = ["sqlparser-rs authors"]
66
homepage = "https://github.com/sqlparser-rs/sqlparser-rs"
77
documentation = "https://docs.rs/sqlparser_derive/"
@@ -18,6 +18,6 @@ edition = "2021"
1818
proc-macro = true
1919

2020
[dependencies]
21-
syn = "1.0"
21+
syn = { version = "2.0", default-features = false, features = ["printing", "parsing", "derive", "proc-macro"] }
2222
proc-macro2 = "1.0"
2323
quote = "1.0"

derive/README.md

+81-12
@@ -48,33 +48,102 @@ impl Visit for Bar {
4848
}
4949
```
5050

51-
Additionally certain types may wish to call a corresponding method on visitor before recursing
51+
Some types may wish to call a corresponding method on the visitor:
5252

5353
```rust
5454
#[derive(Visit, VisitMut)]
5555
#[visit(with = "visit_expr")]
5656
enum Expr {
57-
A(),
58-
B(String, #[cfg_attr(feature = "visitor", visit(with = "visit_relation"))] ObjectName, bool),
57+
IsNull(Box<Expr>),
58+
..
5959
}
6060
```
6161

62-
Will generate
62+
This will result in the following sequence of visitor calls when an `IsNull`
63+
expression is visited:
64+
65+
```
66+
visitor.pre_visit_expr(<is null expr>)
67+
visitor.pre_visit_expr(<is null operand>)
68+
visitor.post_visit_expr(<is null operand>)
69+
visitor.post_visit_expr(<is null expr>)
70+
```
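
The call order above can be reproduced with a small hand-written sketch. Everything here is a simplified stand-in rather than the crate's real API: the two-variant `Expr`, the cut-down `Visitor` trait, the `Recorder` type, and the `label` and `record_calls` helpers are all assumptions made for illustration, and the `visit` method is written by hand instead of being derived.

```rust
use std::ops::ControlFlow;

// Hypothetical, simplified stand-in for the crate's `Expr` type.
enum Expr {
    Identifier(String),
    IsNull(Box<Expr>),
}

// Hypothetical visitor trait exposing only the two expression hooks.
trait Visitor {
    type Break;

    fn pre_visit_expr(&mut self, _expr: &Expr) -> ControlFlow<Self::Break> {
        ControlFlow::Continue(())
    }

    fn post_visit_expr(&mut self, _expr: &Expr) -> ControlFlow<Self::Break> {
        ControlFlow::Continue(())
    }
}

impl Expr {
    // Hand-written equivalent of annotating the type with
    // `#[visit(with = "visit_expr")]`: pre hook, children, post hook.
    fn visit<V: Visitor>(&self, visitor: &mut V) -> ControlFlow<V::Break> {
        visitor.pre_visit_expr(self)?;
        match self {
            Expr::Identifier(_) => {}
            Expr::IsNull(operand) => operand.visit(visitor)?,
        }
        visitor.post_visit_expr(self)?;
        ControlFlow::Continue(())
    }
}

// Short human-readable name for an expression, used only for logging.
fn label(expr: &Expr) -> String {
    match expr {
        Expr::Identifier(name) => format!("identifier {name}"),
        Expr::IsNull(_) => "is null".to_string(),
    }
}

// Visitor that records every hook invocation in order.
struct Recorder {
    calls: Vec<String>,
}

impl Visitor for Recorder {
    type Break = ();

    fn pre_visit_expr(&mut self, expr: &Expr) -> ControlFlow<()> {
        self.calls.push(format!("pre_visit_expr({})", label(expr)));
        ControlFlow::Continue(())
    }

    fn post_visit_expr(&mut self, expr: &Expr) -> ControlFlow<()> {
        self.calls.push(format!("post_visit_expr({})", label(expr)));
        ControlFlow::Continue(())
    }
}

// Visits `a IS NULL` and returns the recorded hook calls.
fn record_calls() -> Vec<String> {
    let expr = Expr::IsNull(Box::new(Expr::Identifier("a".to_string())));
    let mut recorder = Recorder { calls: Vec::new() };
    let _ = expr.visit(&mut recorder);
    recorder.calls
}
```

Calling `record_calls()` from a `main` function should yield four entries: the outer expression is entered first and left last, with the operand's pre and post calls nested between them.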
71+
72+
For some types it is only appropriate to call a particular visitor method in
73+
some contexts. For example, not every `ObjectName` refers to a relation.
74+
75+
In these cases, the `visit` attribute can be used on the field for which we'd
76+
like to call the method:
6377

6478
```rust
65-
impl Visit for Bar {
79+
#[derive(Visit, VisitMut)]
80+
#[visit(with = "visit_table_factor")]
81+
pub enum TableFactor {
82+
Table {
83+
#[visit(with = "visit_relation")]
84+
name: ObjectName,
85+
alias: Option<TableAlias>,
86+
},
87+
..
88+
}
89+
```
90+
91+
This will generate:
92+
93+
```rust
94+
impl Visit for TableFactor {
6695
fn visit<V: Visitor>(&self, visitor: &mut V) -> ControlFlow<V::Break> {
67-
visitor.visit_expr(self)?;
96+
visitor.pre_visit_table_factor(self)?;
6897
match self {
69-
Self::A() => {}
70-
Self::B(_1, _2, _3) => {
71-
_1.visit(visitor)?;
72-
visitor.visit_relation(_3)?;
73-
_2.visit(visitor)?;
74-
_3.visit(visitor)?;
98+
Self::Table { name, alias } => {
99+
visitor.pre_visit_relation(name)?;
100+
name.visit(visitor)?;
101+
visitor.post_visit_relation(name)?;
102+
alias.visit(visitor)?;
75103
}
76104
}
105+
visitor.post_visit_table_factor(self)?;
77106
ControlFlow::Continue(())
78107
}
79108
}
80109
```
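
To illustrate why a field-level annotation is useful, here is a minimal sketch of a visitor that collects relation names. The types are simplified stand-ins, not the crate's definitions: `ObjectName` is reduced to a list of strings, `alias` to a plain `Option<String>`, and `RelationCollector` and `collect_relations` are hypothetical names. The `visit` method is hand-written to mirror the shape of the generated code.

```rust
use std::ops::ControlFlow;

// Hypothetical, simplified stand-ins for the crate's AST types.
struct ObjectName(Vec<String>);

enum TableFactor {
    Table {
        name: ObjectName,
        alias: Option<String>,
    },
}

// Hypothetical visitor trait with relation and table-factor hooks.
trait Visitor {
    type Break;

    fn pre_visit_relation(&mut self, _relation: &ObjectName) -> ControlFlow<Self::Break> {
        ControlFlow::Continue(())
    }

    fn post_visit_relation(&mut self, _relation: &ObjectName) -> ControlFlow<Self::Break> {
        ControlFlow::Continue(())
    }

    fn pre_visit_table_factor(&mut self, _factor: &TableFactor) -> ControlFlow<Self::Break> {
        ControlFlow::Continue(())
    }

    fn post_visit_table_factor(&mut self, _factor: &TableFactor) -> ControlFlow<Self::Break> {
        ControlFlow::Continue(())
    }
}

impl TableFactor {
    // Type-level hooks wrap the whole value; the relation hooks wrap only
    // the `name` field, because only that `ObjectName` refers to a relation.
    fn visit<V: Visitor>(&self, visitor: &mut V) -> ControlFlow<V::Break> {
        visitor.pre_visit_table_factor(self)?;
        match self {
            TableFactor::Table { name, alias: _ } => {
                visitor.pre_visit_relation(name)?;
                // In this sketch `ObjectName` has no visitable children.
                visitor.post_visit_relation(name)?;
                // `alias` is a plain string here, so there is nothing to recurse into.
            }
        }
        visitor.post_visit_table_factor(self)?;
        ControlFlow::Continue(())
    }
}

// Visitor that gathers the dotted name of every relation it sees.
struct RelationCollector {
    relations: Vec<String>,
}

impl Visitor for RelationCollector {
    type Break = ();

    fn pre_visit_relation(&mut self, relation: &ObjectName) -> ControlFlow<()> {
        self.relations.push(relation.0.join("."));
        ControlFlow::Continue(())
    }
}

// Runs the collector over one table factor and returns the names found.
fn collect_relations(factor: &TableFactor) -> Vec<String> {
    let mut collector = RelationCollector { relations: Vec::new() };
    let _ = factor.visit(&mut collector);
    collector.relations
}
```

Because the relation hook is tied to the field rather than to the `ObjectName` type, an `ObjectName` used elsewhere (for example as a function name) would not be reported by this collector.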
110+
111+
Note that annotating both the type and the field is incorrect as it will result
112+
in redundant calls to the method. For example:
113+
114+
```rust
115+
#[derive(Visit, VisitMut)]
116+
#[visit(with = "visit_expr")]
117+
enum Expr {
118+
IsNull(#[visit(with = "visit_expr")] Box<Expr>),
119+
..
120+
}
121+
```
122+
123+
will result in these calls to the visitor:
124+
125+
126+
```
127+
visitor.pre_visit_expr(<is null expr>)
128+
visitor.pre_visit_expr(<is null operand>)
129+
visitor.pre_visit_expr(<is null operand>)
130+
visitor.post_visit_expr(<is null operand>)
131+
visitor.post_visit_expr(<is null operand>)
132+
visitor.post_visit_expr(<is null expr>)
133+
```
134+
135+
## Releasing
136+
137+
This crate's release is not automated. Instead, it is released manually as needed.
138+
139+
Steps:
140+
1. Update the version in `Cargo.toml`
141+
2. Update the corresponding version in `../Cargo.toml`
142+
3. Commit via PR
143+
4. Publish to crates.io:
144+
145+
```shell
146+
# update to latest checked in main branch and publish via
147+
cargo publish
148+
```
149+

derive/src/lib.rs

+29-23
@@ -2,8 +2,9 @@ use proc_macro2::TokenStream;
22
use quote::{format_ident, quote, quote_spanned, ToTokens};
33
use syn::spanned::Spanned;
44
use syn::{
5-
parse_macro_input, parse_quote, Attribute, Data, DeriveInput, Fields, GenericParam, Generics,
6-
Ident, Index, Lit, Meta, MetaNameValue, NestedMeta,
5+
parse::{Parse, ParseStream},
6+
parse_macro_input, parse_quote, Attribute, Data, DeriveInput,
7+
Fields, GenericParam, Generics, Ident, Index, LitStr, Meta, Token
78
};
89

910
/// Implementation of `#[derive(Visit)]`
@@ -84,38 +85,43 @@ struct Attributes {
8485
with: Option<Ident>,
8586
}
8687

88+
struct WithIdent {
89+
with: Option<Ident>,
90+
}
91+
impl Parse for WithIdent {
92+
fn parse(input: ParseStream) -> Result<Self, syn::Error> {
93+
let mut result = WithIdent { with: None };
94+
let ident = input.parse::<Ident>()?;
95+
if ident != "with" {
96+
return Err(syn::Error::new(ident.span(), "Expected identifier to be `with`"));
97+
}
98+
input.parse::<Token!(=)>()?;
99+
let s = input.parse::<LitStr>()?;
100+
result.with = Some(format_ident!("{}", s.value(), span = s.span()));
101+
Ok(result)
102+
}
103+
}
104+
87105
impl Attributes {
88106
fn parse(attrs: &[Attribute]) -> Self {
89107
let mut out = Self::default();
90-
for attr in attrs.iter().filter(|a| a.path.is_ident("visit")) {
91-
let meta = attr.parse_meta().expect("visit attribute");
92-
match meta {
93-
Meta::List(l) => {
94-
for nested in &l.nested {
95-
match nested {
96-
NestedMeta::Meta(Meta::NameValue(v)) => out.parse_name_value(v),
97-
_ => panic!("Expected #[visit(key = \"value\")]"),
108+
for attr in attrs {
109+
if let Meta::List(ref metalist) = attr.meta {
110+
if metalist.path.is_ident("visit") {
111+
match syn::parse2::<WithIdent>(metalist.tokens.clone()) {
112+
Ok(with_ident) => {
113+
out.with = with_ident.with;
114+
}
115+
Err(e) => {
116+
panic!("{}", e);
98117
}
99118
}
100119
}
101-
_ => panic!("Expected #[visit(...)]"),
102120
}
103121
}
104122
out
105123
}
106124

107-
/// Updates self with a name value attribute
108-
fn parse_name_value(&mut self, v: &MetaNameValue) {
109-
if v.path.is_ident("with") {
110-
match &v.lit {
111-
Lit::Str(s) => self.with = Some(format_ident!("{}", s.value(), span = s.span())),
112-
_ => panic!("Expected a string value, got {}", v.lit.to_token_stream()),
113-
}
114-
return;
115-
}
116-
panic!("Unrecognised kv attribute {}", v.path.to_token_stream())
117-
}
118-
119125
/// Returns the pre and post visit token streams
120126
fn visit(&self, s: TokenStream) -> (Option<TokenStream>, Option<TokenStream>) {
121127
let pre_visit = self.with.as_ref().map(|m| {

src/ast/data_type.rs

+20-9
@@ -374,14 +374,14 @@ impl fmt::Display for DataType {
374374
}
375375
write!(f, ")")
376376
}
377-
DataType::SnowflakeTimestamp => write!(f, "TIMESTAMP_NTZ"),
378377
DataType::Struct(fields) => {
379378
if !fields.is_empty() {
380379
write!(f, "STRUCT<{}>", display_comma_separated(fields))
381380
} else {
382381
write!(f, "STRUCT")
383382
}
384383
}
384+
DataType::SnowflakeTimestamp => write!(f, "TIMESTAMP_NTZ"),
385385
}
386386
}
387387
}
@@ -521,18 +521,29 @@ impl fmt::Display for ExactNumberInfo {
521521
#[derive(Debug, Copy, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
522522
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
523523
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))]
524-
pub struct CharacterLength {
525-
/// Default (if VARYING) or maximum (if not VARYING) length
526-
pub length: u64,
527-
/// Optional unit. If not informed, the ANSI handles it as CHARACTERS implicitly
528-
pub unit: Option<CharLengthUnits>,
524+
pub enum CharacterLength {
525+
IntegerLength {
526+
/// Default (if VARYING) or maximum (if not VARYING) length
527+
length: u64,
528+
/// Optional unit. If not informed, the ANSI handles it as CHARACTERS implicitly
529+
unit: Option<CharLengthUnits>,
530+
},
531+
/// VARCHAR(MAX) or NVARCHAR(MAX), used in T-SQL (Microsoft SQL Server)
532+
Max,
529533
}
530534

531535
impl fmt::Display for CharacterLength {
532536
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
533-
write!(f, "{}", self.length)?;
534-
if let Some(unit) = &self.unit {
535-
write!(f, " {unit}")?;
537+
match self {
538+
CharacterLength::IntegerLength { length, unit } => {
539+
write!(f, "{}", length)?;
540+
if let Some(unit) = unit {
541+
write!(f, " {unit}")?;
542+
}
543+
}
544+
CharacterLength::Max => {
545+
write!(f, "MAX")?;
546+
}
536547
}
537548
Ok(())
538549
}
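
The effect of turning `CharacterLength` into an enum can be seen in a small standalone sketch. This is a minimal sketch under stated assumptions: `CharLengthUnits` is a simplified stand-in whose variant names and rendered text are assumed for this example, and `render_varchar` is a hypothetical helper, not a function from the crate.

```rust
use std::fmt;

// Simplified stand-in for the crate's `CharLengthUnits`; the variant names
// and their rendered text are assumptions made for this sketch.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
enum CharLengthUnits {
    Characters,
    Octets,
}

impl fmt::Display for CharLengthUnits {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CharLengthUnits::Characters => write!(f, "CHARACTERS"),
            CharLengthUnits::Octets => write!(f, "OCTETS"),
        }
    }
}

// Same shape as the enum introduced in the hunk above.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
enum CharacterLength {
    IntegerLength {
        length: u64,
        unit: Option<CharLengthUnits>,
    },
    Max,
}

impl fmt::Display for CharacterLength {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CharacterLength::IntegerLength { length, unit } => {
                write!(f, "{length}")?;
                if let Some(unit) = unit {
                    write!(f, " {unit}")?;
                }
                Ok(())
            }
            CharacterLength::Max => write!(f, "MAX"),
        }
    }
}

// Hypothetical helper: renders a VARCHAR type with an optional length.
fn render_varchar(length: Option<CharacterLength>) -> String {
    match length {
        Some(length) => format!("VARCHAR({length})"),
        None => "VARCHAR".to_string(),
    }
}
```

Since the length is no longer a struct, code that previously read `.length` or `.unit` directly would need to match on the `IntegerLength` variant and decide how to handle `Max`.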
