-
Notifications
You must be signed in to change notification settings - Fork 639
feat(dtype): support compiling dtypes to sql #11100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like the implementation of this because it has a special case for each backend, for nearly identical code.
Also, as a steel main to your implementation, why shouldn't we also special case Schema
here too? I am against that for the same reason as this implemetation.
As an alternative, how about a method on DataType
, called to_sqlglot
, similar to how Schema.to_sqlglot
works?
Then we can have a very small amount of code in ibis.to_sql
that checks for Schema
or DataType
instances and then calls to_sqlglot
, rather than handling the compilation in a very special-casey way in each compiler.
Also, if there's a bug in the Oracle compilation unrelated to this PR, please submit a separate PR for that fix so it shows up in the release notes.
This sounds better, good idea, I will do that
Will do! |
16abc2c
to
69bb598
Compare
69bb598
to
fd78257
Compare
Well, it still is sorta ugly, but maybe less ugly than before. Once I added DataType.to_sqlglot() method, then this also begs for a DataType.from_sqlglot(sqlglot_type: sge.DataType) method, for symmetry with to_arrow/from_arrow, to_pandas/from_pandas, etc. The tricky thing is, I think the conversion sqlglot -> ibis depends on the sqlglot dialect. So it needs more context than the arrow/pandas/etc methods, and so would require an extra argument and a different signature. I think this is a smell that this is the wrong abstraction. I am tempted to think that the ibis<->sqlglot datatype conversion should only be happening on the backend-specific level of SqlGlotTypeMapper, and we shouldn't push it down to the backend-agnostic level of DataType. |
a06c48c
to
c780113
Compare
We already have a similar test that checks ibis/ibis/backends/tests/test_string.py Lines 78 to 117 in d55a5ee
This is a bit different though. I'm curious though, why that test, paramaterized with the |
790c400
to
6ca984d
Compare
This also makes me wonder if we want to expose a toplevel |
6ca984d
to
2e61454
Compare
Thinking about this more, I could simplify the Backend.to_sqlglot() logic by simply not supporting dt.DataType and sch.Schema. Then, the only place we would need the |
20d77bd
to
2d29680
Compare
Fixes #11073
EDIT: This Oracle fix was merged in #11124
Also includes a fix in the Oracle dtype compiler. The new tests cover this. If we change our tests, we should ensure that this code path is still tested. Also, not sure why we choose a default string length of 4000 for oracle, but 2 million for exasol and MAX for mssql. See the snapshots. Perhaps in a followup we should unify this?