Description
Is your feature request related to a problem?
In duckdb >= 1.2.0, we can use [1,2,3] || NULL || [4,5,6] as our method for ArrayConcat. Currently, however, ibis tries to support older versions of duckdb that don't have this feature, so we fall back to checking for NULLs ourselves. This is not performant because it ends up repeating subexpressions multiple times.
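To illustrate the repetition, here is a rough sketch that builds SQL strings by hand. The `CASE WHEN` shape and both helper names are hypothetical and only approximate the kind of NULL-checking fallback described above; they are not ibis's actual compiler output.

```python
from functools import reduce


def concat_with_null_checks(x: str, y: str) -> str:
    # Hypothetical fallback: guard each pairwise concat with explicit NULL checks.
    # Note that both operands appear twice in the generated text.
    return (
        f"CASE WHEN {x} IS NULL OR {y} IS NULL "
        f"THEN NULL ELSE list_concat({x}, {y}) END"
    )


def concat_with_operator(x: str, y: str) -> str:
    # Newer syntax: each operand appears exactly once.
    return f"{x} || {y}"


args = ["col_a", "col_b", "col_c", "col_d"]
fallback_sql = reduce(concat_with_null_checks, args)
operator_sql = reduce(concat_with_operator, args)

# The left operand is duplicated at every fold step, so the first
# argument's occurrence count doubles with each extra argument.
print(fallback_sql.count("col_a"), operator_sql.count("col_a"))
print(len(fallback_sql), len(operator_sql))
```

If any of the arguments is itself a large expression rather than a bare column, that whole expression is what gets duplicated.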
I personally am using duckdb 1.2, so I want to be able to use this new syntax. Is there a way we can tell ibis what duckdb version to target? I understand how we don't want to rely on the duckdb package actually being installed to merely generate the SQL. But if it IS installed, can we look at the version and use the more modern/optimized syntax in that case?
related: #10999
What is the motivation behind your request?
The query I have in #10996 is slow, generating way more SQL than is needed. I think trimming it down will make it faster. Perhaps we should be skeptical of me just guessing at that; if you want, I can actually write the benchmarking code.
Describe the solution you'd like
I'm not sure if we should one-off this for duckdb, or set up a more general framework/API for people to specify a target version for whatever backend they are working in.
Could do something like this:
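import importlib.metadata
import re


def min_supported_backend_version(backend: str) -> str:
    # Lowest version of `backend` allowed by ibis-framework's declared requirements.
    dist = importlib.metadata.distribution("ibis-framework")
    requirements = [r for r in dist.requires or [] if r.startswith(backend)]
    pattern = re.compile(r">=\s*(?P<version>\d+(?:\.\d+)*)")
    matches = [m for r in requirements if (m := pattern.search(r))]
    # Compare as integer tuples so that "1.10" sorts after "1.9".
    versions = [tuple(int(p) for p in m.group("version").split(".")) for m in matches]
    return ".".join(str(p) for p in min(versions))


def target_backend_version(backend: str) -> str:
    # Prefer the installed package's version; otherwise use the minimum ibis supports.
    try:
        return importlib.metadata.version(backend)
    except importlib.metadata.PackageNotFoundError:
        return min_supported_backend_version(backend)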
This works for duckdb and datafusion, where the backend version is exactly the same as the PyPI package version. It doesn't work for e.g. postgres, where there is no single PyPI package called postgres whose version matches the database version. So we could just hardcode the minimum syntax versions we support in ibis/__init__.py.
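A hardcoded table could look roughly like the sketch below. Everything here is hypothetical: the names `MIN_SYNTAX_VERSIONS`, `parse_version`, and `supports` do not exist in ibis, and the version numbers in the table are placeholders, not ibis's real minimums.

```python
import re
from typing import Optional

# Placeholder values for illustration only; NOT ibis's real minimum versions.
MIN_SYNTAX_VERSIONS: dict[str, tuple[int, ...]] = {
    "duckdb": (1, 0, 0),
    "postgres": (13, 0, 0),
}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn "1.2.0" into (1, 2, 0), ignoring any non-numeric suffix."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    # Pad to three components so that "1.2" compares equal to "1.2.0".
    return parts + (0,) * (3 - len(parts))


def supports(
    backend: str, feature_min: tuple[int, ...], target: Optional[str] = None
) -> bool:
    """Check whether the targeted version is at least `feature_min`.

    Without an explicit target, assume the hardcoded minimum for the backend.
    """
    if target is not None:
        version = parse_version(target)
    else:
        version = MIN_SYNTAX_VERSIONS[backend]
    return version >= feature_min


# A compiler rule could then branch on something like this:
use_concat_operator = supports("duckdb", (1, 2, 0), target="1.2.0")
```

Comparing integer tuples rather than strings avoids the case where "1.10" sorts before "1.9".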
I don't know how people would specify the version they want. In ibis.options? That would be simplest, but not great since it is global. Ideally people could choose a different version for every ibis.to_sql() call, but that is probably overkill.
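One way to get both would be a simple precedence rule: a per-call value wins, then the global option, then a default. The sketch below is only that. `Options`, `target_versions`, and `resolve_target_version` are made-up names, not existing ibis APIs, and the default would presumably come from the installed package or the minimum supported version.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Options:
    # Hypothetical global setting, e.g. {"duckdb": "1.2.0"}.
    target_versions: dict[str, str] = field(default_factory=dict)


options = Options()


def resolve_target_version(
    backend: str, per_call: Optional[str] = None, default: str = "0.0.0"
) -> str:
    """Pick the version to compile for: per-call, then global, then default."""
    if per_call is not None:
        return per_call
    return options.target_versions.get(backend, default)


# Global setting applies everywhere unless a call overrides it.
options.target_versions["duckdb"] = "1.1.0"
print(resolve_target_version("duckdb"))                    # uses the global option
print(resolve_target_version("duckdb", per_call="1.2.0"))  # per-call override
```

With this shape, the per-call argument could be added later without changing how the global option behaves.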
What version of ibis are you running?
main
What backend(s) are you using, if any?
duckdb
Code of Conduct
- I agree to follow this project's Code of Conduct