feat: If a backend is installed, use that version as the SQL compile target #11000

Open
@NickCrews

Is your feature request related to a problem?

In duckdb >= 1.2.0, we can use [1,2,3] || NULL || [4,5,6] as our method for ArrayConcat. Currently, however, ibis tries to support older versions of duckdb that don't have this feature, so we fall back to checking for NULLs ourselves. This is not performant because it causes subexpressions to be repeated multiple times.

I personally am using duckdb 1.2, so I want to be able to use this new syntax. Is there a way we can tell ibis what duckdb version to target? I understand that we don't want to rely on the duckdb package actually being installed merely to generate the SQL. But if it IS installed, can we look at the version and use the more modern/optimized syntax in that case?
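To make the idea concrete, here is a rough sketch of a version-gated renderer. It is not ibis's actual compiler code: `render_array_concat` is a hypothetical helper, and the fallback SQL is only an illustration of why an explicit NULL check repeats every argument, not the exact SQL ibis emits today.

```python
from functools import reduce


def render_array_concat(args: list[str], target: tuple[int, ...]) -> str:
    """Render SQL text for concatenating array expressions (hypothetical helper)."""
    if target >= (1, 2, 0):
        # Newer syntax: each argument appears exactly once.
        return " || ".join(args)
    # Illustrative fallback: every argument appears once in the NULL check
    # and once more in the concatenation, so large subexpressions are duplicated.
    any_null = " OR ".join(f"({a}) IS NULL" for a in args)
    concat = reduce(lambda left, right: f"list_concat({left}, {right})", args)
    return f"CASE WHEN {any_null} THEN NULL ELSE {concat} END"


print(render_array_concat(["x", "y"], (1, 2, 0)))
print(render_array_concat(["x", "y"], (1, 1, 3)))
```

With a large expression in place of `x`, the second form roughly doubles the SQL for that argument, which is the repetition described above.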

related: #10999

What is the motivation behind your request?

The query I have in #10996 is slow and generates way more SQL than is needed. I think trimming it down will make it faster. Perhaps we should be skeptical of me just guessing that it will make it faster; if you want, I can actually write the benchmarking code.

Describe the solution you'd like

I'm not sure if we should one-off this for duckdb, or set up a more general framework/API for people to specify a target version for whatever backend they are working in.

Could do something like this:

import importlib
import importlib.metadata
import re


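def min_supported_backend_version(backend: str) -> str:
    dist = importlib.metadata.distribution("ibis-framework")
    # Match the ">=" bound of e.g. 'duckdb<2,>=0.10.3; extra == "duckdb"',
    # without also matching other packages such as "duckdb-engine".
    pattern = re.compile(rf"{re.escape(backend)}(?![\w.-])[^;]*>=\s*(?P<version>\d+(?:\.\d+)*)")
    # dist.requires is None when the distribution declares no requirements.
    matches = [pattern.match(r) for r in dist.requires or []]
    versions = [m.group("version") for m in matches if m]
    # Compare numerically so that e.g. 0.10 sorts after 0.9.
    return min(versions, key=lambda v: tuple(int(p) for p in v.split(".")))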


def target_backend_version(backend: str):
    try:
        return importlib.metadata.version(backend)
    except importlib.metadata.PackageNotFoundError:
        return min_supported_backend_version(backend)

This works for duckdb and datafusion, where the backend version is exactly the same as the PyPI package version. It doesn't work for e.g. postgres, where there is no single PyPI package called postgres whose version matches the database version. So we could instead just hardcode the minimum syntax versions we support in ibis/__init__.py.
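A minimal sketch of the hardcoded variant, assuming a lookup table plus a set of backends whose PyPI version can be trusted. `MIN_SYNTAX_VERSIONS` and `VERSIONED_BY_PACKAGE` are made-up names, and the version strings are placeholders, not the minimums ibis actually supports.

```python
import importlib.metadata

# Placeholder values for illustration only; not ibis's real minimum versions.
MIN_SYNTAX_VERSIONS = {"duckdb": "0.0.0", "datafusion": "0.0.0", "postgres": "0"}

# Backends whose PyPI package version equals the engine version.
VERSIONED_BY_PACKAGE = {"duckdb", "datafusion"}


def target_backend_version(backend: str) -> str:
    """Prefer the installed package version, else the hardcoded minimum."""
    if backend in VERSIONED_BY_PACKAGE:
        try:
            return importlib.metadata.version(backend)
        except importlib.metadata.PackageNotFoundError:
            pass  # Not installed: fall through to the hardcoded minimum.
    return MIN_SYNTAX_VERSIONS[backend]
```

For postgres this always returns the table entry, since no installed Python package tells us the server version.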

I don't know how people should specify the version they want. In ibis.options? That would be simplest, but not great since it is global. Ideally people could choose a different version for every ibis.to_sql() call, but that is probably overkill.

What version of ibis are you running?

main

What backend(s) are you using, if any?

duckdb
