
bug: Errors with NULL columns #9669

Open
@Riezebos

Description

What happened?

When a column contains only nulls, I can't create a DuckDB table or a PyArrow table from it.

Here are some reproducible examples:

import ibis

con = ibis.duckdb.connect()

data = [{"col1": 1, "col2": None}, {"col1": 4, "col2": None}]

t = ibis.memtable(data)
con.create_table("test", t)

Result: ParserException: Parser Error: syntax error at or near "NULL"

t.execute() does work in the above example.

import ibis

con = ibis.duckdb.connect()

data = [{"col1": 1, "col2": None}, {"col1": 4, "col2": None}]

ibis.memtable(data).to_pyarrow()

Result: ArrowNotImplementedError: Unsupported cast from int32 to null using function cast_null

import ibis
import pyarrow as pa

con = ibis.duckdb.connect()

data = [{"col1": 1, "col2": None}, {"col1": 4, "col2": None}]

array = pa.array(data)
pa_table = pa.Table.from_struct_array(array)

con.create_table("test", pa_table)

Result: ParserException: Parser Error: syntax error at or near "NULL"

My guess is that the problem comes from PyArrow supporting a dedicated null datatype for columns, which most databases probably don't support.

DuckDB apparently converts the null datatype into int32:

import duckdb

duckdb.from_arrow(pa_table)
┌───────┬───────┐
│ col1  │ col2  │
│ int64 │ int32 │
├───────┼───────┤
│     1 │  NULL │
│     4 │  NULL │
└───────┴───────┘

I don't know what the best way to handle this is. Perhaps ibis could raise an exception asking the user to specify a schema when a NULL column exists?
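To make the suggestion concrete, here is a rough sketch of such a check in plain Python, run on the input rows before they reach ibis. The helper names `find_all_null_columns` and `require_typed_columns` are hypothetical and are not part of ibis; this only illustrates the kind of error message that would have pointed at the cause directly.

```python
def find_all_null_columns(rows):
    """Return the column names whose value is None (or missing) in every row."""
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    return [
        name for name in columns if all(row.get(name) is None for row in rows)
    ]


def require_typed_columns(rows):
    """Raise ValueError if a column has no non-null value to infer a type from."""
    null_columns = find_all_null_columns(rows)
    if null_columns:
        raise ValueError(
            f"Columns {null_columns} contain only nulls; please specify a schema."
        )


data = [{"col1": 1, "col2": None}, {"col1": 4, "col2": None}]

try:
    require_typed_columns(data)
except ValueError as exc:
    message = str(exc)
    print(message)
```

A check like this fails early with the offending column names instead of a parser error about "NULL".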

For context: a job I run daily suddenly started failing with the first error. The error message did not point at the cause, so it took some experimenting to figure out that this issue was behind it. A column in the source data (from some API) that usually contains strings and nulls now contained only nulls.

What version of ibis are you using?

9.2.0

What backend(s) are you using, if any?

DuckDB

Relevant log output

No response


Labels: bug (Incorrect behavior inside of ibis)
