bendpy: register_csv() fails with 'Query from CSV file lacks column positions' #19443

@bohutang

Bug

register_csv() in the Python binding generates a CREATE VIEW with SELECT *, which is not supported for CSV/TSV files.

import databend
ctx = databend.SessionContext()
ctx.register_csv("foods", "/par/foods.csv")

Error:

RuntimeError: DataFrame collect error: SemanticError. Code: 1065, Text = [QUERY-CTX] Query from CSV file lacks column positions. Specify as $1, $2, etc.

Root Cause

register_table() in src/bendpy/src/context.rs generates:

CREATE VIEW foods AS SELECT * FROM 'fs:///par/foods.csv' (file_format => 'csv')

CSV/TSV files require explicit column positions ($1, $2, ...) instead of SELECT *, because they lack schema metadata (unlike Parquet).

Proposed Fix

For CSV/TSV formats, call infer_schema() first to determine column count and names, then generate SELECT $1 AS col1, $2 AS col2, ... instead of SELECT *.

register_tsv() has the same issue.
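
As a rough illustration of the proposed fix, the statement could be assembled along these lines. This is a sketch in Python rather than the Rust used in context.rs. The helper name build_view_sql is hypothetical, and the column names are passed in directly because the exact way infer_schema() would be called from the binding is not shown in this issue. It also assumes the names are valid unquoted identifiers.

```python
def build_view_sql(name, path, file_format, column_names):
    """Build a CREATE VIEW statement that selects CSV/TSV columns by position.

    Hypothetical helper: column_names stands in for whatever infer_schema()
    would report for the file.
    """
    if file_format not in ("csv", "tsv"):
        raise ValueError(f"positional select is only needed for csv/tsv, got {file_format!r}")
    if not column_names:
        raise ValueError("at least one column is required")

    # $1, $2, ... are 1-based column positions, each aliased to a column name.
    select_list = ", ".join(
        f"${position} AS {column}"
        for position, column in enumerate(column_names, start=1)
    )

    # Same statement shape as the one in Root Cause, with SELECT * replaced.
    return (
        f"CREATE VIEW {name} AS SELECT {select_list} "
        f"FROM 'fs://{path}' (file_format => '{file_format}')"
    )


if __name__ == "__main__":
    print(build_view_sql("foods", "/par/foods.csv", "csv", ["col1", "col2"]))
```

For a two-column file this yields SELECT $1 AS col1, $2 AS col2 in place of SELECT *. Passing "tsv" as the format covers register_tsv() in the same way.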
