Open
Labels
C-improvement (Category: improvement), good first issue (Category: good first issue)
Description
Bug
register_csv() in the Python binding generates a CREATE VIEW with SELECT *, which is not supported for CSV/TSV files.
import databend
ctx = databend.SessionContext()
ctx.register_csv("foods", "/par/foods.csv")
Error:
RuntimeError: DataFrame collect error: SemanticError. Code: 1065, Text = [QUERY-CTX] Query from CSV file lacks column positions. Specify as $1, $2, etc.
Root Cause
register_table() in src/bendpy/src/context.rs generates:
CREATE VIEW foods AS SELECT * FROM 'fs:///par/foods.csv' (file_format => 'csv')
CSV/TSV files require explicit column positions ($1, $2, ...) instead of SELECT *, because they lack schema metadata (unlike Parquet).
Proposed Fix
For CSV/TSV formats, call infer_schema() first to determine column count and names, then generate SELECT $1 AS col1, $2 AS col2, ... instead of SELECT *.
register_tsv() has the same issue.
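As a rough illustration of the statement the fix would need to emit, here is a minimal Python sketch of the string-building step only. The helper name build_csv_view_sql and the column names "name" and "calories" are hypothetical, not part of the binding or of the example file. In the real fix the names would come from infer_schema() and the logic would live in the Rust register_table() code.

```python
def build_csv_view_sql(view_name, path, column_names, file_format="csv"):
    """Build a CREATE VIEW statement that selects CSV/TSV columns by position.

    Hypothetical helper for illustration: column_names stands in for the
    result of infer_schema(). Identifiers are not quoted or escaped here.
    """
    if not column_names:
        raise ValueError("at least one column is required")
    # $1, $2, ... are 1-based column positions, each aliased to its name.
    select_list = ", ".join(
        f"${i} AS {name}" for i, name in enumerate(column_names, start=1)
    )
    return (
        f"CREATE VIEW {view_name} AS SELECT {select_list} "
        f"FROM 'fs://{path}' (file_format => '{file_format}')"
    )


# Column names below are made up for the example.
sql = build_csv_view_sql("foods", "/par/foods.csv", ["name", "calories"])
print(sql)
```

The same helper would cover register_tsv() by passing file_format="tsv".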