Hi,
hope this is an okay place to mention this...
I forked off this repo to play a bit with ballista+delta myself.
I built a small web API in front of the scheduler using axum+axum-streams.
When yielding results as Arrow IPC streams I came across one issue:
abdolence/axum-streams-rs#80
When yielding results from delta tables there seem to be cases where the schema as reported by the dataframe and the schema of individual batches don't match. E.g. in:
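let df = ctx.sql("SELECT * FROM example").await.unwrap();
// Capture the schema before collecting: in recent DataFusion versions collect() consumes the DataFrame.
let df_schema = df.schema().clone();
let batches = df.collect().await.unwrap();
let batch_schema = batches.get(0).unwrap().schema();
println!("df.schema = {:?}", df_schema);
println!("batch_schema = {:?}", batch_schema);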
With DataFusion alone I cannot reproduce the error, only in combination with Ballista. I guess this difference might occur because Ballista writes partition results to Arrow and reads them back again?
I'd be glad for any insights!