Open
Labels
binding/rust (Issues for the Rust crate)
Description
What happened?
I have been using the Python library extensively for a while, and I am now trying to use this library directly in Rust. However, I'm having a hard time getting it to work the way I expect.
Every time I write data in Rust, I end up with files of ~8192 rows each, whatever values I set in the properties below; they appear to be completely ignored. I'm probably missing something very obvious here, but I can't figure it out from the documentation.
use std::sync::Arc;
use arrow::array::Int32Array;
use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use deltalake_core::DeltaOps;
use anyhow::Result;
use url::Url;
#[tokio::main]
async fn main() -> Result<()> {
    // Generate 500,000 rows of data
    let num_rows = 500_000;
    let ids: Vec<i32> = (1..=num_rows).collect();
    // Create Arrow schema
    let id_field = Field::new("id", DataType::Int32, false);
    let schema = Arc::new(Schema::new(vec![id_field]));
    // Create Arrow array and RecordBatch
    let ids_array = Int32Array::from(ids);
    let batch = RecordBatch::try_new(schema, vec![Arc::new(ids_array)])?;
    println!("Created RecordBatch with {} rows", batch.num_rows());
    // Write to Delta Lake
    let table_path = "/path_to_delta_table/";
    std::fs::create_dir_all(table_path).ok();
    let ops = DeltaOps::try_from_uri(Url::parse(&format!("file://{}", table_path)).unwrap()).await?;
    let mut table = ops.write(vec![batch]);
    table = table.with_target_file_size(512_000_000);
    table = table.with_write_batch_size(512_000_000);
    table.await?;
    // Read back and check the files created
    let table_files = std::fs::read_dir(table_path)?;
    println!("Wrote {} files.", table_files.count());
    Ok(())
}

In Python this works perfectly:
deltalake.write_deltalake(
    data=df,
    table_or_uri="deltalable",
    target_file_size=int(512e6)
)

Expected behavior
I should have only one file
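A side note on verifying this: `std::fs::read_dir(table_path)` in the snippet above counts every directory entry, which on a Delta table also includes the `_delta_log` directory, so the printed number is not exactly the number of data files. The following is a minimal sketch, using only the standard library, that counts just the top-level `.parquet` files. The helper name `count_parquet_files` and the scratch directory are hypothetical, and the sketch assumes an unpartitioned table where all data files sit directly under the table path.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Count the Parquet data files directly under `table_path`.
/// Directories (such as `_delta_log`) and non-Parquet entries are skipped.
/// Assumption: the table is unpartitioned, so no recursion is needed.
fn count_parquet_files(table_path: &Path) -> io::Result<usize> {
    let mut count = 0;
    for entry in fs::read_dir(table_path)? {
        let path = entry?.path();
        let is_parquet = path.extension().and_then(|e| e.to_str()) == Some("parquet");
        if path.is_file() && is_parquet {
            count += 1;
        }
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    // Hypothetical scratch directory that mimics a table layout:
    // two empty placeholder data files plus a `_delta_log` directory.
    let dir = std::env::temp_dir().join("delta_count_demo");
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(dir.join("_delta_log"))?;
    fs::write(dir.join("part-00000.parquet"), b"")?;
    fs::write(dir.join("part-00001.parquet"), b"")?;

    println!("Found {} data files.", count_parquet_files(&dir)?);
    fs::remove_dir_all(&dir)?;
    Ok(())
}
```

Replacing the `read_dir(...).count()` check in the reproduction with a helper like this would make the expected result unambiguous: exactly one data file.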
Operating System
None
Binding
None
Bindings Version
deltalake-core = { version = "0.29.1"}
Steps to reproduce
No response
Relevant logs