docs/ai/limitations.md (1 addition & 1 deletion)
@@ -39,7 +39,7 @@ This file is intentionally direct. Use it to prevent incorrect SDK integrations
 - The public APIs are asynchronous, but backend execution does not imply unlimited parallelism.
 - Flight clients own a gRPC channel per `DeltaTableServiceClient` instance.
-- V3 native calls cross a synchronous FFI boundary and use a native engine handle.
+- V3 native uses a native engine handle; main operations use callback-notified async operation handles, while batch pulls and imported write sources still follow Arrow C Stream ownership rules.
 - Avoid issuing multiple concurrent native operations through the same client unless the caller has validated backend behavior for that scenario.
 - Partitioned reads are the preferred pattern for independent parallel reads on V3.
docs/architecture/execution-model.md (3 additions & 0 deletions)
@@ -66,11 +66,14 @@ Properties:
 - Uses Arrow C Data/C Stream interfaces for schemas and batches.
 - Uses direct Arrow C Stream pulls by default, with bounded Rust-side prefetch available through `DeltaTableServiceClientOptions.EnableNativeReadPrefetch`.
 - Owns a native engine handle through a `SafeHandle` wrapper.
+- Uses a native async operation `SafeHandle` for V3 schema reads, partition planning, JSON-result table operations, read stream setup, and insert/merge setup.
 - Shares one process-wide Tokio runtime across native engine handles.
 - Avoids the Flight service boundary but requires native runtime assets.

 Each `NativeRustBackend` still owns its native engine handle and per-engine error state. The shared runtime reduces thread and stack reservation overhead when multiple V3 clients are created in the same process. Native merge work is scheduled onto Tokio worker threads so deep delta-rs/DataFusion merge execution does not run on the .NET caller stack.

+Schema reads, `GetReadPartitionsAsync`, table creation, protocol upgrade, SQL DML operations, table/query/CDF/partition stream setup, and insert/merge setup start callback-notified native operations on the shared Tokio runtime. The managed backend awaits a `TaskCompletionSource`, takes the schema, JSON result, or Arrow stream once after native completion is signaled, and cancels the native operation if the managed cancellation token is signaled. The public API and result models are unchanged.
+
 When read-stream prefetch is enabled, production is bounded in two ways: each exported stream has a small native queue, and active backend batch production is capped process-wide. These limits provide backpressure for high-concurrency readers without changing the public `IAsyncEnumerable<RecordBatch>` or `IArrowArrayStream` shapes. The default read path remains direct batch pulling because local benchmarks showed prefetch overhead can dominate small/local reads.

-| schema | Arrow C Data schema | Managed code imports schema and frees temporary native structures. |
+| schema | Arrow C Data schema | Managed code imports schema and frees temporary native structures; async schema reads take the schema result exactly once. |
 | read batches | Arrow C Stream | Imported managed stream owns the release callback; Rust can use bounded prefetch behind the stream when enabled. |
-| write batches | Arrow C Stream | Managed stream is exported to Rust for operation duration. |
+| write batches | Arrow C Stream | Managed stream is exported to Rust for operation duration; async insert and merge keep the exported stream and native storage alive until completion notification. |
+| one-shot async operation | native operation pointer | Managed code awaits a `TaskCompletionSource`, takes the owned result string or Arrow stream once after native completion notification, and destroys the operation handle. |
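Reviewer note: the "one-shot async operation" row above describes a start, completion-callback, take-once lifecycle. The following is a minimal Rust sketch of that pattern only, not the SDK's implementation. `OneShotOperation`, `start`, `take_result`, and `run_demo` are hypothetical names, a plain thread stands in for the shared Tokio runtime, and an `mpsc` channel stands in for the managed `TaskCompletionSource`.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical one-shot operation: owns a result slot that can be taken once.
struct OneShotOperation {
    result: Mutex<Option<Result<String, String>>>,
}

impl OneShotOperation {
    /// Runs `work` on a background thread (a stand-in for a runtime task),
    /// stores the outcome, then invokes `on_complete` to notify the caller.
    fn start<W, C>(work: W, on_complete: C) -> Arc<Self>
    where
        W: FnOnce() -> Result<String, String> + Send + 'static,
        C: FnOnce() + Send + 'static,
    {
        let op = Arc::new(OneShotOperation {
            result: Mutex::new(None),
        });
        let worker_op = Arc::clone(&op);
        thread::spawn(move || {
            let outcome = work();
            // The lock guard is dropped at the end of this statement,
            // so the callback below runs without holding the lock.
            *worker_op.result.lock().unwrap() = Some(outcome);
            on_complete();
        });
        op
    }

    /// Moves the stored result out; every later call returns `None`.
    fn take_result(&self) -> Option<Result<String, String>> {
        self.result.lock().unwrap().take()
    }
}

/// Starts an operation, waits for the completion callback, then takes twice.
fn run_demo() -> (Option<Result<String, String>>, Option<Result<String, String>>) {
    let (done_tx, done_rx) = mpsc::channel::<()>();
    let op = OneShotOperation::start(
        || Ok(String::from("{\"status\":\"ok\"}")),
        move || {
            let _ = done_tx.send(());
        },
    );
    // Wait for the completion notification before touching the result.
    done_rx.recv().expect("completion callback should fire");
    let first = op.take_result();
    let second = op.take_result();
    (first, second)
}

fn main() {
    let (first, second) = run_demo();
    println!("first take: {:?}", first);
    println!("second take: {:?}", second);
}
```

The second take returns `None`, which is the property the table row relies on: ownership of the result transfers exactly once, so the handle can be destroyed afterwards without a double free.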

-The public API is asynchronous, but V3 crosses a synchronous FFI boundary for native calls. Do not assume unlimited parallelism through a single client instance. For parallel reads, prefer V3 partition planning and independent partition consumption.
+The public API is asynchronous, and the main V3 native operations use callback-notified native operation handles. Do not assume unlimited parallelism through a single client instance. For parallel reads, prefer V3 partition planning and independent partition consumption.
+
+Schema reads, partition planning, table creation, protocol upgrade, SQL DML operations, table/query/CDF/partition stream setup, and insert/merge setup use native async operation handles with completion notification. Managed code starts the relevant `*_async_with_callback` export, awaits a `TaskCompletionSource`, takes the result string with `dts_async_operation_take_result`, the schema with `dts_async_operation_take_schema`, or the stream with `dts_async_operation_take_stream` after the native callback fires, and releases the handle through `dts_async_operation_destroy`. Cancellation requests call `dts_async_operation_cancel` before managed code surfaces `OperationCanceledException`. This keeps the public API shapes unchanged while moving those one-shot operations onto the shared Tokio runtime instead of blocking the managed caller thread for the whole native operation.
+
+For async insert and merge, Rust imports the caller-provided Arrow C Stream before spawning the write task. Managed code keeps both the exported `IArrowArrayStream` adapter and the `CArrowArrayStream` storage alive until native completion is signaled, then disposes and frees them. Cancellation waits for the aborted native task to drop the imported reader before notifying managed code, which prevents the native writer from reading through released managed stream state while still avoiding a blocking write FFI call.
+
+Synchronous Rust C ABI exports are retained for native ABI compatibility, direct Rust unit coverage, and diagnostics. Managed SDK production paths should prefer the callback exports for operations with meaningful native work.

 By default, V3 read streams pull each batch through the Arrow C Stream callback and synchronously bridge to the async DataFusion stream. `DeltaTableServiceClientOptions.EnableNativeReadPrefetch` enables an experimental prefetch mode that places a small Rust-owned bounded queue behind the exported Arrow C Stream. In that mode, a Tokio producer task advances the DataFusion stream and sends ready batch results into the queue, while the Arrow C Stream pull side drains queued batches. The queue is bounded per stream, and native read production is guarded by a process-wide active-production limit so full per-stream queues do not monopolize global read capacity.
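Reviewer note: the prefetch paragraph above combines a bounded per-stream queue with a process-wide production cap. The sketch below illustrates that combination using only the Rust standard library; it is an assumption-laden illustration, not the backend's code. `ProductionLimit`, `spawn_prefetch`, `drain_all`, and `run_demo` are hypothetical names, threads stand in for Tokio producer tasks, `Vec<i32>` stands in for a record batch, and releasing the permit before the queue send is my assumption about how a full queue avoids holding global capacity.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical process-wide cap on concurrently active batch production.
struct ProductionLimit {
    active: Mutex<usize>,
    freed: Condvar,
    max_active: usize,
}

impl ProductionLimit {
    fn new(max_active: usize) -> Arc<Self> {
        Arc::new(ProductionLimit {
            active: Mutex::new(0),
            freed: Condvar::new(),
            max_active,
        })
    }

    /// Blocks until fewer than `max_active` producers are active.
    fn acquire(&self) {
        let mut active = self.active.lock().unwrap();
        while *active >= self.max_active {
            active = self.freed.wait(active).unwrap();
        }
        *active += 1;
    }

    /// Marks one producer as idle and wakes a waiting producer, if any.
    fn release(&self) {
        *self.active.lock().unwrap() -= 1;
        self.freed.notify_one();
    }
}

/// Spawns a producer behind a bounded queue of `queue_depth` batches.
/// The permit is held only while a batch is produced (assumption), so a
/// producer blocked on a full queue does not consume global capacity.
fn spawn_prefetch(
    batches: Vec<Vec<i32>>,
    queue_depth: usize,
    limit: Arc<ProductionLimit>,
) -> Receiver<Vec<i32>> {
    let (tx, rx) = sync_channel(queue_depth);
    thread::spawn(move || {
        for batch in batches {
            limit.acquire();
            // Stand-in for real batch production.
            let produced: Vec<i32> = batch.iter().map(|v| v * 2).collect();
            limit.release();
            // Blocks while the per-stream queue is full (backpressure).
            if tx.send(produced).is_err() {
                break; // The consumer dropped the stream.
            }
        }
    });
    rx
}

/// Drains each stream to completion, one stream after another.
fn drain_all(streams: Vec<Receiver<Vec<i32>>>) -> Vec<Vec<Vec<i32>>> {
    streams
        .into_iter()
        .map(|rx| rx.iter().collect::<Vec<Vec<i32>>>())
        .collect()
}

fn run_demo() -> Vec<Vec<Vec<i32>>> {
    let limit = ProductionLimit::new(1);
    let a = spawn_prefetch(vec![vec![1, 2], vec![3]], 1, Arc::clone(&limit));
    let b = spawn_prefetch(vec![vec![10], vec![20], vec![30]], 1, Arc::clone(&limit));
    drain_all(vec![a, b])
}

fn main() {
    println!("{:?}", run_demo());
}
```

With a cap of one, stream `b` fills its queue while `a` is being drained. If the permit were held across the blocking send, `b` would starve `a` and the demo would deadlock; releasing it first keeps a full queue from monopolizing production.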