docs/source/user-guide/latest/datasources.md
+8 -11 (8 additions & 11 deletions)
@@ -169,10 +169,11 @@ Or use `spark-shell` with HDFS support as described [above](#building-comet-with

 ## S3

-The `native_datafusion` and `native_iceberg_compat` Parquet scan implementations completely offload data loading
-to native code. They use the [`object_store` crate](https://crates.io/crates/object_store) to read data from S3 and
-support configuring S3 access using standard [Hadoop S3A configurations](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#General_S3A_Client_configuration) by translating them to
-the `object_store` crate's format.
+Comet's Parquet scan completely offloads data loading to native code. It uses the
+[`object_store` crate](https://crates.io/crates/object_store) to read data from S3 and supports
-The S3 support of `native_datafusion` and `native_iceberg_compat` has the following limitations:
+Comet's S3 support has the following limitations:

 1. **Partial Hadoop S3A configuration support**: Not all Hadoop S3A configurations are currently supported. Only the configurations listed in the tables above are translated and applied to the underlying `object_store` crate.
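
As a reviewer-side illustration of the limitation above, here is a minimal sketch of how S3A options typically reach a Spark job: Spark copies any property prefixed with `spark.hadoop.` into the Hadoop configuration, which is where S3A settings are read from. The helper `s3a_conf_flags` is hypothetical (it is not part of Comet or Spark), the endpoint value is a placeholder, and whether a given key is honored by Comet depends on the supported-options tables in the guide, which this sketch does not reproduce.

```python
# Hypothetical helper for illustration only; not part of Comet or Spark.
def s3a_conf_flags(settings):
    """Turn Hadoop S3A settings into spark-submit/spark-shell --conf arguments."""
    flags = []
    for key, value in sorted(settings.items()):
        if not key.startswith("fs.s3a."):
            raise ValueError(f"not an S3A option: {key}")
        # Spark forwards every spark.hadoop.* property to the Hadoop configuration.
        flags += ["--conf", f"spark.hadoop.{key}={value}"]
    return flags


# Placeholder values; assumes these keys are among the options Comet translates.
flags = s3a_conf_flags({
    "fs.s3a.endpoint": "https://s3.example.com",
    "fs.s3a.path.style.access": "true",
})
print(" ".join(flags))
```

Options that are not in the supported tables would still be accepted by Spark when passed this way, but per the limitation above they would not be applied to the `object_store` client.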
|`CometScan`| V1 Parquet scan driven by Spark's file-source path through Comet's Parquet reader. Decoding runs in native code; the resulting Arrow batches cross JNI into the native plan. The active scan implementation is shown in brackets, e.g. `CometScan [native_iceberg_compat]`. |
-|`CometBatchScan`| DataSource V2 scan, including Iceberg Parquet, that produces Arrow batches consumed by Comet. |
-|`CometNativeScan`| Fully native Parquet scan that runs entirely in DataFusion (no JVM Parquet reader involvement). |