Commit 35f0f4e

Subham Singhal authored and committed
Fix formatting
1 parent 111a87e commit 35f0f4e

3 files changed

Lines changed: 133 additions & 146 deletions

datafusion/physical-expr/src/utils/mod.rs

Lines changed: 3 additions & 16 deletions
@@ -275,13 +275,6 @@ fn get_field_id(field: &arrow::datatypes::Field) -> Option<i32> {
 }
 
 /// Find field index by field ID with fallback to name-based matching
-///
-/// # Limitations
-///
-/// TODO: Currently only supports flat schemas. For nested schemas, this function
-/// would need to accept a field path (e.g., ["address", "city"]) and return
-/// a path of indices. This requires matching nested field IDs at each level
-/// of the schema hierarchy.
 fn find_field_index(
     column_name: &str,
     source_schema: &Schema,
@@ -293,8 +286,6 @@ fn find_field_index(
     // Check if field has a field ID
     if let Some(source_field_id) = get_field_id(source_field) {
         // Search target schema for matching field ID
-        // TODO: For nested schemas, this needs to recursively match field IDs
-        // through the struct hierarchy
         for (idx, target_field) in target_schema.fields().iter().enumerate() {
             if let Some(target_field_id) = get_field_id(target_field)
                 && source_field_id == target_field_id
@@ -322,13 +313,9 @@ fn find_field_index(
 ///
 /// # Limitations
 ///
-/// TODO: Currently only supports flat schemas (top-level columns). Nested field
-/// references (e.g., "address.city") are not yet supported. Supporting nested
-/// fields would require:
-/// - Path-based field ID matching through struct hierarchies
-/// - Recursive traversal of both expression tree and schema tree
-/// - Updates to Column representation to track nested paths
-///
+/// Currently only supports flat schemas (top-level columns). Nested field
+/// references (e.g., "address.city") are not yet supported.
+/// For nested schema see: (<https://github.com/apache/datafusion/issues/20475>)
 /// # Errors
 ///
 /// This function will return an error if any column in the expression cannot be found
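
For readers without the full file at hand, the matching strategy described in the doc comment above ("field ID with fallback to name-based matching", flat schemas only) can be sketched roughly as follows. This is a simplified illustration, not the code in `mod.rs`: `SimpleField` and `find_index_by_id_or_name` are hypothetical names, the schemas are plain slices instead of Arrow `Schema` values, and falling back to the name when an ID exists but has no match in the target is an assumption.

```rust
/// Hypothetical stand-in for an Arrow field: a name plus an optional
/// Parquet field ID. Not a DataFusion or Arrow type.
#[derive(Debug, Clone, PartialEq)]
pub struct SimpleField {
    pub name: String,
    pub field_id: Option<i32>,
}

impl SimpleField {
    pub fn new(name: &str, field_id: Option<i32>) -> Self {
        Self { name: name.to_string(), field_id }
    }
}

/// Resolve a top-level column of `source` to an index in `target`.
/// Matches on field ID first, then falls back to the column name.
pub fn find_index_by_id_or_name(
    column_name: &str,
    source: &[SimpleField],
    target: &[SimpleField],
) -> Option<usize> {
    // Look the column up in the source schema to learn its field ID, if any.
    let source_id = source
        .iter()
        .find(|f| f.name == column_name)
        .and_then(|f| f.field_id);

    // Prefer a target field carrying the same field ID.
    if let Some(id) = source_id {
        if let Some(idx) = target.iter().position(|f| f.field_id == Some(id)) {
            return Some(idx);
        }
    }

    // Assumed fallback: match by name when no ID match was found.
    target.iter().position(|f| f.name == column_name)
}
```

With a source schema `[id (1), city (2)]` and a target schema `[town (2), id (1)]`, looking up `"city"` resolves to index 0 because the IDs match even though the name changed. A nested reference such as `"address.city"` is outside what this flat lookup can express, which is the limitation the doc comment records.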

datafusion/sqllogictest/test_files/information_schema.slt

Lines changed: 1 addition & 1 deletion
@@ -383,7 +383,7 @@ datafusion.execution.parquet.dictionary_enabled true (writing) Sets if dictionar
 datafusion.execution.parquet.dictionary_page_size_limit 1048576 (writing) Sets best effort maximum dictionary page size, in bytes
 datafusion.execution.parquet.enable_page_index true (reading) If true, reads the Parquet data page level metadata (the Page Index), if present, to reduce the I/O and number of rows decoded.
 datafusion.execution.parquet.encoding NULL (writing) Sets default encoding for any column. Valid values are: plain, plain_dictionary, rle, bit_packed, delta_binary_packed, delta_length_byte_array, delta_byte_array, rle_dictionary, and byte_stream_split. These values are not case sensitive. If NULL, uses default parquet writer setting
-datafusion.execution.parquet.field_id_read_enabled false (reading) If true, use Parquet field IDs for column resolution instead of column names. This enables schema evolution with renamed/reordered columns. When field IDs are unavailable, falls back to name-based matching.
+datafusion.execution.parquet.field_id_enabled false (reading) If true, use Parquet field IDs for column resolution instead of column names. This enables schema evolution with renamed/reordered columns. When field IDs are unavailable, falls back to name-based matching.
 datafusion.execution.parquet.force_filter_selections false (reading) Force the use of RowSelections for filter results, when pushdown_filters is enabled. If false, the reader will automatically choose between a RowSelection and a Bitmap based on the number and pattern of selected rows.
 datafusion.execution.parquet.max_predicate_cache_size NULL (reading) The maximum predicate cache size, in bytes. When `pushdown_filters` is enabled, sets the maximum memory used to cache the results of predicate evaluation between filter evaluation and output generation. Decreasing this value will reduce memory usage, but may increase IO and CPU usage. None means use the default parquet reader setting. 0 means no caching.
 datafusion.execution.parquet.max_row_group_size 1048576 (writing) Target maximum number of rows in each row group (defaults to 1M rows). Writing larger row groups requires more memory to write, but can get better compression and be faster to read.
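
To make the effect of the renamed option concrete, here is a rough sketch of what the description of `field_id_enabled` implies for renamed and reordered columns. It is an illustration under assumptions, not DataFusion's reader code: `Col` and `projection` are hypothetical names, and the boolean parameter merely stands in for the config value.

```rust
/// Hypothetical column description: a name plus an optional Parquet field ID.
#[derive(Debug, Clone)]
pub struct Col {
    pub name: &'static str,
    pub field_id: Option<i32>,
}

/// For each table column, return the index of the matching file column.
/// `field_id_enabled` stands in for the config option of the same name.
pub fn projection(
    table: &[Col],
    file: &[Col],
    field_id_enabled: bool,
) -> Vec<Option<usize>> {
    table
        .iter()
        .map(|t| {
            // Only consult field IDs when the option is on and the column has one.
            let by_id = match (field_id_enabled, t.field_id) {
                (true, Some(id)) => file.iter().position(|f| f.field_id == Some(id)),
                _ => None,
            };
            // Otherwise (or when no ID matched) resolve by column name.
            by_id.or_else(|| file.iter().position(|f| f.name == t.name))
        })
        .collect()
}
```

For a table schema `[id (1), city (2)]` read from a file whose columns are `[town (2), id (1)]`, the sketch yields `[Some(1), Some(0)]` with the flag on and `[Some(1), None]` with it off: name-based resolution copes with the reordering but loses the renamed column.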
