
fix: RESPECT NULLS for Spark collect_list function #16933

Open
yaooqinn wants to merge 2 commits into facebookincubator:main from yaooqinn:feat/spark-collect-list-respect-nulls

@yaooqinn
Contributor

Summary

Follow-up to #16416 (collect_set RESPECT NULLS). Applies the same pattern to collect_list:

  • Remove kSparkCollectListIgnoreNulls QueryConfig
  • Add 2-arg collect_list(x [, ignoreNulls]) signature with constant boolean
  • Use setConstantInputs via CollectListAdapter subclass of SimpleAggregateAdapter
  • Make fn_ protected in SimpleAggregateAdapter to enable subclass access
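
The constant-argument pattern described in the bullets above can be sketched roughly as follows. This is a simplified, self-contained model and not the Velox implementation: `CollectListFn`, `AdapterBase`, `CollectListAdapterSketch`, and `collect` are hypothetical names, and the `setConstantInputs` signature here (a vector of optional booleans) is an assumption made only to keep the example small.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for the aggregate function object (not a Velox class).
struct CollectListFn {
  bool ignoreNulls = true;  // Default: NULL inputs are dropped.
  std::vector<std::optional<int>> values;

  void add(const std::optional<int>& v) {
    if (!v.has_value() && ignoreNulls) {
      return;  // IGNORE NULLS: skip this row.
    }
    values.push_back(v);  // RESPECT NULLS keeps the NULL entry.
  }
};

// Hypothetical adapter base: owns the function object and exposes a hook
// that receives one entry per argument (nullopt means "not a constant").
class AdapterBase {
 public:
  virtual ~AdapterBase() = default;

  virtual void setConstantInputs(
      const std::vector<std::optional<bool>>& /*constants*/) {}

  void addInput(const std::optional<int>& v) { fn_.add(v); }

  const std::vector<std::optional<int>>& result() const { return fn_.values; }

 protected:
  CollectListFn fn_;  // Protected so that the subclass can set the flag.
};

// Subclass that reads the optional second argument as ignoreNulls.
class CollectListAdapterSketch : public AdapterBase {
 public:
  void setConstantInputs(
      const std::vector<std::optional<bool>>& constants) override {
    if (constants.size() > 1 && constants[1].has_value()) {
      fn_.ignoreNulls = *constants[1];
    }
  }
};

// Models collect_list(x) when ignoreNulls is nullopt, and
// collect_list(x, ignoreNulls) otherwise.
std::vector<std::optional<int>> collect(
    const std::vector<std::optional<int>>& input,
    std::optional<bool> ignoreNulls) {
  CollectListAdapterSketch adapter;
  std::vector<std::optional<bool>> constants(1);  // Argument 0 is the column.
  if (ignoreNulls.has_value()) {
    constants.push_back(ignoreNulls);
  }
  adapter.setConstantInputs(constants);
  for (const auto& v : input) {
    adapter.addInput(v);
  }
  return adapter.result();
}
```

In this model, calling `collect` with `false` keeps NULL entries in the output list, while omitting the flag or passing `true` drops them, which mirrors the default mentioned later in the thread.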

Changes

  • CollectListAggregate.cpp: Replace config-based with constant-arg approach
  • QueryConfig.h: Remove kSparkCollectListIgnoreNulls
  • SimpleAggregateAdapter.h: Make fn_ protected
  • aggregate.rst: Update doc format
  • SparkAggregationFuzzerTest.cpp: Add to skipFunctions
  • Tests updated to use 2-arg form

Testing

All 8 CollectList tests pass.

Closes #16839

@netlify

netlify bot commented Mar 26, 2026

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 942ad2b
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/69cf8a2e3de437000807c63f

@meta-cla meta-cla bot added the CLA Signed label Mar 26, 2026
Collaborator

Please move them to a separate PR

@yaooqinn yaooqinn force-pushed the feat/spark-collect-list-respect-nulls branch from 12a2fca to 8aae7ec March 26, 2026 16:08
@yaooqinn
Contributor Author

Rebased on latest main — removed the unrelated cudf changes from the PR diff. Thanks @jinchengchenghh!

std::vector<DecodedVector> inputDecoded_;
DecodedVector intermediateDecoded_;

protected:
Collaborator

The member order should be protected and then private

@@ -48,8 +49,9 @@ class CollectListAggregate {
Collaborator

Remove the initialize function

Remove kSparkCollectListIgnoreNulls QueryConfig and replace with a
constant boolean argument, matching the collect_set pattern from facebookincubator#16416.

Closes facebookincubator#16839

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yaooqinn yaooqinn force-pushed the feat/spark-collect-list-respect-nulls branch from 8aae7ec to a9de448 March 26, 2026 17:16
@yaooqinn
Contributor Author

Addressed comments:

  1. Member order — Reordered SimpleAggregateAdapter.h to protected then private (moved fn_ before inputDecoded_/intermediateDecoded_).
  2. Removed empty initialize — The function body was a no-op; ignoreNulls_ defaults to true and gets updated via setConstantInputs.

All 8 tests pass. Thanks @jinchengchenghh!

Collaborator

@rui-mo rui-mo left a comment

Hi @yaooqinn, could you please open a corresponding PR for collect_set and collect_list in Gluten to verify their functionality? We are currently noticing the fallback issue below, and I suppose compatibility code needs to be added due to the signature change.

E20260323 21:52:57.248056  1774 Exceptions.h:87] Line: /work/cpp/velox/substrait/SubstraitToVeloxPlan.cc:290, Function:toAggregationFunctionName, Expression: signatures.has_value() && signatures.value().size() > 0 Cannot find function signature for collect_set_merge_extract_array_row_VARCHAR_BIGINT_BIGINT_endrow in final aggregation step., Source: RUNTIME, ErrorCode: INVALID_STATE
21:52:57.251 WARN org.apache.spark.sql.execution.GlutenFallbackReporter: Validation failed for plan: SortAggregate[QueryId=405], due to: 
 - Native validation failed: 
   |- Validation failed due to exception caught at file:SubstraitToVeloxPlanValidator.cc line:1450 function:validate, thrown from file:SubstraitToVeloxPlan.cc line:290 function:toAggregationFunctionName, reason:Cannot find function signature for collect_set_merge_extract_array_row_VARCHAR_BIGINT_BIGINT_endrow in final aggregation step.

@rui-mo rui-mo changed the title feat(sparksql): Apply same RESPECT NULLS pattern to collect_list feat(sparksql): Fix RESPECT NULLS for Spark collect_list function Mar 26, 2026
@rui-mo rui-mo changed the title feat(sparksql): Fix RESPECT NULLS for Spark collect_list function fix: Fix RESPECT NULLS for Spark collect_list function Mar 26, 2026
@rui-mo rui-mo changed the title fix: Fix RESPECT NULLS for Spark collect_list function fix: RESPECT NULLS for Spark collect_list function Mar 26, 2026
@yaooqinn
Contributor Author

Created Gluten PR apache/gluten#11837 to test compatibility. It adds ignoreNulls support to VeloxCollectList/VeloxCollectSet and propagates the parameter from Spark's CollectList/CollectSet via reflection (backward-compatible). Thanks @rui-mo!

Collaborator

@zhli1142015 zhli1142015 left a comment

Nice clean follow-up to the collect_set RESPECT NULLS pattern (#16416). The approach is consistent and the config removal is thorough. A few minor items — mainly a stale comment referencing the removed config, and a suggestion about the fn_ visibility change in SimpleAggregateAdapter.

@@ -44,14 +45,6 @@ class CollectListAggregate {
// aggregation uses the accumulator path, which correctly respects the config.
Collaborator

This comment references "config" twice, but ignoreNulls_ is no longer read from QueryConfig — it now comes from the constant boolean argument via setConstantInputs(). Please update the wording, e.g.:

// NOTE: toIntermediate() was intentionally removed because it is static and
// cannot access the runtime ignoreNulls_ flag. Without it, partial
// aggregation uses the accumulator path, which correctly respects the flag.
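
To illustrate the point in the suggested comment, here is a minimal sketch of why a static shortcut cannot honor a per-instance flag while the accumulator path can. All names (`CollectListSketch`, `toIntermediateStatic`, `partialViaAccumulator`) are hypothetical, and the assumption that a static shortcut would hard-code the drop-NULLs policy is mine, not something stated in the PR.

```cpp
#include <optional>
#include <vector>

using Row = std::optional<int>;

struct CollectListSketch {
  bool ignoreNulls_ = true;  // Set at runtime from the constant argument.
  std::vector<Row> accumulator_;

  // A static function has no `this`, so it cannot read ignoreNulls_.
  // It can only apply one fixed policy (assumed here: drop NULLs).
  static std::vector<Row> toIntermediateStatic(const std::vector<Row>& input) {
    std::vector<Row> out;
    for (const auto& row : input) {
      if (row.has_value()) {
        out.push_back(row);
      }
    }
    return out;
  }

  // The accumulator path is a member function and sees the runtime flag.
  void addInput(const Row& row) {
    if (row.has_value() || !ignoreNulls_) {
      accumulator_.push_back(row);
    }
  }
};

// Runs the partial step through the accumulator path with the given flag.
std::vector<Row> partialViaAccumulator(
    const std::vector<Row>& input,
    bool ignoreNulls) {
  CollectListSketch agg;
  agg.ignoreNulls_ = ignoreNulls;
  for (const auto& row : input) {
    agg.addInput(row);
  }
  return agg.accumulator_;
}
```

With `ignoreNulls` set to `false`, only the accumulator path preserves the NULL rows; the static shortcut in this sketch would silently lose them.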

// Velox registers a 2-arg collect_set(T, boolean) signature that Spark
// doesn't support. The fuzzer may pick this signature and fail.
"collect_set",
// Same as collect_set — 2-arg signature not supported by Spark.
Collaborator

Nit: "2-arg signature not supported by Spark" is slightly misleading — Spark 4.0+ does support RESPECT NULLS / IGNORE NULLS for collect_list (SPARK-55256). The real reason for skipping is that the fuzzer can't generate the constant boolean argument. Consider:

// Fuzzer may pick the 2-arg (T, boolean) signature which requires
// a constant boolean that the fuzzer cannot generate.
"collect_list",

Same applies to the collect_set comment above.

}
}

protected:
Collaborator

Making fn_ protected exposes the raw unique_ptr to all SimpleAggregateAdapter subclasses across the codebase. Currently only CollectListAdapter needs it. Would a protected accessor like FUNC& fn() { return *fn_; } be a tighter API contract? That way subclasses can access the function object without being able to reset/move the unique_ptr itself.

Not a blocker — just a suggestion for encapsulation.
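
A rough sketch of the accessor idea, under the assumption that the adapter holds its function object in a `std::unique_ptr`. `AdapterSketch`, `CollectListFnSketch`, and `CollectListAdapterSketch` are hypothetical stand-ins, not the real `SimpleAggregateAdapter` or `CollectListAdapter`.

```cpp
#include <memory>

// Hypothetical adapter template that owns the function object.
template <typename FUNC>
class AdapterSketch {
 public:
  AdapterSketch() : fn_(std::make_unique<FUNC>()) {}
  virtual ~AdapterSketch() = default;

 protected:
  // Subclasses may read and mutate the function object through a reference.
  FUNC& fn() {
    return *fn_;
  }

 private:
  // The owning pointer stays private, so subclasses cannot reset or move it.
  std::unique_ptr<FUNC> fn_;
};

// Hypothetical function object carrying the flag.
struct CollectListFnSketch {
  bool ignoreNulls = true;
};

// Subclass that only needs to flip a flag on the function object.
class CollectListAdapterSketch : public AdapterSketch<CollectListFnSketch> {
 public:
  void setIgnoreNulls(bool value) {
    fn().ignoreNulls = value;
  }

  bool ignoreNulls() {
    return fn().ignoreNulls;
  }
};
```

If the subclass were itself a template deriving from a dependent base, the call would need to be written as `this->fn()` for name lookup to find it.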

{"spark_collect_list(c0)"},
expectedResult,
makeConfig(false));
testAggregations({input}, {}, {"spark_collect_list(c0, false)"}, {expected});
Collaborator

Consider adding a test that verifies the constant boolean false (RESPECT NULLS) works correctly through partial → intermediate → final aggregation stages. testAggregations() does cover multiple modes internally, but an explicit streaming/split test would increase confidence that setConstantInputs() propagates correctly across stages.
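
The property such a test would check can be modeled without any Velox machinery. The sketch below assumes the partial step applies the flag per split and the final step only concatenates intermediate lists; `partialCollect`, `finalCollect`, and `collectAcrossSplits` are hypothetical helpers, not Velox test utilities.

```cpp
#include <optional>
#include <vector>

using Value = std::optional<int>;
using List = std::vector<Value>;

// Partial step: raw rows of one split become an intermediate list.
List partialCollect(const List& rows, bool ignoreNulls) {
  List out;
  for (const auto& value : rows) {
    if (value.has_value() || !ignoreNulls) {
      out.push_back(value);
    }
  }
  return out;
}

// Final step: concatenate intermediate lists. Filtering already happened in
// the partial step, so NULL elements must pass through unchanged here.
List finalCollect(const std::vector<List>& partials) {
  List out;
  for (const auto& partial : partials) {
    out.insert(out.end(), partial.begin(), partial.end());
  }
  return out;
}

// Runs partial aggregation per split, then a single final merge.
List collectAcrossSplits(const std::vector<List>& splits, bool ignoreNulls) {
  std::vector<List> partials;
  for (const auto& split : splits) {
    partials.push_back(partialCollect(split, ignoreNulls));
  }
  return finalCollect(partials);
}
```

An explicit multi-stage test would assert that, with the flag set to `false`, the NULL entries from every split are still present in the final list and in their original positions.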

@github-actions

github-actions bot commented Apr 3, 2026

Build Impact Analysis

Selective Build Targets (building these covers all 407 affected)

cmake --build _build/release --target aggregate_companion_functions_test physical_size_aggregator_test presto_sql_test spark_aggregation_fuzzer_test spark_expression_fuzzer_test velox_abfs_test velox_aggregates_GeometryAggregateTest velox_aggregates_reduce_agg_bm velox_aggregates_simple_aggregates_bm velox_aggregates_string_keys_bm velox_aggregates_test_group0 velox_aggregates_test_group1 velox_aggregates_test_group2 velox_aggregates_test_group3 velox_aggregates_test_group4 velox_aggregation_fuzzer_test velox_aggregation_runner_test velox_arrow_bridge_test velox_base_test velox_benchmark_array_writer_no_nulls velox_benchmark_array_writer_with_nulls velox_benchmark_basic_comparison_conjunct velox_benchmark_basic_decoded_vector velox_benchmark_basic_preproc velox_benchmark_basic_selectivity_vector velox_benchmark_basic_simple_arithmetic velox_benchmark_basic_simple_cast velox_benchmark_basic_vector_compare velox_benchmark_basic_vector_fuzzer velox_benchmark_basic_vector_slice velox_benchmark_estimate_flat_size velox_benchmark_expr_flat_no_nulls velox_benchmark_feature_normalization velox_benchmark_map_writer_no_nulls velox_benchmark_map_writer_with_nulls velox_benchmark_nested_array_writer_no_nulls velox_benchmark_nested_array_writer_with_nulls velox_cache_fuzzer velox_cast_benchmark velox_common_compression_test velox_common_geospatial_serde_test velox_common_test velox_connector_registry velox_connector_test velox_constrained_input_generators_test velox_constrained_vector_generator_test velox_core_plan_consistency_checker_test velox_core_test velox_demo_rpc_function_test velox_driver_test velox_duckdb_conversion_test velox_dwio_arrow_parquet_writer_test velox_dwio_cache_test velox_dwio_common_bitpack_decoder_benchmark velox_dwio_common_data_buffer_benchmark velox_dwio_common_int_decoder_benchmark velox_dwio_common_test velox_dwio_dwrf_buffered_output_stream_test velox_dwio_dwrf_byte_rle_encoder_test velox_dwio_dwrf_byte_rle_test velox_dwio_dwrf_checksum_test 
velox_dwio_dwrf_column_reader_test velox_dwio_dwrf_column_statistics_test velox_dwio_dwrf_compression_test velox_dwio_dwrf_config_test velox_dwio_dwrf_data_buffer_holder_test velox_dwio_dwrf_decompression_test velox_dwio_dwrf_decryption_test velox_dwio_dwrf_dictionary_encoder_test velox_dwio_dwrf_dictionary_encoding_utils_test velox_dwio_dwrf_encoding_selector_test velox_dwio_dwrf_encryption_test velox_dwio_dwrf_flush_policy_test velox_dwio_dwrf_index_builder_test velox_dwio_dwrf_int_direct_test velox_dwio_dwrf_int_encoder_test velox_dwio_dwrf_layout_planner_test velox_dwio_dwrf_ratio_checker_test velox_dwio_dwrf_reader_base_test velox_dwio_dwrf_reader_test velox_dwio_dwrf_rle_test velox_dwio_dwrf_rlev1_encoder_test velox_dwio_dwrf_stream_labels_test velox_dwio_dwrf_stripe_dictionary_cache_test velox_dwio_dwrf_stripe_reader_base_test velox_dwio_dwrf_stripe_stream_test velox_dwio_dwrf_utils_test velox_dwio_dwrf_writer_context_test velox_dwio_dwrf_writer_encoding_manager_test velox_dwio_dwrf_writer_sink_test velox_dwio_dwrf_writer_test velox_dwio_iceberg_reader_benchmark velox_dwio_orc_column_statistics_test velox_dwio_orc_reader_filter_test velox_dwio_orc_reader_test velox_dwio_parquet_common_test velox_dwio_parquet_page_reader_test velox_dwio_parquet_reader_benchmark velox_dwio_parquet_reader_test velox_dwio_parquet_rlebp_decoder_test velox_dwio_parquet_structure_decoder_benchmark velox_dwio_parquet_structure_decoder_test velox_dwio_parquet_table_scan_test velox_dwio_parquet_thrift_test velox_dwio_parquet_tpch_test velox_dwrf_column_writer_index_test velox_dwrf_column_writer_stats_test velox_dwrf_column_writer_test velox_dwrf_e2e_filter_test velox_dwrf_e2e_reader_test velox_dwrf_e2e_writer_test velox_dwrf_float_column_writer_benchmark velox_dwrf_int_encoder_benchmark velox_dwrf_statistics_builder_utils_test velox_dwrf_writer_extended_test velox_dwrf_writer_flush_test velox_example_expression_eval velox_example_opaque_type velox_example_operator_extensibility 
velox_example_scan_orc velox_example_simple_functions velox_example_vector_reader_writer velox_exchange_benchmark velox_exchange_fuzzer velox_exec_SpatialJoinTest velox_exec_bm_duplicate_project velox_exec_infra_test velox_exec_prefixsort_test velox_exec_test_group0 velox_exec_test_group1 velox_exec_test_group2 velox_exec_test_group3 velox_exec_test_group4 velox_exec_test_group5 velox_exec_test_group6 velox_exec_test_group7 velox_exec_util_test_group0 velox_exec_vector_hasher_benchmark velox_expression_fuzzer_test velox_expression_fuzzer_unit_test velox_expression_runner_test velox_expression_runner_unit_test velox_expression_test velox_expression_verifier_unit_test velox_filemetadata_test velox_filter_project_benchmark velox_format_datetime_benchmark velox_function_dynamic velox_function_dynamic_link_test velox_function_err_dynamic velox_function_non_default_dynamic velox_function_registry_test velox_functions_aggregates_test velox_functions_benchmarks_compare velox_functions_benchmarks_row_writer_no_nulls velox_functions_benchmarks_simdjson_function_with_expr velox_functions_benchmarks_string_writer_no_nulls velox_functions_benchmarks_url velox_functions_iceberg_test velox_functions_json_test velox_functions_lib_test velox_functions_prestosql_benchmarks_array_contains velox_functions_prestosql_benchmarks_array_min_max velox_functions_prestosql_benchmarks_array_position velox_functions_prestosql_benchmarks_array_sum velox_functions_prestosql_benchmarks_bitwise velox_functions_prestosql_benchmarks_cardinality velox_functions_prestosql_benchmarks_comparisons velox_functions_prestosql_benchmarks_concat velox_functions_prestosql_benchmarks_date_time velox_functions_prestosql_benchmarks_field_reference velox_functions_prestosql_benchmarks_generic velox_functions_prestosql_benchmarks_in velox_functions_prestosql_benchmarks_map_input velox_functions_prestosql_benchmarks_map_subscript velox_functions_prestosql_benchmarks_map_zip_with 
velox_functions_prestosql_benchmarks_not velox_functions_prestosql_benchmarks_regexp_replace velox_functions_prestosql_benchmarks_row velox_functions_prestosql_benchmarks_string_ascii_utf_functions velox_functions_prestosql_benchmarks_uuid_cast velox_functions_prestosql_benchmarks_width_bucket velox_functions_prestosql_benchmarks_zip velox_functions_prestosql_benchmarks_zip_with velox_functions_spark_aggregates_test velox_functions_spark_test velox_functions_test velox_fuzzer_connector_test velox_gcs_file_test velox_gcs_insert_test velox_gcs_multiendpoints_test velox_gcsfile_example velox_hash_benchmark velox_hash_join_build_benchmark velox_hash_join_list_result_benchmark velox_hash_join_prepare_join_table_benchmark velox_hdfs_file_test velox_hdfs_insert_test velox_hive_connector_test velox_hive_iceberg_insert_test velox_hive_iceberg_test velox_hive_paimon_connector velox_hive_paimon_data_file_meta_test velox_hive_paimon_deletion_file_test velox_hive_paimon_row_kind_test velox_hive_paimon_split_test velox_hive_partition_function_benchmark velox_in_10_min_demo velox_join_fuzzer velox_key_encoder_test velox_like_benchmark velox_like_tpch_benchmark velox_mark_distinct_fuzzer velox_mark_sorted_benchmark velox_memory_arbitration_fuzzer velox_memory_test velox_merge_benchmark velox_orderby_benchmark velox_overload_int_function_dynamic velox_overload_varchar_function_dynamic velox_overwrite_int_function_dynamic velox_overwrite_varchar_function_dynamic velox_parquet_e2e_filter_test velox_parquet_writer_sink_test velox_parquet_writer_test velox_parse_test velox_prefixsort_benchmark velox_presto_type_parser_test velox_presto_types_fuzzer_utils_test velox_presto_types_test velox_prestosql_coverage velox_query_replayer velox_re2_functions_benchmarks velox_read_benchmark velox_row_number_fuzzer velox_row_serializer_benchmark velox_row_test velox_rpc_node_test velox_rpc_operator_test velox_s3config_test velox_s3file_test velox_s3finalize_test velox_s3insert_test 
velox_s3metrics_test velox_s3multiendpoints_test velox_s3read_test velox_s3registration_test velox_serializer_benchmark velox_serializer_test_group0 velox_simple_aggregate_test velox_sort_benchmark velox_spark_function_registry_test velox_spark_query_runner_test velox_spark_types_test velox_spark_windows_test velox_sparksql_benchmarks_cast velox_sparksql_benchmarks_compare velox_sparksql_benchmarks_from_json velox_sparksql_benchmarks_get_funcs velox_sparksql_benchmarks_hash velox_sparksql_benchmarks_in velox_sparksql_benchmarks_simd_compare velox_sparksql_benchmarks_split velox_sparksql_coverage velox_spatial_join_benchmark velox_spatial_join_fuzzer velox_spiller_aggregate_benchmark velox_spiller_join_benchmark velox_streaming_aggregation_benchmark velox_table_evolution_fuzzer_test velox_test_util_test velox_text_reader_test velox_text_writer_test velox_tool_trace_test velox_topn_row_number_fuzzer velox_tpcds_connector_test velox_tpch_benchmark velox_tpch_connector_test velox_tpch_speed_test velox_trace_file_tool velox_unsafe_row_serialize_benchmark velox_vector_fuzzer_test velox_vector_test velox_wave_benchmark velox_wave_exec_test velox_window_fuzzer_test velox_window_prefixsort_benchmark velox_window_sub_partitioned_sort_benchmark velox_windows_agg_test velox_windows_rank_test velox_windows_value_test velox_writer_fuzzer_test

Total affected: 407/556 targets

Warning: 1 file(s) could not be mapped to any target. A full build may be needed.

  • velox/docs/functions/spark/aggregate.rst

Affected targets (407)

Directly changed (296)

Target Changed Files
aggregate_companion_functions_test QueryConfig.h
presto_sql_test QueryConfig.h
spark_aggregation_fuzzer_test QueryConfig.h, SparkAggregationFuzzerTest.cpp
spark_expression_fuzzer_test QueryConfig.h
velox_aggregates QueryConfig.h, SimpleAggregateAdapter.h
velox_aggregates_GeometryAggregateTest QueryConfig.h
velox_aggregates_reduce_agg_bm QueryConfig.h
velox_aggregates_simple_aggregates_bm QueryConfig.h
velox_aggregates_string_keys_bm QueryConfig.h
velox_aggregates_test_group0 QueryConfig.h
velox_aggregates_test_group1 QueryConfig.h
velox_aggregates_test_group2 QueryConfig.h
velox_aggregates_test_group3 QueryConfig.h
velox_aggregates_test_group4 QueryConfig.h
velox_aggregation_fuzzer QueryConfig.h
velox_aggregation_fuzzer_base QueryConfig.h
velox_aggregation_fuzzer_test QueryConfig.h
velox_aggregation_result_verifier QueryConfig.h
velox_aggregation_runner_test QueryConfig.h
velox_arrow_bridge_test QueryConfig.h
velox_async_rpc_function_registry QueryConfig.h
velox_base_test QueryConfig.h
velox_benchmark_array_writer_no_nulls QueryConfig.h
velox_benchmark_array_writer_with_nulls QueryConfig.h
velox_benchmark_basic_comparison_conjunct QueryConfig.h
velox_benchmark_basic_decoded_vector QueryConfig.h
velox_benchmark_basic_preproc QueryConfig.h
velox_benchmark_basic_selectivity_vector QueryConfig.h
velox_benchmark_basic_simple_arithmetic QueryConfig.h
velox_benchmark_basic_simple_cast QueryConfig.h
velox_benchmark_basic_vector_compare QueryConfig.h
velox_benchmark_builder QueryConfig.h
velox_benchmark_estimate_flat_size QueryConfig.h
velox_benchmark_expr_flat_no_nulls QueryConfig.h
velox_benchmark_feature_normalization QueryConfig.h
velox_benchmark_map_writer_no_nulls QueryConfig.h
velox_benchmark_map_writer_with_nulls QueryConfig.h
velox_benchmark_nested_array_writer_no_nulls QueryConfig.h
velox_benchmark_nested_array_writer_with_nulls QueryConfig.h
velox_cast_benchmark QueryConfig.h
velox_connector QueryConfig.h
velox_connector_registry QueryConfig.h
velox_connector_test QueryConfig.h
velox_core QueryConfig.h
velox_core_plan_consistency_checker_test QueryConfig.h
velox_core_test QueryConfig.h
velox_coverage_util QueryConfig.h
velox_cursor QueryConfig.h
velox_demo_rpc_function QueryConfig.h
velox_demo_rpc_function_test QueryConfig.h
velox_driver_test QueryConfig.h
velox_duckdb_conversion_test QueryConfig.h
velox_dwio_arrow_parquet_writer QueryConfig.h
velox_dwio_arrow_parquet_writer_test QueryConfig.h
velox_dwio_common QueryConfig.h
velox_dwio_common_test QueryConfig.h
velox_dwio_common_test_utils QueryConfig.h
velox_dwio_dwrf_reader QueryConfig.h
velox_dwio_dwrf_reader_test QueryConfig.h
velox_dwio_iceberg_reader_benchmark QueryConfig.h
velox_dwio_iceberg_reader_benchmark_lib QueryConfig.h
velox_dwio_native_parquet_reader QueryConfig.h
velox_dwio_orc_reader QueryConfig.h
velox_dwio_orc_reader_filter_test QueryConfig.h
velox_dwio_orc_reader_test QueryConfig.h
velox_dwio_parquet_page_reader_test QueryConfig.h
velox_dwio_parquet_reader QueryConfig.h
velox_dwio_parquet_reader_benchmark QueryConfig.h
velox_dwio_parquet_reader_benchmark_lib QueryConfig.h
velox_dwio_parquet_reader_test QueryConfig.h
velox_dwio_parquet_table_scan_test QueryConfig.h
velox_dwio_parquet_tpch_test QueryConfig.h
velox_dwio_text_reader QueryConfig.h
velox_dwio_text_reader_register QueryConfig.h
velox_dwrf_column_writer_stats_test QueryConfig.h
velox_dwrf_column_writer_test QueryConfig.h
velox_dwrf_e2e_filter_test QueryConfig.h
velox_dwrf_e2e_reader_test QueryConfig.h
velox_dwrf_e2e_writer_test QueryConfig.h
velox_dwrf_test_utils QueryConfig.h
velox_example_expression_eval QueryConfig.h
velox_example_opaque_type QueryConfig.h
velox_example_operator_extensibility QueryConfig.h
velox_example_scan_orc QueryConfig.h
velox_example_simple_functions QueryConfig.h
velox_example_vector_reader_writer QueryConfig.h
velox_exchange_benchmark QueryConfig.h
velox_exchange_fuzzer QueryConfig.h
velox_exec QueryConfig.h, SimpleAggregateAdapter.h
velox_exec_SpatialJoinTest QueryConfig.h
velox_exec_bm_duplicate_project QueryConfig.h
velox_exec_infra_test QueryConfig.h
velox_exec_test_group0 QueryConfig.h
velox_exec_test_group1 QueryConfig.h
velox_exec_test_group2 QueryConfig.h
velox_exec_test_group3 QueryConfig.h
velox_exec_test_group4 QueryConfig.h
velox_exec_test_group5 QueryConfig.h
velox_exec_test_group6 QueryConfig.h
velox_exec_test_group7 QueryConfig.h
velox_exec_test_lib QueryConfig.h
velox_exec_util_test_group0 QueryConfig.h
velox_exec_vector_hasher_benchmark QueryConfig.h
velox_expression QueryConfig.h
velox_expression_fuzzer QueryConfig.h
velox_expression_fuzzer_test QueryConfig.h
velox_expression_fuzzer_unit_test QueryConfig.h
velox_expression_runner QueryConfig.h
velox_expression_runner_test QueryConfig.h
velox_expression_runner_unit_test QueryConfig.h
velox_expression_test QueryConfig.h
velox_expression_test_utility QueryConfig.h
velox_expression_verifier QueryConfig.h
velox_expression_verifier_unit_test QueryConfig.h
velox_filter_project_benchmark QueryConfig.h
velox_format_datetime_benchmark QueryConfig.h
velox_function_dynamic QueryConfig.h
velox_function_dynamic_link_test QueryConfig.h
velox_function_err_dynamic QueryConfig.h
velox_function_non_default_dynamic QueryConfig.h
velox_function_registry QueryConfig.h
velox_function_registry_test QueryConfig.h
velox_functions_aggregates QueryConfig.h
velox_functions_aggregates_test QueryConfig.h
velox_functions_aggregates_test_lib QueryConfig.h
velox_functions_benchmarks_compare QueryConfig.h
velox_functions_benchmarks_row_writer_no_nulls QueryConfig.h
velox_functions_benchmarks_simdjson_function_with_expr QueryConfig.h
velox_functions_benchmarks_string_writer_no_nulls QueryConfig.h
velox_functions_benchmarks_url QueryConfig.h
velox_functions_iceberg QueryConfig.h
velox_functions_iceberg_test QueryConfig.h
velox_functions_lib QueryConfig.h
velox_functions_lib_test QueryConfig.h
velox_functions_prestosql QueryConfig.h
velox_functions_prestosql_benchmarks_array_contains QueryConfig.h
velox_functions_prestosql_benchmarks_array_min_max QueryConfig.h
velox_functions_prestosql_benchmarks_array_position QueryConfig.h
velox_functions_prestosql_benchmarks_array_sum QueryConfig.h
velox_functions_prestosql_benchmarks_bitwise QueryConfig.h
velox_functions_prestosql_benchmarks_cardinality QueryConfig.h
velox_functions_prestosql_benchmarks_comparisons QueryConfig.h
velox_functions_prestosql_benchmarks_concat QueryConfig.h
velox_functions_prestosql_benchmarks_date_time QueryConfig.h
velox_functions_prestosql_benchmarks_field_reference QueryConfig.h
velox_functions_prestosql_benchmarks_generic QueryConfig.h
velox_functions_prestosql_benchmarks_in QueryConfig.h
velox_functions_prestosql_benchmarks_map_input QueryConfig.h
velox_functions_prestosql_benchmarks_map_subscript QueryConfig.h
velox_functions_prestosql_benchmarks_map_zip_with QueryConfig.h
velox_functions_prestosql_benchmarks_not QueryConfig.h
velox_functions_prestosql_benchmarks_regexp_replace QueryConfig.h
velox_functions_prestosql_benchmarks_row QueryConfig.h
velox_functions_prestosql_benchmarks_string_ascii_utf_functions QueryConfig.h
velox_functions_prestosql_benchmarks_uuid_cast QueryConfig.h
velox_functions_prestosql_benchmarks_width_bucket QueryConfig.h
velox_functions_prestosql_benchmarks_zip QueryConfig.h
velox_functions_prestosql_benchmarks_zip_with QueryConfig.h
velox_functions_prestosql_impl QueryConfig.h
velox_functions_spark QueryConfig.h
velox_functions_spark_aggregates CollectListAggregate.cpp, QueryConfig.h, SimpleAggregateAdapter.h
velox_functions_spark_aggregates_test CollectListAggregateTest.cpp, QueryConfig.h, SimpleAggregateAdapter.h
velox_functions_spark_impl QueryConfig.h
velox_functions_spark_specialforms QueryConfig.h
velox_functions_spark_test QueryConfig.h
velox_functions_test QueryConfig.h
velox_functions_test_lib QueryConfig.h
velox_functions_util QueryConfig.h
velox_functions_window QueryConfig.h
velox_functions_window_test_lib QueryConfig.h
velox_fuzzer_connector QueryConfig.h
velox_fuzzer_connector_test QueryConfig.h
velox_fuzzer_util QueryConfig.h
velox_gcs QueryConfig.h
velox_gcs_insert_test QueryConfig.h
velox_gcs_multiendpoints_test QueryConfig.h
velox_hash_benchmark QueryConfig.h
velox_hash_join_build_benchmark QueryConfig.h
velox_hash_join_list_result_benchmark QueryConfig.h
velox_hash_join_prepare_join_table_benchmark QueryConfig.h
velox_hdfs_file_test QueryConfig.h
velox_hdfs_insert_test QueryConfig.h
velox_hive_config QueryConfig.h
velox_hive_connector QueryConfig.h
velox_hive_connector_test QueryConfig.h
velox_hive_iceberg_insert_test QueryConfig.h
velox_hive_iceberg_splitreader QueryConfig.h
velox_hive_iceberg_test QueryConfig.h
velox_hive_paimon_connector QueryConfig.h
velox_hive_paimon_split QueryConfig.h
velox_hive_paimon_split_test QueryConfig.h
velox_hive_partition_function QueryConfig.h
velox_hive_partition_function_benchmark QueryConfig.h
velox_in_10_min_demo QueryConfig.h
velox_is_null_functions QueryConfig.h
velox_join_fuzzer QueryConfig.h
velox_key_encoder QueryConfig.h
velox_key_encoder_test QueryConfig.h
velox_like_benchmark QueryConfig.h
velox_like_tpch_benchmark QueryConfig.h
velox_mark_distinct_fuzzer QueryConfig.h
velox_mark_distinct_fuzzer_lib QueryConfig.h
velox_mark_sorted_benchmark QueryConfig.h
velox_memory_arbitration_fuzzer QueryConfig.h
velox_memory_test QueryConfig.h
velox_merge_benchmark QueryConfig.h
velox_orderby_benchmark QueryConfig.h
velox_overload_int_function_dynamic QueryConfig.h
velox_overload_varchar_function_dynamic QueryConfig.h
velox_overwrite_int_function_dynamic QueryConfig.h
velox_overwrite_varchar_function_dynamic QueryConfig.h
velox_parquet_e2e_filter_test QueryConfig.h
velox_parquet_writer_sink_test QueryConfig.h
velox_parquet_writer_test QueryConfig.h
velox_parse_parser QueryConfig.h
velox_parse_test QueryConfig.h
velox_parse_utils QueryConfig.h
velox_prefixsort_benchmark QueryConfig.h
velox_presto_types QueryConfig.h
velox_presto_types_fuzzer_utils QueryConfig.h
velox_presto_types_fuzzer_utils_test QueryConfig.h
velox_presto_types_test QueryConfig.h
velox_query_benchmark QueryConfig.h
velox_query_replayer QueryConfig.h
velox_query_trace_replayer_base QueryConfig.h
velox_re2_functions_benchmarks QueryConfig.h
velox_row_number_fuzzer QueryConfig.h
velox_row_number_fuzzer_lib QueryConfig.h
velox_rpc_function_stubs QueryConfig.h
velox_rpc_node_test QueryConfig.h
velox_rpc_operator QueryConfig.h
velox_rpc_operator_test QueryConfig.h
velox_rpc_plan_node_translator QueryConfig.h
velox_s3file_test QueryConfig.h
velox_s3insert_test QueryConfig.h
velox_s3metrics_test QueryConfig.h
velox_s3multiendpoints_test QueryConfig.h
velox_s3read_test QueryConfig.h
velox_s3registration_test QueryConfig.h
velox_simple_aggregate QueryConfig.h, SimpleAggregateAdapter.h
velox_simple_aggregate_test QueryConfig.h, SimpleAggregateAdapter.h
velox_sort_benchmark QueryConfig.h
velox_spark_function_registry_test QueryConfig.h
velox_spark_query_runner QueryConfig.h
velox_spark_query_runner_test QueryConfig.h
velox_spark_types QueryConfig.h
velox_spark_types_test QueryConfig.h
velox_spark_windows_test QueryConfig.h
velox_sparksql_benchmarks_cast QueryConfig.h
velox_sparksql_benchmarks_compare QueryConfig.h
velox_sparksql_benchmarks_from_json QueryConfig.h
velox_sparksql_benchmarks_get_funcs QueryConfig.h
velox_sparksql_benchmarks_hash QueryConfig.h
velox_sparksql_benchmarks_in QueryConfig.h
velox_sparksql_benchmarks_simd_compare QueryConfig.h
velox_sparksql_benchmarks_split QueryConfig.h
velox_sparksql_coverage QueryConfig.h
velox_spatial_join_benchmark QueryConfig.h
velox_spatial_join_fuzzer QueryConfig.h
velox_spill_fuzzer_base_lib QueryConfig.h
velox_spiller_aggregate_benchmark QueryConfig.h
velox_spiller_aggregate_benchmark_base QueryConfig.h
velox_spiller_join_benchmark QueryConfig.h
velox_spiller_join_benchmark_base QueryConfig.h
velox_streaming_aggregation_benchmark QueryConfig.h
velox_table_evolution_fuzzer_test QueryConfig.h
velox_text_reader_test QueryConfig.h
velox_text_writer_test QueryConfig.h
velox_tool_trace_test QueryConfig.h
velox_topn_row_number_fuzzer QueryConfig.h
velox_topn_row_number_fuzzer_lib QueryConfig.h
velox_tpcds_connector QueryConfig.h
velox_tpcds_connector_test QueryConfig.h
velox_tpch_benchmark QueryConfig.h
velox_tpch_benchmark_lib QueryConfig.h
velox_tpch_connector QueryConfig.h
velox_tpch_connector_test QueryConfig.h
velox_tpch_speed_test QueryConfig.h
velox_trace QueryConfig.h
velox_wave_benchmark QueryConfig.h
velox_wave_dwio QueryConfig.h
velox_wave_exec QueryConfig.h
velox_wave_exec_test QueryConfig.h
velox_wave_mock_file QueryConfig.h
velox_wave_mock_reader QueryConfig.h
velox_wave_stream QueryConfig.h
velox_window QueryConfig.h
velox_window_fuzzer QueryConfig.h
velox_window_fuzzer_test QueryConfig.h
velox_window_prefixsort_benchmark QueryConfig.h
velox_window_sub_partitioned_sort_benchmark QueryConfig.h
velox_windows_agg_test QueryConfig.h
velox_windows_rank_test QueryConfig.h
velox_windows_value_test QueryConfig.h
velox_writer_fuzzer QueryConfig.h
velox_writer_fuzzer_test QueryConfig.h

Transitively affected (111)

  • physical_size_aggregator_test
  • velox_abfs
  • velox_abfs_test
  • velox_benchmark_basic_vector_fuzzer
  • velox_benchmark_basic_vector_slice
  • velox_cache_fuzzer
  • velox_cache_fuzzer_lib
  • velox_common_compression_test
  • velox_common_geospatial_serde
  • velox_common_geospatial_serde_test
  • velox_common_test
  • velox_constrained_input_generators
  • velox_constrained_input_generators_test
  • velox_constrained_vector_generator
  • velox_constrained_vector_generator_test
  • velox_duckdb_conversion
  • velox_duckdb_parser
  • velox_dwio_arrow_parquet_writer_lib
  • velox_dwio_arrow_parquet_writer_test_lib
  • velox_dwio_arrow_parquet_writer_util_lib
  • velox_dwio_cache_test
  • velox_dwio_common_bitpack_decoder_benchmark
  • velox_dwio_common_compression
  • velox_dwio_common_data_buffer_benchmark
  • velox_dwio_common_int_decoder_benchmark
  • velox_dwio_dwrf_buffered_output_stream_test
  • velox_dwio_dwrf_byte_rle_encoder_test
  • velox_dwio_dwrf_byte_rle_test
  • velox_dwio_dwrf_checksum_test
  • velox_dwio_dwrf_column_reader_test
  • velox_dwio_dwrf_column_statistics_test
  • velox_dwio_dwrf_common
  • velox_dwio_dwrf_compression_test
  • velox_dwio_dwrf_config_test
  • velox_dwio_dwrf_data_buffer_holder_test
  • velox_dwio_dwrf_decompression_test
  • velox_dwio_dwrf_decryption_test
  • velox_dwio_dwrf_dictionary_encoder_test
  • velox_dwio_dwrf_dictionary_encoding_utils_test
  • velox_dwio_dwrf_encoding_selector_test
  • velox_dwio_dwrf_encryption_test
  • velox_dwio_dwrf_flush_policy_test
  • velox_dwio_dwrf_index_builder_test
  • velox_dwio_dwrf_int_direct_test
  • velox_dwio_dwrf_int_encoder_test
  • velox_dwio_dwrf_layout_planner_test
  • velox_dwio_dwrf_ratio_checker_test
  • velox_dwio_dwrf_reader_base_test
  • velox_dwio_dwrf_rle_test
  • velox_dwio_dwrf_rlev1_encoder_test
  • velox_dwio_dwrf_stream_labels_test
  • velox_dwio_dwrf_stripe_dictionary_cache_test
  • velox_dwio_dwrf_stripe_reader_base_test
  • velox_dwio_dwrf_stripe_stream_test
  • velox_dwio_dwrf_utils
  • velox_dwio_dwrf_utils_test
  • velox_dwio_dwrf_writer
  • velox_dwio_dwrf_writer_context_test
  • velox_dwio_dwrf_writer_encoding_manager_test
  • velox_dwio_dwrf_writer_sink_test
  • velox_dwio_dwrf_writer_test
  • velox_dwio_faulty_file_sink
  • velox_dwio_orc_column_statistics_test
  • velox_dwio_parquet_common
  • velox_dwio_parquet_common_test
  • velox_dwio_parquet_rlebp_decoder_test
  • velox_dwio_parquet_structure_decoder_benchmark
  • velox_dwio_parquet_structure_decoder_test
  • velox_dwio_parquet_thrift_test
  • velox_dwio_parquet_writer
  • velox_dwio_text_writer
  • velox_dwio_text_writer_register
  • velox_dwrf_column_writer_index_test
  • velox_dwrf_float_column_writer_benchmark
  • velox_dwrf_int_encoder_benchmark
  • velox_dwrf_statistics_builder_utils_test
  • velox_dwrf_writer_extended_test
  • velox_dwrf_writer_flush_test
  • velox_exec_prefixsort_test
  • velox_expression_fuzzer_test_utility
  • velox_filemetadata_test
  • velox_functions_geo
  • velox_functions_iceberg_hash
  • velox_functions_json
  • velox_functions_json_test
  • velox_functions_spark_window
  • velox_gcs_file_test
  • velox_gcsfile_example
  • velox_hdfs
  • velox_hive_paimon_data_file_meta_test
  • velox_hive_paimon_deletion_file_test
  • velox_hive_paimon_row_kind_test
  • velox_orderby_benchmark_util
  • velox_parse_expression
  • velox_presto_type_parser_test
  • velox_prestosql_coverage
  • velox_read_benchmark
  • velox_row_serializer_benchmark
  • velox_row_test
  • velox_s3config_test
  • velox_s3finalize_test
  • velox_s3fs
  • velox_serializer_benchmark
  • velox_serializer_test_group0
  • velox_test_util_test
  • velox_trace_file_tool
  • velox_trace_file_tool_base
  • velox_unsafe_row_serialize_benchmark
  • velox_vector_fuzzer
  • velox_vector_fuzzer_test
  • velox_vector_test

Fast path • Graph from main@4a966b2effd240f9d9f43e0b9305c0bdd7cd6b39
