build: Use FBThrift instead of Apache Thrift (#16019) #16019

Open
peterenescu wants to merge 5 commits into facebookincubator:main from peterenescu:export-D90723225

@peterenescu
Contributor

peterenescu commented Jan 14, 2026

Summary:

Removes the external Apache Thrift dependency, which is used primarily by Parquet, as noted in #13175.

The incompatibilities that have to be handled when moving from Apache Thrift to FBThrift are noted below.

FBThrift and Apache Thrift have many incompatible API differences:

  • XXX.__isset.YYY -> XXX.YYY().has_value() (or bool(XXX.YYY())): The latter implicit bool(...) syntax is used in this PR
  • XXX.__isset.YYY = true -> XXX.YYY().ensure()
  • XXX.YYY -> XXX.YYY().value() (or *XXX.YYY()): The latter *... syntax is used in this PR
  • XXX.__set_YYY(value) -> XXX.YYY() = value
  • thrift::ThriftTransport/TCompactProtocolT -> folly::IOBuf/CompactProtocolReader
  • FBThrift doesn't support combining optional with a default value, which DataPageHeaderV2.is_compressed uses: We need to apply the default value manually in our code
  • FBThrift raises an exception when we access an optional field that isn't set (Apache Thrift has no such check)
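
To make the accessor mapping above concrete, here is a minimal, self-contained sketch. It does not use FBThrift: `OptionalFieldSketch` and `DataPageHeaderV2Sketch` are hypothetical stand-ins built on `std::optional` that only mimic the calls listed above (`has_value()`, `bool(...)`, `*...`, `ensure()`, assignment), and `isCompressedOrDefault` is an assumed helper name, not a function from this PR. The default of `true` is the one parquet.thrift declares for `DataPageHeaderV2.is_compressed`.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for an FBThrift-style optional field accessor.
// It is NOT the FBThrift class; it only mirrors the calls listed above.
template <typename T>
class OptionalFieldSketch {
 public:
  explicit OptionalFieldSketch(std::optional<T>& slot) : slot_(slot) {}

  // XXX.__isset.YYY  ->  XXX.YYY().has_value() or bool(XXX.YYY())
  bool has_value() const { return slot_.has_value(); }
  explicit operator bool() const { return has_value(); }

  // XXX.YYY  ->  XXX.YYY().value() or *XXX.YYY()
  // Throws when the field is unset, as described in the last bullet.
  T& value() const {
    if (!slot_) {
      throw std::runtime_error("optional field is not set");
    }
    return *slot_;
  }
  T& operator*() const { return value(); }

  // XXX.__isset.YYY = true  ->  XXX.YYY().ensure()
  T& ensure() const {
    if (!slot_) {
      slot_.emplace();
    }
    return *slot_;
  }

  // XXX.__set_YYY(value)  ->  XXX.YYY() = value
  void operator=(T v) const { slot_ = std::move(v); }

 private:
  std::optional<T>& slot_;
};

// Hypothetical struct with one optional field, shaped like a generated type.
struct DataPageHeaderV2Sketch {
  OptionalFieldSketch<bool> is_compressed() {
    return OptionalFieldSketch<bool>(isCompressed_);
  }

 private:
  std::optional<bool> isCompressed_;
};

// Applies the schema default by hand when the field is unset, instead of
// dereferencing an unset field (which would throw in this sketch).
bool isCompressedOrDefault(DataPageHeaderV2Sketch& header) {
  auto field = header.is_compressed();
  return field ? *field : true;
}
```

With this shape, a caller checks the field before dereferencing it, and the hand-written default keeps the behavior that readers previously got from the schema default.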

There are many changes in this PR but there is no logic change and no refactoring. This just converts API usage for FBThrift.

(Copy of #14942 from kou due to some push permission issues.)

Differential Revision: D90723225

@netlify

netlify bot commented Jan 14, 2026

Deploy Preview for meta-velox canceled.

| Name | Link |
|---|---|
| 🔨 Latest commit | d8d9201 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/meta-velox/deploys/69d020e4d138f800083dd27b |

meta-cla bot added the CLA Signed label Jan 14, 2026
@meta-codesync

meta-codesync bot commented Jan 14, 2026

@peterenescu has exported this pull request. If you are a Meta employee, you can view the originating Diff in D90723225.

Collaborator

czentgr left a comment

Can you please explain the rationale a bit more? Looks like we want Arrow to link to fbthrift instead of apache thrift. I suppose fbthrift is compatible with apache thrift for arrow usage?

VELOX_BUILD_SHARED: "ON"
VELOX_ARROW_CMAKE_PATCH: ${{ github.workspace }}/velox/CMake/resolve_dependency_modules/arrow/cmake-compatibility.patch
run: |
if git diff --name-only HEAD^1 HEAD | grep -q "scripts/setup-"; then
Collaborator

Make sure this is only triggered on a pull_request.
Similar to #15896

# https://github.com/facebook/folly/issues/1666 for it.
install_boost
install_fmt
install_folly
Collaborator

The setup script for macOS also installs fast_float just like for Linux. So you probably need that here too.

| wangle | v2025.04.28.00 | No ||
| mvfst | v2025.04.28.00 | No ||
| fbthrift | v2025.04.28.00 | No ||
| folly | v2025.09.15.00 | Yes ||
Collaborator

I'm trying to upgrade folly to a later version (2026.01.05.00), so we will have a conflict here. The only issue for now is that velox_remote_functions_client_test is failing. I'm investigating why; it shouldn't fail. See #15967

Why did you pick this version (or just because that was the latest at the time you started)?

@majetideepak
Collaborator

Is this similar to #14942?
I believe the Gluten team saw a regression
See #14942 (comment)

peterenescu force-pushed the export-D90723225 branch 2 times, most recently from 02cb3d7 to 3d90ea7 on March 16, 2026 18:49
@rui-mo
Collaborator

rui-mo commented Mar 17, 2026

@czentgr @peterenescu I wasn’t able to reproduce this issue using a pure scan, but could always reproduce it when running TPC-DS q4. I wonder if you have a way to test the TPC-DS queries. I can upload my test data for reproduction if needed. I assume further tests will be needed to verify the functionality of this PR, as merging it without sufficient tests could be risky.
Here is the stack:

0# std::_Function_handler<std::unique_ptr<folly::IOBuf, std::default_delete<folly::IOBuf> > (unsigned char const*, int, int, int), facebook::velox::parquet::thrift::deserialize<facebook::velox::parquet::thrift::PageHeader>(facebook::velox::parquet::thrift::PageHeader*, facebook::velox::dwio::common::SeekableInputStream*, unsigned char const*, unsigned long)::{lambda(unsigned char const*, int, int, int)#2}>::_M_invoke(std::_Any_data const&, unsigned char const*&&, int&&, int&&, int&&) in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
 1# apache::thrift::ProtocolReaderWithRefill<apache::thrift::CompactProtocolReader>::ensureBuffer(unsigned int) in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
 2# 0x00007F94E875F7AA in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
 3# 0x00007F94E876732B in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
 4# facebook::velox::parquet::thrift::DeserializeResult facebook::velox::parquet::thrift::deserialize<facebook::velox::parquet::thrift::PageHeader>(facebook::velox::parquet::thrift::PageHeader*, facebook::velox::dwio::common::SeekableInputStream*, unsigned char const*, unsigned long) in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
 5# facebook::velox::parquet::PageReader::readPageHeader() in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
 6# facebook::velox::parquet::PageReader::seekToPage(long) in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
 7# facebook::velox::parquet::PageReader::rowsForPage(facebook::velox::dwio::common::SelectiveColumnReader&, bool, bool, folly::Range<int const*>&, unsigned long const*&) in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
 8# void facebook::velox::parquet::PageReader::readWithVisitor<facebook::velox::dwio::common::ColumnVisitor<long, facebook::velox::common::AlwaysTrue, facebook::velox::dwio::common::ExtractToReader, false, true> >(facebook::velox::dwio::common::ColumnVisitor<long, facebook::velox::common::AlwaysTrue, facebook::velox::dwio::common::ExtractToReader, false, true>&) in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
 9# void facebook::velox::dwio::common::SelectiveIntegerColumnReader::readHelper<facebook::velox::parquet::IntegerColumnReader, facebook::velox::common::AlwaysTrue, false, facebook::velox::dwio::common::ExtractToReader>(facebook::velox::common::Filter const*, folly::Range<int const*> const&, facebook::velox::dwio::common::ExtractToReader) in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
10# facebook::velox::parquet::IntegerColumnReader::read(long, folly::Range<int const*> const&, unsigned long const*) in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
11# 0x00007F94EA848678 in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
12# facebook::velox::dwio::common::ColumnLoader::loadInternal(folly::Range<int const*>, facebook::velox::ValueHook*, int, std::shared_ptr<facebook::velox::BaseVector>*) in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
13# facebook::velox::VectorLoader::load(folly::Range<int const*>, facebook::velox::ValueHook*, int, std::shared_ptr<facebook::velox::BaseVector>*) in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
14# facebook::velox::LazyVector::loadVectorInternal() const in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
15# facebook::velox::LazyVector::loadedVector() in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so
16# gluten::WholeStageResultIterator::next() in /data/tmp/gluten-05c58574-9947-4add-be43-af118e8b3732/jni/993edfeb-984e-49d2-92ec-175011efbd06/gluten-6436771930786767068/linux/amd64/libvelox.so

@rui-mo
Collaborator

rui-mo commented Mar 17, 2026

@czentgr @peterenescu This issue doesn’t appear to be a corner case. When I tested the first 10 TPC-DS queries, Q4, Q5, and Q9 all failed with the same core dump, while the others haven’t been verified yet. I’m still working on finding a simpler query to reproduce the issue. It would also be very helpful if anyone could help verify the TPC-DS queries. Thanks.

@czentgr
Collaborator

czentgr commented Mar 17, 2026

@rui-mo Thanks, it is interesting because it is again in the page header seeking logic. It must be hitting another edge case.
Is this coming not from a plain table scan but from an intermediate result written to disk? I assume you already tried reading the store_sales table, since it comes up in all the failing queries (but also in other queries).

@rui-mo
Collaborator

rui-mo commented Mar 17, 2026

Hi @czentgr @peterenescu, I’m finally able to reproduce this issue in Velox using a simple query plan. Would you like to give it a try?
Unit test: https://github.com/rui-mo/velox/commit/093c6d463322319e6a60c6e3c9407140d9897e5a
Test file: https://drive.google.com/file/d/1Rdy_RjNTUWtoh9iHwYZ3eQPRgqUjyFw0/view?usp=share_link

BTW, it appears that removing project and partialAggregation from the unit test still reproduces the core dump.

@@ -118,6 +117,7 @@ jobs:
mkdir /tmp/build
cd /tmp/build
source /opt/rh/gcc-toolset-12/enable
export VELOX_ARROW_CMAKE_PATCH="${GITHUB_WORKSPACE}/CMake/resolve_dependency_modules/arrow/arrow-testing-boost.patch ${GITHUB_WORKSPACE}/CMake/resolve_dependency_modules/arrow/cmake-compatibility.patch"
Collaborator

Arrow was updated from Arrow 15 to Arrow 18. I see some issues with dependencies of Arrow for testing and boost in the CI. Looks like this patch was supposedly fixing this?

Contributor

Oh, sorry. I missed the upgrade.

Yes. It's for fixing arrow-testing and Boost, but it's still incomplete.
I tried fixing it with the patch when I had time, but I couldn't fix it entirely. I will retry when I have time again, but I don't have enough time for now. Sorry. If the patch causes any trouble, you can remove it. I may re-add something when I retry.

Collaborator

@kou Yes, we will have to review this.

@czentgr
Collaborator

czentgr commented Mar 23, 2026

@peterenescu @rui-mo I've been working on a fix for the issue with the Parquet reading part.
@rui-mo Thanks again for the simple repro. What scale factor are you running?

I have a fix for this. I ran a Presto TPCDS SF1k workload and it passed successfully.
It needs some cleanup though to get it ready.

@rui-mo
Collaborator

rui-mo commented Mar 24, 2026

@czentgr Thanks for verifying TPC-DS. I noticed this issue on a smaller dataset (SF500).

@meta-codesync

meta-codesync bot commented Mar 25, 2026

@peterenescu has imported this pull request. If you are a Meta employee, you can view this in D90723225.

@czentgr
Collaborator

czentgr commented Mar 25, 2026

@rui-mo Would you be able to do another run? I ran on the SF1k and had no problems. I will re-run with SF10k TPCDS.

@rui-mo
Collaborator

rui-mo commented Mar 26, 2026

Thanks @czentgr for the fix! I’ll run another test to verify it. By the way, I noticed another potential incompatibility between FBThrift and Apache Thrift (see #13175 (comment)) and wonder if there might be other issues.

@majetideepak
Collaborator

majetideepak commented Mar 27, 2026

@rui-mo There should not be any compatibility issues between fbthrift and thrift since both are expected to have the same on-the-wire byte order (big endian). If at all there is a difference, it's a bug which we can fix. Not moving to fbthrift is a blocker for multiple other Velox improvements.
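
As a side illustration of what "byte order on the wire" means for a multi-byte value, the standalone sketch below serializes a double both ways and shows that a reader expecting one order misreads bytes written in the other. It contains no Thrift code and does not show which order either library uses for any given type; the function names are hypothetical and exist only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies the raw IEEE-754 bit pattern of a double into a 64-bit integer.
std::uint64_t bitsOf(double v) {
  std::uint64_t bits = 0;
  std::memcpy(&bits, &v, sizeof(bits));
  return bits;
}

// Writes the 8 bytes most significant byte first.
std::array<std::uint8_t, 8> encodeBigEndian(double v) {
  const std::uint64_t bits = bitsOf(v);
  std::array<std::uint8_t, 8> out{};
  for (std::size_t i = 0; i < 8; ++i) {
    out[i] = static_cast<std::uint8_t>(bits >> (56 - 8 * i));
  }
  return out;
}

// Writes the 8 bytes least significant byte first.
std::array<std::uint8_t, 8> encodeLittleEndian(double v) {
  const std::uint64_t bits = bitsOf(v);
  std::array<std::uint8_t, 8> out{};
  for (std::size_t i = 0; i < 8; ++i) {
    out[i] = static_cast<std::uint8_t>(bits >> (8 * i));
  }
  return out;
}

// Reads 8 bytes assuming the most significant byte comes first.
double decodeBigEndian(const std::array<std::uint8_t, 8>& in) {
  std::uint64_t bits = 0;
  for (std::size_t i = 0; i < 8; ++i) {
    bits = (bits << 8) | in[i];
  }
  double v = 0;
  std::memcpy(&v, &bits, sizeof(v));
  return v;
}
```

A round trip through matching encode and decode functions returns the original value, while decoding little-endian bytes with the big-endian reader does not. A mismatch of this kind would surface as wrong values rather than as a parse error.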

@rui-mo
Collaborator

rui-mo commented Mar 27, 2026

@czentgr I verified the previously failed queries as well as all TPC-DS queries (SF500) on my end using your fix, and they are now working correctly. Thank you again for your fix!

@github-actions

github-actions bot commented Apr 1, 2026

Build Impact Analysis

Selective Build Targets (building these covers all 54 affected)

cmake --build _build/release --target spark_aggregation_fuzzer_test velox_dwio_arrow_parquet_writer_test velox_dwio_parquet_common_test velox_dwio_parquet_page_reader_test velox_dwio_parquet_reader_benchmark velox_dwio_parquet_reader_test velox_dwio_parquet_rlebp_decoder_test velox_dwio_parquet_structure_decoder_benchmark velox_dwio_parquet_structure_decoder_test velox_dwio_parquet_table_scan_test velox_dwio_parquet_tpch_test velox_exec_SpatialJoinTest velox_exec_test_group0 velox_exec_test_group1 velox_exec_test_group2 velox_exec_test_group3 velox_exec_test_group4 velox_exec_test_group5 velox_exec_test_group6 velox_exec_test_group7 velox_gcs_insert_test velox_gcs_multiendpoints_test velox_hdfs_insert_test velox_hive_connector_test velox_hive_iceberg_insert_test velox_hive_iceberg_test velox_parquet_e2e_filter_test velox_parquet_writer_sink_test velox_parquet_writer_test velox_query_replayer velox_s3insert_test velox_s3multiendpoints_test velox_s3read_test velox_s3registration_test velox_sort_benchmark velox_spark_query_runner_test velox_tool_trace_test velox_tpch_benchmark velox_wave_benchmark velox_wave_exec_test

Total affected: 54/555 targets

Warning: 14 file(s) could not be mapped to any target. A full build may be needed.

  • velox/common/dynamic_registry/tests/CMakeLists.txt
  • velox/dwio/parquet/common/CMakeLists.txt
  • velox/dwio/parquet/reader/CMakeLists.txt
  • velox/dwio/parquet/tests/CMakeLists.txt
  • velox/dwio/parquet/tests/common/CMakeLists.txt
  • velox/dwio/parquet/tests/thrift/CMakeLists.txt
  • velox/dwio/parquet/tests/thrift/ThriftTransportTest.cpp
  • velox/dwio/parquet/thrift/CMakeLists.txt
  • velox/dwio/parquet/thrift/ParquetThriftTypes.cpp
  • velox/dwio/parquet/thrift/ParquetThriftTypes.h
  • ... and 4 more
Affected targets (54)

Directly changed (21)

| Target | Changed Files |
|---|---|
| velox_dwio_arrow_parquet_writer | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_dwio_arrow_parquet_writer_lib | ArrowSchema.cpp, ColumnWriter.cpp, CompactV1ProtocolReaderWithRefill.h, Metadata.cpp, Metadata.h, ... (+7 more) |
| velox_dwio_arrow_parquet_writer_test | CompactV1ProtocolReaderWithRefill.h, FileDeserializeTest.cpp, Metadata.h, MetadataTest.cpp, PageIndexTest.cpp, ... (+4 more) |
| velox_dwio_arrow_parquet_writer_test_lib | BloomFilter.cpp, ColumnReader.cpp, CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, ... (+2 more) |
| velox_dwio_native_parquet_reader | CompactV1ProtocolReaderWithRefill.h, IntegerColumnReader.h, Metadata.cpp, PageReader.cpp, PageReader.h, ... (+9 more) |
| velox_dwio_parquet_common | BloomFilter.cpp, CompactV1ProtocolReaderWithRefill.h, ParquetThrift.h |
| velox_dwio_parquet_page_reader_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetPageReaderTest.cpp, ParquetThrift.h, ... (+2 more) |
| velox_dwio_parquet_reader_benchmark | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_dwio_parquet_reader_benchmark_lib | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_dwio_parquet_reader_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetStatsContext.h, ParquetThrift.h, ... (+3 more) |
| velox_dwio_parquet_table_scan_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetThrift.h, ParquetTypeWithId.h, ... (+1 more) |
| velox_dwio_parquet_thrift_raw | parquet.thrift |
| velox_dwio_parquet_writer | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_exec_test_group1 | HashJoinTest.cpp |
| velox_hive_connector_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_hive_iceberg_splitreader | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_parquet_e2e_filter_test | CompactV1ProtocolReaderWithRefill.h, E2EFilterTest.cpp, Metadata.h, ParquetThrift.h, ParquetTypeWithId.h, ... (+1 more) |
| velox_parquet_writer_sink_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetThrift.h, ParquetTypeWithId.h, ... (+1 more) |
| velox_parquet_writer_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetThrift.h, ParquetTypeWithId.h, ... (+2 more) |
| velox_spark_query_runner | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_wave_benchmark | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |

Transitively affected (33)

  • spark_aggregation_fuzzer_test
  • velox_dwio_parquet_common_test
  • velox_dwio_parquet_reader
  • velox_dwio_parquet_rlebp_decoder_test
  • velox_dwio_parquet_structure_decoder_benchmark
  • velox_dwio_parquet_structure_decoder_test
  • velox_dwio_parquet_tpch_test
  • velox_exec_SpatialJoinTest
  • velox_exec_test_group0
  • velox_exec_test_group2
  • velox_exec_test_group3
  • velox_exec_test_group4
  • velox_exec_test_group5
  • velox_exec_test_group6
  • velox_exec_test_group7
  • velox_gcs_insert_test
  • velox_gcs_multiendpoints_test
  • velox_hdfs_insert_test
  • velox_hive_iceberg_insert_test
  • velox_hive_iceberg_test
  • velox_query_benchmark
  • velox_query_replayer
  • velox_query_trace_replayer_base
  • velox_s3insert_test
  • velox_s3multiendpoints_test
  • velox_s3read_test
  • velox_s3registration_test
  • velox_sort_benchmark
  • velox_spark_query_runner_test
  • velox_tool_trace_test
  • velox_tpch_benchmark
  • velox_tpch_benchmark_lib
  • velox_wave_exec_test

Slow path • Graph generated from PR branch

@github-actions

github-actions bot commented Apr 2, 2026

Build Impact Analysis

Selective Build Targets (building these covers all 54 affected)

cmake --build _build/release --target spark_aggregation_fuzzer_test velox_dwio_arrow_parquet_writer_test velox_dwio_parquet_common_test velox_dwio_parquet_page_reader_test velox_dwio_parquet_reader_benchmark velox_dwio_parquet_reader_test velox_dwio_parquet_rlebp_decoder_test velox_dwio_parquet_structure_decoder_benchmark velox_dwio_parquet_structure_decoder_test velox_dwio_parquet_table_scan_test velox_dwio_parquet_tpch_test velox_exec_SpatialJoinTest velox_exec_test_group0 velox_exec_test_group1 velox_exec_test_group2 velox_exec_test_group3 velox_exec_test_group4 velox_exec_test_group5 velox_exec_test_group6 velox_exec_test_group7 velox_gcs_insert_test velox_gcs_multiendpoints_test velox_hdfs_insert_test velox_hive_connector_test velox_hive_iceberg_insert_test velox_hive_iceberg_test velox_parquet_e2e_filter_test velox_parquet_writer_sink_test velox_parquet_writer_test velox_query_replayer velox_s3insert_test velox_s3multiendpoints_test velox_s3read_test velox_s3registration_test velox_sort_benchmark velox_spark_query_runner_test velox_tool_trace_test velox_tpch_benchmark velox_wave_benchmark velox_wave_exec_test

Total affected: 54/555 targets

Warning: 14 file(s) could not be mapped to any target. A full build may be needed.

  • velox/common/dynamic_registry/tests/CMakeLists.txt
  • velox/dwio/parquet/common/CMakeLists.txt
  • velox/dwio/parquet/reader/CMakeLists.txt
  • velox/dwio/parquet/tests/CMakeLists.txt
  • velox/dwio/parquet/tests/common/CMakeLists.txt
  • velox/dwio/parquet/tests/thrift/CMakeLists.txt
  • velox/dwio/parquet/tests/thrift/ThriftTransportTest.cpp
  • velox/dwio/parquet/thrift/CMakeLists.txt
  • velox/dwio/parquet/thrift/ParquetThriftTypes.cpp
  • velox/dwio/parquet/thrift/ParquetThriftTypes.h
  • ... and 4 more

Affected targets (54)

Directly changed (21)

| Target | Changed Files |
| --- | --- |
| velox_dwio_arrow_parquet_writer | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_dwio_arrow_parquet_writer_lib | ArrowSchema.cpp, ColumnWriter.cpp, CompactV1ProtocolReaderWithRefill.h, Metadata.cpp, Metadata.h, ... (+7 more) |
| velox_dwio_arrow_parquet_writer_test | CompactV1ProtocolReaderWithRefill.h, FileDeserializeTest.cpp, Metadata.h, MetadataTest.cpp, PageIndexTest.cpp, ... (+4 more) |
| velox_dwio_arrow_parquet_writer_test_lib | BloomFilter.cpp, ColumnReader.cpp, CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, ... (+2 more) |
| velox_dwio_native_parquet_reader | CompactV1ProtocolReaderWithRefill.h, IntegerColumnReader.h, Metadata.cpp, PageReader.cpp, PageReader.h, ... (+9 more) |
| velox_dwio_parquet_common | BloomFilter.cpp, CompactV1ProtocolReaderWithRefill.h, ParquetThrift.h |
| velox_dwio_parquet_page_reader_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetPageReaderTest.cpp, ParquetThrift.h, ... (+2 more) |
| velox_dwio_parquet_reader_benchmark | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_dwio_parquet_reader_benchmark_lib | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_dwio_parquet_reader_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetReaderTest.cpp, ParquetStatsContext.h, ... (+4 more) |
| velox_dwio_parquet_table_scan_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetThrift.h, ParquetTypeWithId.h, ... (+1 more) |
| velox_dwio_parquet_thrift_raw | parquet.thrift |
| velox_dwio_parquet_writer | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_exec_test_group1 | HashJoinTest.cpp |
| velox_hive_connector_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_hive_iceberg_splitreader | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_parquet_e2e_filter_test | CompactV1ProtocolReaderWithRefill.h, E2EFilterTest.cpp, Metadata.h, ParquetThrift.h, ParquetTypeWithId.h, ... (+1 more) |
| velox_parquet_writer_sink_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetThrift.h, ParquetTypeWithId.h, ... (+1 more) |
| velox_parquet_writer_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetThrift.h, ParquetTypeWithId.h, ... (+2 more) |
| velox_spark_query_runner | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_wave_benchmark | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |

Transitively affected (33)

  • spark_aggregation_fuzzer_test
  • velox_dwio_parquet_common_test
  • velox_dwio_parquet_reader
  • velox_dwio_parquet_rlebp_decoder_test
  • velox_dwio_parquet_structure_decoder_benchmark
  • velox_dwio_parquet_structure_decoder_test
  • velox_dwio_parquet_tpch_test
  • velox_exec_SpatialJoinTest
  • velox_exec_test_group0
  • velox_exec_test_group2
  • velox_exec_test_group3
  • velox_exec_test_group4
  • velox_exec_test_group5
  • velox_exec_test_group6
  • velox_exec_test_group7
  • velox_gcs_insert_test
  • velox_gcs_multiendpoints_test
  • velox_hdfs_insert_test
  • velox_hive_iceberg_insert_test
  • velox_hive_iceberg_test
  • velox_query_benchmark
  • velox_query_replayer
  • velox_query_trace_replayer_base
  • velox_s3insert_test
  • velox_s3multiendpoints_test
  • velox_s3read_test
  • velox_s3registration_test
  • velox_sort_benchmark
  • velox_spark_query_runner_test
  • velox_tool_trace_test
  • velox_tpch_benchmark
  • velox_tpch_benchmark_lib
  • velox_wave_exec_test

Slow path • Graph generated from PR branch

@github-actions

github-actions bot commented Apr 3, 2026

Build Impact Analysis

Selective Build Targets (building these covers all 54 affected)

cmake --build _build/release --target spark_aggregation_fuzzer_test velox_dwio_arrow_parquet_writer_test velox_dwio_parquet_common_test velox_dwio_parquet_page_reader_test velox_dwio_parquet_reader_benchmark velox_dwio_parquet_reader_test velox_dwio_parquet_rlebp_decoder_test velox_dwio_parquet_structure_decoder_benchmark velox_dwio_parquet_structure_decoder_test velox_dwio_parquet_table_scan_test velox_dwio_parquet_tpch_test velox_exec_SpatialJoinTest velox_exec_test_group0 velox_exec_test_group1 velox_exec_test_group2 velox_exec_test_group3 velox_exec_test_group4 velox_exec_test_group5 velox_exec_test_group6 velox_exec_test_group7 velox_gcs_insert_test velox_gcs_multiendpoints_test velox_hdfs_insert_test velox_hive_connector_test velox_hive_iceberg_insert_test velox_hive_iceberg_test velox_parquet_e2e_filter_test velox_parquet_writer_sink_test velox_parquet_writer_test velox_query_replayer velox_s3insert_test velox_s3multiendpoints_test velox_s3read_test velox_s3registration_test velox_sort_benchmark velox_spark_query_runner_test velox_tool_trace_test velox_tpch_benchmark velox_wave_benchmark velox_wave_exec_test

Total affected: 54/556 targets

Warning: 14 file(s) could not be mapped to any target. A full build may be needed.

  • velox/common/dynamic_registry/tests/CMakeLists.txt
  • velox/dwio/parquet/common/CMakeLists.txt
  • velox/dwio/parquet/reader/CMakeLists.txt
  • velox/dwio/parquet/tests/CMakeLists.txt
  • velox/dwio/parquet/tests/common/CMakeLists.txt
  • velox/dwio/parquet/tests/thrift/CMakeLists.txt
  • velox/dwio/parquet/tests/thrift/ThriftTransportTest.cpp
  • velox/dwio/parquet/thrift/CMakeLists.txt
  • velox/dwio/parquet/thrift/ParquetThriftTypes.cpp
  • velox/dwio/parquet/thrift/ParquetThriftTypes.h
  • ... and 4 more

Affected targets (54)

Directly changed (20)

| Target | Changed Files |
| --- | --- |
| velox_dwio_arrow_parquet_writer | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_dwio_arrow_parquet_writer_lib | ArrowSchema.cpp, ColumnWriter.cpp, CompactV1ProtocolReaderWithRefill.h, Metadata.cpp, Metadata.h, ... (+7 more) |
| velox_dwio_arrow_parquet_writer_test | CompactV1ProtocolReaderWithRefill.h, FileDeserializeTest.cpp, Metadata.h, MetadataTest.cpp, PageIndexTest.cpp, ... (+4 more) |
| velox_dwio_arrow_parquet_writer_test_lib | BloomFilter.cpp, ColumnReader.cpp, CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, ... (+2 more) |
| velox_dwio_native_parquet_reader | CompactV1ProtocolReaderWithRefill.h, IntegerColumnReader.h, Metadata.cpp, PageReader.cpp, PageReader.h, ... (+9 more) |
| velox_dwio_parquet_common | BloomFilter.cpp, CompactV1ProtocolReaderWithRefill.h, ParquetThrift.h |
| velox_dwio_parquet_page_reader_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetPageReaderTest.cpp, ParquetThrift.h, ... (+2 more) |
| velox_dwio_parquet_reader_benchmark | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_dwio_parquet_reader_benchmark_lib | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_dwio_parquet_reader_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetReaderTest.cpp, ParquetStatsContext.h, ... (+4 more) |
| velox_dwio_parquet_table_scan_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetThrift.h, ParquetTypeWithId.h, ... (+1 more) |
| velox_dwio_parquet_thrift_raw | parquet.thrift |
| velox_dwio_parquet_writer | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_hive_connector_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_hive_iceberg_splitreader | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_parquet_e2e_filter_test | CompactV1ProtocolReaderWithRefill.h, E2EFilterTest.cpp, Metadata.h, ParquetThrift.h, ParquetTypeWithId.h, ... (+1 more) |
| velox_parquet_writer_sink_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetThrift.h, ParquetTypeWithId.h, ... (+1 more) |
| velox_parquet_writer_test | CompactV1ProtocolReaderWithRefill.h, Metadata.h, PageReader.h, ParquetThrift.h, ParquetTypeWithId.h, ... (+2 more) |
| velox_spark_query_runner | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |
| velox_wave_benchmark | CompactV1ProtocolReaderWithRefill.h, Metadata.h, ParquetThrift.h, Types.h |

Transitively affected (34)

  • spark_aggregation_fuzzer_test
  • velox_dwio_parquet_common_test
  • velox_dwio_parquet_reader
  • velox_dwio_parquet_rlebp_decoder_test
  • velox_dwio_parquet_structure_decoder_benchmark
  • velox_dwio_parquet_structure_decoder_test
  • velox_dwio_parquet_tpch_test
  • velox_exec_SpatialJoinTest
  • velox_exec_test_group0
  • velox_exec_test_group1
  • velox_exec_test_group2
  • velox_exec_test_group3
  • velox_exec_test_group4
  • velox_exec_test_group5
  • velox_exec_test_group6
  • velox_exec_test_group7
  • velox_gcs_insert_test
  • velox_gcs_multiendpoints_test
  • velox_hdfs_insert_test
  • velox_hive_iceberg_insert_test
  • velox_hive_iceberg_test
  • velox_query_benchmark
  • velox_query_replayer
  • velox_query_trace_replayer_base
  • velox_s3insert_test
  • velox_s3multiendpoints_test
  • velox_s3read_test
  • velox_s3registration_test
  • velox_sort_benchmark
  • velox_spark_query_runner_test
  • velox_tool_trace_test
  • velox_tpch_benchmark
  • velox_tpch_benchmark_lib
  • velox_wave_exec_test

Slow path • Graph generated from PR branch

mblanco-denodo and others added 5 commits April 3, 2026 13:01
…or#16456)

Summary:
In some cases the file_offset might be 0 if it is relative to the end of PAR1.

Tests:
Added a new test in ParquetReader.cpp that covers the changes in both ParquetReader::filterRowGroup and ParquetData::getRowGroupRegion. Run all tests in velox/dwio.

Pull Request resolved: facebookincubator#16456

Reviewed By: kKPulla

Differential Revision: D98980070

Pulled By: bikramSingh91

fbshipit-source-id: 58d8aaaad3cd248540d40ae103b4b02dbd969ff7
The original ThriftStreamingTransport used memcpy to copy data into contiguous buffers. The new CompactProtocolReaderWithRefill uses IOBuf chains, which can be non-contiguous, causing:

- Data corruption when the Thrift reader crossed buffer boundaries
- Incorrect byte counting for stream position tracking
- Buffer management issues after reading page headers

Solution Implemented
1. Buffer Coalescing (ParquetThrift.h)

- Modified refiller to create a single contiguous buffer by copying:
   - Unconsumed bytes from current buffer
   - New data from stream
- This mimics the original memcpy behavior, ensuring Thrift always reads from contiguous memory
- Prevents data corruption when deserializing structures that span buffer boundaries
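
The coalescing step described above could look roughly like the sketch below. It is an illustration only: it uses `std::vector` instead of `folly::IOBuf` so that it stays self-contained, and `coalesce`, `current`, `consumed`, and `fresh` are hypothetical names, not identifiers from ParquetThrift.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the PR's refiller): builds one contiguous buffer
// holding the unconsumed tail of `current` followed by all of `fresh`.
inline std::vector<uint8_t> coalesce(
    const std::vector<uint8_t>& current,
    size_t consumed,
    const std::vector<uint8_t>& fresh) {
  // The reader cannot have consumed more bytes than the buffer holds.
  assert(consumed <= current.size());

  std::vector<uint8_t> out;
  out.reserve(current.size() - consumed + fresh.size());

  // 1. Unconsumed bytes from the current buffer.
  out.insert(
      out.end(),
      current.begin() + static_cast<std::ptrdiff_t>(consumed),
      current.end());

  // 2. New data from the stream.
  out.insert(out.end(), fresh.begin(), fresh.end());
  return out;
}
```

With a result like this, a struct that began in the old buffer and ends in the new data is decoded from a single contiguous range, which is the property the memcpy-based transport provided.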

2. Correct Byte Counting (ParquetThrift.h)

- Track totalBytesReadBeforeRefill (bytes consumed before refiller was called)
- Save coalescedBufferStart and coalescedBufferSize before cloning
- Calculate bytesConsumedFromCoalesced (position in coalesced buffer)
- Total stream bytes = totalBytesReadBeforeRefill + bytesConsumedFromCoalesced
- This gives accurate readBytes for calculating pageDataStart_
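
The byte-counting arithmetic above can be sketched as follows. The field names are taken from the bullets, but the `RefillAccounting` struct, the `totalStreamBytes` function, and the `cursor` parameter are assumptions made for illustration and may not match how ParquetThrift.h stores these values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bookkeeping captured when the refiller runs.
struct RefillAccounting {
  size_t totalBytesReadBeforeRefill;   // bytes consumed before the refill
  const uint8_t* coalescedBufferStart; // start of the coalesced buffer
  size_t coalescedBufferSize;          // length of the coalesced buffer
};

// Returns the total stream bytes consumed, given the reader's current
// position (`cursor`) inside the coalesced buffer.
inline size_t totalStreamBytes(
    const RefillAccounting& acc,
    const uint8_t* cursor) {
  assert(cursor >= acc.coalescedBufferStart);
  const size_t bytesConsumedFromCoalesced =
      static_cast<size_t>(cursor - acc.coalescedBufferStart);
  assert(bytesConsumedFromCoalesced <= acc.coalescedBufferSize);
  return acc.totalBytesReadBeforeRefill + bytesConsumedFromCoalesced;
}
```

For example, if 10 bytes were consumed before the refill and the reader stops 7 bytes into the coalesced buffer, the function returns 17.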

3. Proper Buffer Positioning (PageReader.cpp)

- After reading page header with refiller, position bufferStart_/bufferEnd_ to point to remaining data in the stream buffer
- Calculate: bytes consumed from new stream = result.readBytes -
- Set bufferEnd_ to end of stream data
- Set bufferStart_ to skip consumed bytes in stream data
- This allows readBytes() to continue reading page data without additional stream reads

Key Differences from Original
- Original: Used reference parameters automatically updated by memcpy
- New: Uses IOBuf chains with manual coalescing and buffer management
- Both: Ensure Thrift reads from contiguous memory, preventing corruption
Labels

CLA Signed, fb-exported, meta-exported

7 participants