Description
Is your feature request related to a problem? Please describe.
In 24.08, libcudf received reports of two thread-safety errors. In #16405, we learned that cuco's legacy static_map used a counter that was not thread-safe. In #16426, we learned that a std::map in a singleton class was not thread-safe.
Describe the solution you'd like
libcudf should have some C++ multi-threaded tests that attempt to catch thread-safety issues before they impact our partners.
In #16426, Spark-RAPIDS encountered segfaults in their test_avro integration test. This test appears to use pyspark to create a local Spark session and then read 50 files across multiple worker threads.
I suggest that we extend recent benchmarking work in multi-threaded read_parquet, read_orc, and groupby_max to design a C++ integration test that we can run in CI or nightlies.
Perhaps we could run the libcudf TPC-H derived examples at a tiny scale factor over multiple threads. That pattern would be useful for studying pipelining as well as possibly shaking out some threading problems.
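As a rough illustration of the shape such a test could take, here is a minimal sketch in plain C++17 with no libcudf dependency. The `registry` class is a hypothetical stand-in for a singleton that owns a std::map (the pattern from #16426), and `run_concurrently` and `stress_registry` are made-up helper names, not existing libcudf or GTest utilities. A real test would presumably call libcudf readers or groupby inside the worker lambda instead.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical stand-in for a singleton-owned cache; not a libcudf class.
class registry {
 public:
  static registry& instance()
  {
    static registry r;  // function-local static initialization is thread-safe since C++11
    return r;
  }

  int get_or_insert(int key)
  {
    std::lock_guard<std::mutex> lock(mtx_);  // without this lock, concurrent inserts race
    auto it = map_.try_emplace(key, key * 2).first;
    return it->second;
  }

  std::size_t size()
  {
    std::lock_guard<std::mutex> lock(mtx_);
    return map_.size();
  }

 private:
  registry() = default;
  std::mutex mtx_;
  std::map<int, int> map_;
};

// Runs work(thread_index) on num_threads threads that are released together,
// and returns how many of them threw an exception.
inline int run_concurrently(int num_threads, std::function<void(int)> const& work)
{
  std::atomic<bool> go{false};
  std::atomic<int> failures{0};
  std::vector<std::thread> threads;
  threads.reserve(static_cast<std::size_t>(num_threads));
  for (int t = 0; t < num_threads; ++t) {
    threads.emplace_back([&, t] {
      // Spin until all threads are created so the work overlaps as much as possible.
      while (!go.load(std::memory_order_acquire)) { std::this_thread::yield(); }
      try {
        work(t);
      } catch (...) {
        failures.fetch_add(1);
      }
    });
  }
  go.store(true, std::memory_order_release);
  for (auto& th : threads) { th.join(); }
  return failures.load();
}

// Every thread touches the same keys, so all threads contend on the shared map.
inline bool stress_registry(int num_threads, int keys_per_thread)
{
  int const failures = run_concurrently(num_threads, [&](int) {
    for (int k = 0; k < keys_per_thread; ++k) {
      if (registry::instance().get_or_insert(k) != k * 2) {
        throw std::runtime_error("unexpected value");
      }
    }
  });
  return failures == 0 &&
         registry::instance().size() == static_cast<std::size_t>(keys_per_thread);
}
```

Calling `stress_registry(8, 1000)` from a test body exercises the shared map from eight threads at once. A test like this only catches races probabilistically, so it would likely be most useful when run repeatedly in nightlies or under a tool such as ThreadSanitizer for the host-side code.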
Describe alternatives you've considered
Continue to rely on libcudf users to report thread-safety issues.