Commit 7d9014d

clean up readme
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
1 parent 33e3681 commit 7d9014d

3 files changed

Lines changed: 45 additions & 29 deletions

vortex-cuda/README.md

Lines changed: 13 additions & 20 deletions
@@ -10,35 +10,28 @@ Key files:
 - `vortex-cuda/src/arrow/canonical.rs`: canonical-array export to `ArrowDeviceArray`.
 - `vortex-test/e2e-cuda/src/lib.rs`: cuDF interop harness.
 
-Current export coverage includes primitive, bool, decimal/temporal, string/binary view, and struct arrays. Remaining work includes null masks, broader dtype coverage, `ArrowDeviceArrayStream`, and PyVortex integration.
+## Building cuDF
 
-## cuDF compatibility
+Note that the `cudf-test-harness` repository provides prebuilt cuDF libraries for x86_64 and aarch64.
 
-Vortex exports string and binary columns as Arrow `Utf8View` / `BinaryView` device arrays with producer-owned `ArrowArray.private_data`. cuDF string/binary interop requires a build containing `rapidsai/cudf#22620`; until a release version is identified, test with a cuDF commit that includes that change.
-
-## Building cuDF for interop testing
-
-Pass a single CUDA architecture, e.g. `-DCMAKE_CUDA_ARCHITECTURES=90a`; otherwise cuDF builds for many architectures and local builds are much slower.
+From the cuDF repository root, compile cuDF locally without exporting additional environment variables:
 
 ```sh
-export PATH=/usr/local/cuda-13.1/bin:$PATH
+cmake -E rm -rf cpp/build
 
 cmake -S cpp -B cpp/build \
-  -DCMAKE_INSTALL_PREFIX=${CONDA_PREFIX:-/usr/local} \
-  -DCMAKE_CUDA_ARCHITECTURES=90a \
+  -DCMAKE_INSTALL_PREFIX=/usr/local \
+  -DCMAKE_CUDA_ARCHITECTURES=NATIVE \
   -DBUILD_TESTS=ON \
   -DDISABLE_DEPRECATION_WARNINGS=ON \
   -DCMAKE_BUILD_TYPE=Debug \
   -DCUDF_BUILD_STREAMS_TEST_UTIL=OFF \
-  -DCUDAToolkit_ROOT=/usr/local/cuda-13.1 \
-  -DCMAKE_CUDA_COMPILER=/usr/local/cuda-13.1/bin/nvcc \
-  -DCMAKE_CXX_COMPILER=/usr/bin/g++-13 \
-  -DCMAKE_C_COMPILER=/usr/bin/gcc-13 \
-  -GNinja
-
-cmake --build cpp/build --target INTEROP_TEST -j$(nproc)
-
-LD_LIBRARY_PATH=/usr/local/cuda-13.1/compat:$LD_LIBRARY_PATH ./cpp/build/gtests/INTEROP_TEST
+  -DCUDAToolkit_ROOT=/usr/local/cuda \
+  -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc \
+  -DCMAKE_C_COMPILER=gcc \
+  -DCMAKE_CXX_COMPILER=g++ \
+  -GNinja && cmake --build cpp/build --target INTEROP_TEST --parallel
 ```
 
-Adjust architecture, compiler paths, and CUDA paths for the machine under test.
+The clean build directory is important when changing compilers because CMake caches the selected C and C++
+compilers in `CMakeCache.txt`.
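
The README's closing note says CMake caches the selected compilers in `CMakeCache.txt`. As a rough illustration of what that means, the sketch below looks up a cached entry. It assumes the usual `KEY:TYPE=VALUE` cache line format; `cached_value` is a hypothetical helper written for this note and is not part of Vortex or cuDF.

```rust
use std::fs;

/// Hypothetical helper: find the value of `key` in CMakeCache.txt content.
/// Assumes cache entries are written as `KEY:TYPE=VALUE`, one per line.
fn cached_value<'a>(cache: &'a str, key: &str) -> Option<&'a str> {
    cache.lines().find_map(|line| {
        let line = line.trim();
        // Skip the comment lines CMake writes into the cache file.
        if line.starts_with('#') || line.starts_with("//") {
            return None;
        }
        let (name_and_type, value) = line.split_once('=')?;
        let (name, _entry_type) = name_and_type.split_once(':')?;
        (name == key).then_some(value)
    })
}

fn main() {
    // Path taken from the README's build directory; adjust if yours differs.
    let path = "cpp/build/CMakeCache.txt";
    match fs::read_to_string(path) {
        Ok(cache) => {
            for key in ["CMAKE_C_COMPILER", "CMAKE_CXX_COMPILER"] {
                match cached_value(&cache, key) {
                    Some(value) => println!("{key} = {value}"),
                    None => println!("{key} is not cached"),
                }
            }
        }
        Err(err) => println!("no cache at {path} ({err}); the next configure starts clean"),
    }
}
```

If the printed compiler differs from the one passed on the command line, the build directory predates the compiler change, which is the situation the `cmake -E rm -rf cpp/build` step avoids.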

vortex-cuda/src/arrow/canonical.rs

Lines changed: 6 additions & 9 deletions
@@ -448,11 +448,9 @@ where
 
 /// Export Vortex binary views as standard Arrow `Binary`.
 ///
-/// cuDF's Arrow Device import currently rejects both Arrow `Binary` and Arrow `BinaryView`
-/// (unlike `Utf8View`, which it accepts as strings), so standard `Binary` is exported as the
-/// layout other Arrow Device consumers accept most widely. This path keeps conversion on the
-/// CUDA stream by building `i32` offsets from view sizes and gathering inline/out-of-line view
-/// bytes into one contiguous values buffer.
+/// cuDF currently rejects Arrow Device `Binary`/`BinaryView`, but `Binary` is the widest
+/// compatible layout for other consumers. Conversion stays on the CUDA stream by building
+/// `i32` offsets and gathering inline/out-of-line bytes into one values buffer.
 async fn export_binary(
     varbinview: VarBinViewArray,
     ctx: &mut CudaExecutionCtx,
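
The doc comment in the hunk above describes a two-step conversion: build `i32` offsets from the view sizes, then gather inline and out-of-line bytes into one contiguous values buffer. The following is a CPU-only sketch of that idea, not the CUDA implementation in this commit. The `View` enum and `views_to_binary` function are hypothetical simplifications of the 16-byte Arrow view struct, introduced here only for illustration.

```rust
/// Hypothetical, simplified stand-in for an Arrow binary view.
enum View {
    /// Short values (at most 12 bytes in Arrow) live inside the view itself.
    Inline(Vec<u8>),
    /// Longer values point at a byte range in one of the data buffers.
    OutOfLine { buffer: usize, offset: usize, len: usize },
}

/// Convert views into the standard `Binary` layout: offsets plus contiguous values.
fn views_to_binary(views: &[View], buffers: &[Vec<u8>]) -> (Vec<i32>, Vec<u8>) {
    // Step 1: running sum of view sizes gives `len + 1` offsets starting at 0.
    let mut offsets: Vec<i32> = Vec::with_capacity(views.len() + 1);
    let mut total: i32 = 0;
    offsets.push(total);
    for view in views {
        let len = match view {
            View::Inline(bytes) => bytes.len(),
            View::OutOfLine { len, .. } => *len,
        };
        assert!(len <= i32::MAX as usize, "value too large for i32 offsets");
        total = total
            .checked_add(len as i32)
            .expect("total size overflows i32 offsets");
        offsets.push(total);
    }

    // Step 2: gather each value's bytes into its slot in the values buffer.
    let mut values = vec![0u8; total as usize];
    for (i, view) in views.iter().enumerate() {
        let (start, end) = (offsets[i] as usize, offsets[i + 1] as usize);
        let src: &[u8] = match view {
            View::Inline(bytes) => bytes.as_slice(),
            View::OutOfLine { buffer, offset, len } => {
                &buffers[*buffer][*offset..*offset + *len]
            }
        };
        values[start..end].copy_from_slice(src);
    }
    (offsets, values)
}

fn main() {
    let buffers = vec![b"0123456789abcdefXYZ".to_vec()];
    let views = vec![
        View::Inline(b"hi".to_vec()),
        View::OutOfLine { buffer: 0, offset: 3, len: 13 },
        View::Inline(Vec::new()),
    ];
    let (offsets, values) = views_to_binary(&views, &buffers);
    println!("offsets = {:?}, values = {} bytes", offsets, values.len());
}
```

On a GPU the first loop would typically be a prefix scan and the second a per-row parallel copy, but how the commit's kernels are organized is not shown in this diff.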
@@ -682,10 +680,9 @@ fn gather_binary_values(
 ///
 /// Returns `None` for the buffer when Arrow can omit validity because all rows are valid.
 ///
-/// Every returned buffer is backed by an allocation padded to a 4-byte multiple with zeroed
-/// padding so cuDF's word-sized mask reads stay in bounds: the fast path through the device
-/// copy's tail zeroing, the other paths through their own padded allocations. Bits at positions
-/// `>= len + arrow_offset` within the final data byte are unspecified, as Arrow permits.
+/// Returned buffers use zeroed 4-byte padding so cuDF's word-sized mask reads stay in bounds.
+/// Bits at positions `>= len + arrow_offset` within the final data byte are unspecified, as
+/// Arrow permits.
 pub(super) async fn export_arrow_validity_buffer(
     validity: Validity,
     len: usize,
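
The padding rule in the doc comment above can be illustrated on the host. This is a minimal sketch, assuming Arrow's least-significant-bit-first validity bitmap; `padded_len`, `pack_validity_padded`, and `read_mask_word` are hypothetical names for this note and do not appear in the commit.

```rust
/// Round a byte length up to the next multiple of 4.
fn padded_len(byte_len: usize) -> usize {
    (byte_len + 3) / 4 * 4
}

/// Pack booleans into an LSB-first bitmap whose allocation is padded to a
/// 4-byte multiple, with the padding bytes zeroed.
fn pack_validity_padded(valid: &[bool]) -> Vec<u8> {
    let data_bytes = (valid.len() + 7) / 8;
    // `vec![0u8; n]` zero-initializes, so the padding is zeroed by construction.
    let mut buf = vec![0u8; padded_len(data_bytes)];
    for (i, &is_valid) in valid.iter().enumerate() {
        if is_valid {
            buf[i / 8] |= 1u8 << (i % 8);
        }
    }
    buf
}

/// Read one 32-bit mask word, as a word-oriented consumer would.
/// This indexing stays in bounds only because the buffer is padded.
fn read_mask_word(buf: &[u8], word: usize) -> u32 {
    let b = &buf[word * 4..word * 4 + 4];
    u32::from_le_bytes([b[0], b[1], b[2], b[3]])
}

fn main() {
    // 10 rows need 2 data bytes; the allocation is padded to 4 bytes.
    let buf = pack_validity_padded(&[true; 10]);
    println!("bytes = {}, word0 = {:#x}", buf.len(), read_mask_word(&buf, 0));
}
```

In this sketch the trailing bits of the final data byte happen to be zero as well. The doc comment only promises zeroed padding bytes and leaves those trailing bits unspecified, so a consumer should not rely on them.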

vortex-cuda/src/dynamic_dispatch/plan_builder.rs

Lines changed: 26 additions & 0 deletions
@@ -894,3 +894,29 @@ impl FusedPlan {
         len * final_elem_bytes.max(output_elem_bytes)
     }
 }
+
+#[cfg(test)]
+mod tests {
+    use vortex::array::IntoArray;
+    use vortex::array::arrays::PrimitiveArray;
+    use vortex::array::builtins::ArrayBuiltins;
+    use vortex::dtype::DType;
+    use vortex::dtype::Nullability;
+
+    use super::*;
+
+    #[test]
+    fn cast_to_non_primitive_target_is_not_dyn_dispatch_compatible() -> VortexResult<()> {
+        let cast = PrimitiveArray::from_iter([0u8, 1])
+            .into_array()
+            .cast(DType::Bool(Nullability::NonNullable))?;
+
+        assert!(!is_dyn_dispatch_cast_compatible(&cast));
+        assert!(matches!(
+            DispatchPlan::new(&cast, CudaDispatchMode::DynDispatchOnly)?,
+            DispatchPlan::Unfused
+        ));
+
+        Ok(())
+    }
+}
