Make GPU NDT covariance updates atomic under contention #9

Open: fredfrog78 wants to merge 2 commits into master from codex/ndt-atomic-covariance-update
fredfrog78 commented Feb 27, 2026

Summary

  • Harden covarianceHitNdt so that exactly one thread owns each voxel update, including on a fallback path for ungrouped samples.
  • Pass a grouped_samples kernel argument from GpuNdtMap so the kernel can choose between the fast grouped traversal and the safe ungrouped ownership scan.
  • Add regression coverage for ungrouped contention (Ndt.HitCountUngrouped).
  • Include portability and build fixes that were needed to complete the test run in this environment.
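
For reviewers, the ownership idea in the first two bullets can be illustrated with a small host-side C++ model. This is a minimal sketch under stated assumptions, not the code in CovarianceHitNdt.cl: `VoxelKey`, `ownsVoxel`, and `countHits` are hypothetical names, each loop index stands in for one work item, and "grouped" is assumed to mean that samples for the same voxel are contiguous.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a voxel key; the kernel's real key type is not shown in this PR.
using VoxelKey = std::uint64_t;

// Returns true when sample i is the single owner of its voxel's update.
// grouped_samples == true: same-voxel samples are assumed contiguous, so it is
//   enough to compare against the previous sample.
// grouped_samples == false: scan all earlier samples; the first occurrence owns the voxel.
bool ownsVoxel(const std::vector<VoxelKey>& keys, std::size_t i, bool grouped_samples)
{
  assert(i < keys.size());
  if (grouped_samples)
    return i == 0 || keys[i - 1] != keys[i];
  for (std::size_t j = 0; j < i; ++j)
    if (keys[j] == keys[i])
      return false;
  return true;
}

// The owner accumulates every sample that falls in its voxel, so each voxel
// is written by exactly one modelled work item.
std::vector<unsigned> countHits(const std::vector<VoxelKey>& keys, bool grouped_samples)
{
  std::vector<unsigned> hits(keys.size(), 0);
  for (std::size_t i = 0; i < keys.size(); ++i)  // each i models one work item
  {
    if (!ownsVoxel(keys, i, grouped_samples))
      continue;
    for (std::size_t j = i; j < keys.size(); ++j)
    {
      if (keys[j] == keys[i])
        ++hits[i];
      else if (grouped_samples)
        break;  // grouped traversal: the contiguous run has ended
    }
  }
  return hits;
}
```

In this model, running the grouped path on ungrouped input such as `{7, 7, 3, 7}` yields two owners for voxel 7, which is the kind of contention the ungrouped fallback is meant to rule out.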

Files changed

  • ohmgpu/gpu/CovarianceHitNdt.cl
  • ohmgpu/GpuNdtMap.cpp
  • tests/ohmtestgpu/GpuNdtTests.cpp
  • cmake/Findglm.cmake
  • ohm/CovarianceVoxel.h
  • ohmgpu/private/GpuMapDetail.cpp

Test results (run locally on Apple M3 Pro, OpenCL 1.2)

  • cmake -S . -B build3 -DOHM_FEATURE_TEST=ON -DOHM_SYSTEM_GTEST=ON
  • cmake --build build3 --target ohmtestocl -j8
  • ./build3/bin/ohmtestocl '--gtest_filter=Ndt.*'
  • ❌ full ./build3/bin/ohmtestocl has 2 failures:
    • Tsdf.Basic
    • Tsdf.Truncation

These TSDF failures are caused by a limitation of the local OpenCL runtime (cl2Metal / OpenCL 1.2 lacks the required 64-bit atomics), not by this NDT change.
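
A maintainer reproducing this could confirm the runtime limitation by inspecting the device's extension string. The helper below is a hedged sketch: `hasExtension` is a hypothetical name, the string is assumed to be the space-separated value returned by `clGetDeviceInfo` with `CL_DEVICE_EXTENSIONS`, and treating `cl_khr_int64_base_atomics` as the extension the TSDF kernels need is an assumption, not something this PR states.

```cpp
#include <sstream>
#include <string>

// Returns true if `name` appears as a whole token in the space-separated
// OpenCL extension string (the format of CL_DEVICE_EXTENSIONS).
bool hasExtension(const std::string& extensions, const std::string& name)
{
  std::istringstream in(extensions);
  for (std::string ext; in >> ext;)
  {
    if (ext == name)
      return true;
  }
  return false;
}
```

For example, `hasExtension(ext, "cl_khr_int64_base_atomics")` returning false on a device would be consistent with the 64-bit atomics explanation above.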

Maintainer request before merge

Please run maintainer validation on a supported GPU/runtime (e.g. CUDA target and/or OpenCL 2.0+ environment) before merge, including:

  • ohmtestocl (or equivalent supported OpenCL target)
  • ohmtestcuda where available

@fredfrog78 (Author)

Closing for now per review request. I will resubmit after the dependencies are fixed and the tests pass end-to-end.

fredfrog78 closed this Feb 27, 2026
fredfrog78 reopened this Feb 27, 2026

@fredfrog78 (Author)

Maintainer note:
The full local ohmtestocl run fails two TSDF tests (Tsdf.Basic and Tsdf.Truncation) because of platform-specific OpenCL 1.2 limitations.

Please run maintainer-side validation on a supported GPU/runtime (e.g. CUDA and/or an OpenCL 2.0+ environment) before merge.
