When running the cufinufft_chi2 method in batch mode with the following settings:
- batch_size = 10
- N = 1_000_000
- nterms = 2
- GPU: 1x A100 (40 GB)
The GPU execution fails with a cupy.cuda.memory.OutOfMemoryError raised during device memory allocation, which suggests that the combined memory footprint of the batched inputs exceeds the 40 GB available on the device. A minimal reproduction sketch is included below.
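For reference, a rough reproduction sketch with the settings above. The import path and the call signature of `cufinufft_chi2` are placeholders (the actual call is commented out), since only the parameter names from this report are known:

```python
import cupy as cp

# Hypothetical import path; not the project's confirmed module layout.
# from mypackage.periodograms import cufinufft_chi2

batch_size = 10
N = 1_000_000
nterms = 2

# Synthetic batched inputs at the problematic sizes.
t = [cp.sort(cp.random.uniform(0.0, 10.0, N)) for _ in range(batch_size)]
y = [cp.random.standard_normal(N) for _ in range(batch_size)]
dy = [cp.ones(N) for _ in range(batch_size)]
freqs = cp.linspace(0.01, 10.0, 100_000)  # frequency grid size is an assumption

# Raises cupy.cuda.memory.OutOfMemoryError on a single A100 40GB:
# power = cufinufft_chi2(t, y, dy, freqs, nterms=nterms)
```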
Expected Behavior:
The computation should either split large batched matrix solves into smaller chunks automatically, or raise a clear error/warning before allocation that recommends a smaller batch_size. A rough sketch of the chunking idea follows.
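As an illustration only (not a proposal for the actual internals), a generic chunked driver could keep peak device memory bounded by halving the chunk size whenever an allocation fails; `process_batch` below is a hypothetical callable standing in for the batched cufinufft_chi2 solve:

```python
import math
import cupy as cp

def run_in_chunks(items, process_batch, max_chunk=None):
    """Process `items` in chunks, shrinking the chunk size on GPU OOM.

    `process_batch` is a hypothetical callable that runs the batched solve
    on a list of inputs and returns a list of per-item results.
    """
    chunk = max_chunk or len(items)
    results = []
    i = 0
    while i < len(items):
        try:
            results.extend(process_batch(items[i:i + chunk]))
            i += chunk
        except cp.cuda.memory.OutOfMemoryError:
            if chunk == 1:
                raise  # a single item does not fit; nothing left to split
            chunk = math.ceil(chunk / 2)  # retry this slice with a smaller chunk
            cp.get_default_memory_pool().free_all_blocks()
    return results
```

Alternatively, an up-front per-item memory estimate (roughly proportional to N, nterms, and the number of frequencies) could be compared against the free memory reported by `cp.cuda.Device().mem_info` to pick a chunk size or emit a warning before any large allocation is attempted.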