Commit 42123b3
Fix float64 precision loss in sparse Pearson residual kernels (#658)
* Fix float64 precision loss in sparse Pearson residual kernels
The CSR and CSC Pearson residual kernels in _cuda/pr/kernels_pr.cuh
divided by `sqrtf`, the single-precision square-root intrinsic. Because
both kernels are templated on the element type `T`, a `T=double`
instantiation silently narrowed the variance term
`mu + mu * mu * inv_theta` to float32, evaluated the square root at
single precision, and promoted the result back to double. The float64
path of `pp.normalize_pearson_residuals` (and `pp.highly_variable_genes`
with `flavor='pearson_residuals'`) was therefore capped at ~7
significant digits regardless of the requested dtype. The dense kernel
`dense_norm_res_kernel` already used the overloaded `sqrt` and was
unaffected.
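
The narrowing described above can be reproduced on the host. The sketch below is an illustration only, not the kernel code: `to_f32`, `residual_f64` and `residual_narrowed` are hypothetical helpers, the `mu` and `x` values are made up, and float32 rounding is emulated with `struct` rather than by calling `sqrtf`.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def residual_f64(x: float, mu: float, inv_theta: float) -> float:
    """Reference: every step stays in float64."""
    var = mu + mu * mu * inv_theta
    return (x - mu) / math.sqrt(var)


def residual_narrowed(x: float, mu: float, inv_theta: float) -> float:
    """Emulates a T=double instantiation that calls a float-only sqrt."""
    var = to_f32(mu + mu * mu * inv_theta)  # argument narrowed to float32
    root = to_f32(math.sqrt(var))           # root rounded to float32
    return (x - mu) / root                  # promoted back to float64


inv_theta = 1.0 / 100.0
worst = 0.0
for i in range(1, 2001):
    mu = 0.013 * i        # made-up expected counts
    x = float(i % 5)      # made-up observed counts
    ref = residual_f64(x, mu, inv_theta)
    got = residual_narrowed(x, mu, inv_theta)
    if ref != 0.0:
        worst = max(worst, abs(got - ref) / abs(ref))

print(f"max relative error: {worst:.2e}")
```

To first order the error is bounded by half a float32 ulp from rounding the root plus half of that from narrowing the argument, roughly 2^-24 + 2^-25, or about 8.9e-08.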
Replace `sqrtf` with the overloaded `sqrt` on both sparse paths. `sqrt`
dispatches to the single-precision root for `T=float` and the
double-precision root for `T=double`, so the float32 path is
byte-identical to before and only the float64 path changes.
Hardware verification (NVIDIA H100 80GB HBM3, CUDA 12.6, sm_90):
A standalone harness compiled the real `sparse_norm_res_csr_kernel`
verbatim and ran it on a 4000x4000 synthetic CSR count matrix against a
host float64 reference.
T=double, before fix: max relative error 8.83e-08 (~7.1 digits)
T=double, after fix: max relative error 3.97e-16 (~15.4 digits)
T=float, before/after: bit-identical (max abs diff 0.0)
The float64 path is now ~8 orders of magnitude more accurate; the
float32 path is bit-identical on this input, as expected because `sqrt`
resolves to the single-precision root for `T=float`.
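
The digit counts quoted above follow from the relative errors as `-log10(error)`. A minimal sketch of that arithmetic, using only the two figures reported in this message (`significant_digits` is a hypothetical helper name):

```python
import math


def significant_digits(rel_err: float) -> float:
    """Approximate number of correct decimal digits for a relative error."""
    return -math.log10(rel_err)


before, after = 8.83e-08, 3.97e-16

print(round(significant_digits(before), 1))   # 7.1
print(round(significant_digits(after), 1))    # 15.4
print(round(math.log10(before / after), 1))   # 8.3 orders of magnitude
```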
Add `test_normalize_pearson_residuals_float64_precision` to
tests/test_normalization.py. It pins the float64 CSR/CSC output to an
analytic float64 reference at rtol/atol 1e-9 -- tight enough to fail on
a single-precision result and pass on a genuine float64 one -- across
theta in {100, inf}.
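
The sketch below illustrates why a 1e-9 tolerance separates the two precisions. It is not the test added to tests/test_normalization.py: the helpers and input values are hypothetical, float32 rounding is emulated with `struct`, and `is_close` assumes the same criterion as `numpy.testing.assert_allclose`.

```python
import math
import struct

RTOL = ATOL = 1e-9


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def is_close(actual: float, desired: float) -> bool:
    """|actual - desired| <= atol + rtol * |desired|."""
    return abs(actual - desired) <= ATOL + RTOL * abs(desired)


def residual(x: float, mu: float, theta: float, single_root: bool = False) -> float:
    inv_theta = 1.0 / theta  # 1.0 / inf == 0.0, so theta=inf gives var == mu
    var = mu + mu * mu * inv_theta
    if single_root:
        root = to_f32(math.sqrt(to_f32(var)))  # emulated float32 square root
    else:
        root = math.sqrt(var)
    return (x - mu) / root


def all_within_tolerance(theta: float, single_root: bool) -> bool:
    for i in range(1, 501):
        mu, x = 0.037 * i, float(i % 4)  # made-up inputs
        ref = residual(x, mu, theta)
        if not is_close(residual(x, mu, theta, single_root), ref):
            return False
    return True


for theta in (100.0, math.inf):
    print(theta,
          "float64 passes:", all_within_tolerance(theta, False),
          "float32 root passes:", all_within_tolerance(theta, True))
```

Here the float64 candidate equals the reference by construction. A real device result would differ by a few ulps (3.97e-16 in the measurement above), which is still far below the tolerance.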
* add PR number
* switch to rsqrt
---------
Co-authored-by: Intron7 <severin.dicks@icloud.com>
Co-authored-by: Severin Dicks <37635888+Intron7@users.noreply.github.com>
1 parent 61eda66 commit 42123b3
3 files changed
Lines changed: 43 additions & 2 deletions
File tree
- docs/release-notes
- src/rapids_singlecell/_cuda/pr
- tests