Skip to content

[cufinufft] fold_rescale/interval fails in for float32 inputs on large 1D (~1e8) uniform grids #701

@blackwer

Description

@blackwer

While working on the review for PR #663, I came into a segfault bug occurring in index calculation from the combination of fold_rescale with interval. This happens roughly when N has more digits than a floating point number can accurately represent (~7 digits for float32, ~15 digits for float64). So the segfault is really only practically valid in 1D when using float32, though I think the issue is more broad than that. This bug is not only present in the new "output driven" method, though doesn't produce a segfault in the other methods, just spreads to the wrong(?) points. Looking at the spread_1d_nuptsdriven code (all comments are mine)

    const auto x_rescaled     = fold_rescale(x[idxnupts[i]], nf1); // float32 of O(nf1), possible loss of precision)
    const auto cnow           = c[idxnupts[i]];
    const auto [xstart, xend] = interval(ns, x_rescaled); // last digits of xstart possibly wrong
    const T x1                = (T)xstart - x_rescaled;
    if constexpr (KEREVALMETH == 1)
      eval_kernel_vec_horner<T, ns>(ker1, x1, sigma);
    else
      eval_kernel_vec<T, ns>(ker1, x1, es_c, es_beta);

    for (auto xx = xstart; xx <= xend; xx++) {
      // segfault averted by wrapping all values of xx, though density spreads to the wrong gridpoints
      auto ix          = xx < 0 ? xx + nf1 : (xx > nf1 - 1 ? xx - nf1 : xx); 
      const T kervalue = ker1[xx - xstart];
      atomicAdd(&fw[ix].x, cnow.x * kervalue);
      atomicAdd(&fw[ix].y, cnow.y * kervalue);
    }

Similar safety wrappers are written on every other fold_rescale operation to prevent illegal memory address access and properly handle periodic boundary issues. I think it's reasonable to ask if this is actually a bug or a feature, since ~100 million grid points is... a lot, and I'm unsure if spreading to slightly the wrong(?) grid points will do anything actually noticeable. That said, as written, what is normally the infrequent off-by-one errors in spreading can be considerably amplified to off-by-100 errors if the user requests large uniform grids for whatever reason.

Remediation:

  • upcast to double?
    • I think this is almost equally wrong for low N -- it's just adding random digits to our float, but the added intermediate precision should make index-finding slightly more accurate. If my thinking is clear, it should at least fix the issue where large N leads to very wrong index calculation.
    • I have verified that somewhat frequently, even for low N, upcasting to double, not surprisingly, does give different index calculations
  • Tell the user "don't do that" -- probably mathematically justified, but I'm not the expert here
  • I don't know. Just need to bring this to discussion since it will give negative indices in the output-driven method as currently written, and it's worth considering how to handle these issues

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions