Fine-grained fast-math flags #1991

Open
@lcw

Is your feature request related to a problem? Please describe.

To get kernel performance matching clang, we have had to add fast-math flags such as contract (which clang and nvcc apply by default). Currently we do this with an ugly hack, for example:

# HACK: module-local versions of core arithmetic; needed to get FMA
for (jlf, f) in zip((:+, :*, :-), (:add, :mul, :sub))
    for (T, llvmT) in ((:Float32, "float"), (:Float64, "double"))
        ir = """
            %x = f$f contract nsz $llvmT %0, %1
            ret $llvmT %x
            """
        @eval begin
            # the @pure is necessary so that we can constant propagate.
            @inline Base.@pure function $jlf(a::$T, b::$T)
                Base.llvmcall($ir, $T, Tuple{$T, $T}, a, b)
            end
        end
    end
    @eval function $jlf(args...)
        Base.$jlf(args...)
    end
end
let (jlf, f) = (:div_arcp, :div)
    for (T, llvmT) in ((:Float32, "float"), (:Float64, "double"))
        ir = """
            %x = f$f fast $llvmT %0, %1
            ret $llvmT %x
            """
        @eval begin
            # the @pure is necessary so that we can constant propagate.
            @inline Base.@pure function $jlf(a::$T, b::$T)
                Base.llvmcall($ir, $T, Tuple{$T, $T}, a, b)
            end
        end
    end
    @eval function $jlf(args...)
        Base.$jlf(args...)
    end
end
rcp(x) = div_arcp(one(x), x) # still leads to rcp.rn which is also a function call

Describe the solution you'd like

I would like a macro like @fastmath that had fine-grained control over the fast-math flags.
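To make the request concrete, here is a rough sketch of what such a macro could look like. Everything in it is hypothetical: `@fmflags`, `flagged`, `rewrite`, `llvm_op`, and `llvm_type` are names made up for illustration and exist in neither Base nor CUDA.jl. It reuses the `Base.llvmcall` trick from the hack above, but builds the IR from a list of LLVM fast-math flags given at the call site. It has only been reasoned about for CPU code with `Float32`/`Float64` scalars; whether it behaves well inside GPU kernels is an open question.

```julia
# Hypothetical sketch only: none of these names exist in Base or CUDA.jl.

# Fast-math flags that LLVM accepts on floating-point instructions.
const LLVM_FLAGS = (:nnan, :ninf, :nsz, :arcp, :contract, :afn, :reassoc, :fast)

const FloatOp = Union{typeof(+), typeof(-), typeof(*), typeof(/)}
const IEEE = Union{Float32, Float64}

llvm_op(::Type{typeof(+)}) = "fadd"
llvm_op(::Type{typeof(-)}) = "fsub"
llvm_op(::Type{typeof(*)}) = "fmul"
llvm_op(::Type{typeof(/)}) = "fdiv"
llvm_type(::Type{Float32}) = "float"
llvm_type(::Type{Float64}) = "double"

# Fallback: anything that is not a same-typed Float32/Float64 pair
# simply calls the ordinary operator.
@inline flagged(::Val, op, args...) = op(args...)

# Same-typed IEEE scalars: emit a single LLVM instruction carrying the flags.
@generated function flagged(::Val{flags}, ::F, a::T, b::T) where {flags, F<:FloatOp, T<:IEEE}
    ir = """
        %x = $(llvm_op(F)) $(join(flags, " ")) $(llvm_type(T)) %0, %1
        ret $(llvm_type(T)) %x
        """
    return quote
        $(Expr(:meta, :inline))
        Base.llvmcall($ir, $T, Tuple{$T, $T}, a, b)
    end
end

# Walk an expression and route binary +, -, *, / calls through `flagged`.
rewrite(x, v) = x  # literals and symbols are left alone
function rewrite(ex::Expr, v)
    args = map(a -> rewrite(a, v), ex.args)
    if ex.head === :call && args[1] in (:+, :-, :*, :/) && length(args) >= 3
        # n-ary calls such as a + b + c are folded left to right
        return foldl((l, r) -> Expr(:call, flagged, v, args[1], l, r), args[2:end])
    end
    return Expr(ex.head, args...)
end

macro fmflags(args...)
    length(args) >= 2 || error("usage: @fmflags(flag..., expr)")
    flags, ex = args[1:end-1], args[end]
    all(f -> f in LLVM_FLAGS, flags) || error("unknown fast-math flag in $flags")
    return esc(rewrite(ex, Val(flags)))
end

# Usage: allow contraction (FMA formation) and ignore the sign of zero.
axpy(a, x, y) = @fmflags(contract, nsz, a * x + y)
```

With this sketch, `axpy(2.0, 3.0, 4.0)` returns `10.0`, and integer arguments fall through to the ordinary operators. Like the macro-based alternatives mentioned below, it only rewrites the syntax it sees, so arithmetic inside called functions keeps its default flags.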

Describe alternatives you've considered

KernelAbstractions used to do this with Cassette.jl (https://github.com/JuliaLabs/Cassette.jl). Other people use macros, although that approach opens up fewer optimizations and is therefore not desirable. I don't know whether https://github.com/JuliaDebug/CassetteOverlay.jl can be used with kernels, but it might be a possible way to implement this.

It would be nice if this functionality were eventually added to Base Julia.


Labels: cuda kernels (Stuff about writing CUDA kernels.), enhancement (New feature or request), upstream (Somebody else's problem.)
