CUDA.jl kernel is half as fast as C++ kernel

Check out the trunc docs. It’s a checked conversion, so it throws when the result can’t be represented in the target type:
https://docs.julialang.org/en/v1/base/math/#Base.trunc

Maybe you want unsafe_trunc instead, which skips that check and returns an arbitrary value when the input doesn’t fit.
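
To make the difference concrete, here is a small CPU-side sketch (not taken from the original kernel; the values and variable names are made up for illustration). Inside a GPU kernel, the range check and exception path that trunc carries are a plausible source of extra overhead, though I haven't measured that here.

```julia
# Illustrative values only; these are not from the kernel in question.
x = 3.7f0

# Checked: verifies that the truncated value fits in Int32.
checked = trunc(Int32, x)

# Unchecked: same result for in-range input, but no range check.
unchecked = unsafe_trunc(Int32, x)

# Out-of-range input: trunc throws an InexactError.
threw = try
    trunc(Int32, 1.0f10)
    false
catch err
    err isa InexactError
end

# unsafe_trunc(Int32, 1.0f10) would not throw, but its result is arbitrary,
# so only use it when you know the input is in range.
println((checked, unchecked, threw))
```

So unsafe_trunc is only a drop-in replacement if the values are guaranteed to fit in the target integer type.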
