CUDA.jl kernel is half as fast as C++ kernel

fft · September 23, 2022, 2:25am

Thanks for the suggestions. I’ll rework it and see what I can do. With regard to Int64 types, when writing loops such as

for tr_yi = 1:size(tr_array, 3)

tr_yi will be a 64-bit integer. For loops like this, is there a straightforward way to control the index type, including when using eachindex() and CartesianIndices()?
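
To make the question concrete, here is a minimal CPU-only sketch of what I mean. The array contents and sizes are made up for illustration, and building the range from Int32 endpoints is just one workaround I am aware of, not necessarily the recommended one for kernels.

```julia
# Hypothetical stand-in for the real array; the sizes are arbitrary.
tr_array = zeros(Float32, 4, 3, 2)

# Default: 1:size(...) is a range of Int (Int64 on a 64-bit system),
# so the loop index is an Int.
for tr_yi = 1:size(tr_array, 3)
    @assert tr_yi isa Int
end

# Possible workaround: construct the range from Int32 endpoints,
# which gives a UnitRange{Int32} and therefore an Int32 loop index.
for tr_yi = Int32(1):Int32(size(tr_array, 3))
    @assert tr_yi isa Int32
end

# eachindex() and CartesianIndices() on a plain Array are still Int-based.
@assert eltype(eachindex(tr_array)) == Int
@assert eltype(CartesianIndices(tr_array)) == CartesianIndex{3}
```

The explicit Int32 range works for a plain for loop, but I don't see an equivalent knob for eachindex() or CartesianIndices(), which is the part I am unsure about.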

Thank you.
