CUDAnative is awesome!

Great, happy to hear CUDAnative has been useful and performs well :slight_smile:

I have also enjoyed using CuArrays and broadcasting, but I did not obtain the same level of performance (probably due to my limited skills).

Probably not; I’ve spent significant time optimizing CUDAnative and making sure the generated code quality is competitive, while CuArrays hasn’t seen much optimization…
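For readers comparing the two approaches, here is a minimal sketch of the same SAXPY operation written both ways (assuming a recent CUDAnative/CuArrays; the launch configuration and kernel name are illustrative, not prescribed by either package):

```julia
using CUDAnative, CuArrays

# Hand-written CUDAnative kernel: each thread handles one element.
function saxpy_kernel(y, a, x)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(y)
        @inbounds y[i] += a * x[i]
    end
    return nothing
end

x = CuArray(rand(Float32, 1024))
y = CuArray(rand(Float32, 1024))

# CUDAnative: explicit launch configuration.
@cuda threads=256 blocks=4 saxpy_kernel(y, 2f0, x)

# CuArrays: the same operation via fused broadcast.
y .+= 2f0 .* x
```

The kernel version gives full control over the launch configuration and memory access pattern, which is where the hand-tuning opportunities mentioned above come from; the broadcast version reaches the same hardware through generic array abstractions.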

I wonder if it would be a good idea to allow running the CUDAnative kernels on the CPU (possibly with automatic multi-threading, and maybe SIMD) when Base.Arrays are used instead of CuArrays. Of course, optimal CPU implementations would probably imply data layout transformations; I wrote a small paper on this topic here.

I considered that in the past, but I’m not sure it’s a smart investment of our (very limited) development manpower, especially with most users nowadays relying on array abstractions, where this is already the case. If you’re really interested in this, it might be better to revive a project like GPUOcelot, which implements the CUDA APIs and provides a PTX->LLVM IR compiler. But it hasn’t seen any development recently.
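To make the idea concrete, here is a hypothetical sketch of what such a CPU fallback could look like. This is not an existing CUDAnative feature; `cpu_launch!` and `saxpy` are invented names, and a real implementation would need to map the full thread/block indexing model, not just a flat 1-D index:

```julia
# Hypothetical sketch: emulating a 1-D kernel launch on the CPU
# with Base.Threads. Each "thread index" i is handled by one loop
# iteration, distributed across Julia threads.
function cpu_launch!(kernel, n, args...)
    Threads.@threads for i in 1:n
        kernel(i, args...)
    end
end

# Kernel-style body, taking the (emulated) thread index explicitly.
saxpy(i, y, a, x) = @inbounds y[i] += a * x[i]

x = rand(Float32, 1024)
y = zeros(Float32, 1024)
cpu_launch!(saxpy, length(y), y, 2f0, x)
```

As the reply above notes, the hard part is not this dispatch loop but generating good CPU code (vectorization, data layout) from kernels written against the GPU programming model.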

Of course, all the CUDA extensions (atomics, GPU intrinsics, …) would be very much appreciated :wink:

Similar trade-off: I prefer to work on CUDAnative, but improving CuArrays reaches more users.
