I just registered and tagged v0.1 of CUDA.jl, a package that now contains the documentation and functionality from the several packages that make up the Julia/CUDA stack. I merged them because there was too much coupling between the individual packages, and it wasn't particularly user-friendly to split functionality that's often used together across several packages.
Apart from some major test suite changes, the functionality should be a 1:1 copy of what used to be in CuArrays/CUDAnative/etc, so I'd like to tag a v1.0 release soon. Before that, I'd like to make sure there aren't any obvious issues with application code, so please try your CUDA applications with CUDA.jl and let me know!
The upgrade procedure should be straightforward: remove CuArrays/CUDAnative/CUDAdrv/CUDAapi, add CUDA.jl, and replace module imports/references.
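For example, a minimal upgrade from the REPL's package mode might look like the sketch below (only remove the old packages you actually have installed; the zeros call is just an illustrative replacement for the old CuArrays equivalent):

pkg> rm CuArrays CUDAnative CUDAdrv CUDAapi

pkg> add CUDA

julia> using CUDA             # replaces `using CuArrays, CUDAnative, ...`

julia> a = CUDA.zeros(1024)   # e.g. CuArrays.zeros(1024) becomes CUDA.zeros(1024)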
That makes sense, so maybe the CUDA.jl README needs updating, as it currently says "The package is tested against, and being developed for, Julia 1.3 and above."
The new CUDA.jl looks great. Last time I tried CuArrays, it did not support an older version of the Nvidia driver on a server I was using (which I could not update because I was not admin). But now it seems to run fine!
Just to confirm, have there been any changes enabling support for older driver versions?
Damn case sensitivity… add cuda.jl… no… add Cuda.jl… no… CUDA.jl… yes…
One small plea… this has irritated me when installing other packages.
I guess there is not much to be done about it: having cuda and Cuda point to CUDA as some sort of alias would probably lead to confusion.
Reporting back: installs OK on Windows 10 / Julia 1.4.1, but I have no Nvidia card.
If I find time later, I will test on a Jetson Nano.
Nvidia Visual Profiler and nvprof don't support profiling on Tegra devices (like the Jetson TX2) for non-root users. The only workaround is to start the profiling session as the root user.
Fantastic work, @maleadt! I think this is a good idea. For what it's worth, I ran the test suite on the current tagged release with Julia 1.4.1, using an RTX 2060 with CUDA driver 10.2.0 and toolkit 10.2.0.
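In case it helps others reproduce this, the test suite can be run with a standard Pkg invocation (nothing CUDA.jl-specific assumed here):

julia> using Pkg

julia> Pkg.test("CUDA")       # or, from package mode: pkg> test CUDA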
Thanks @maleadt; this looks like a great improvement in terms of user-friendliness! I believe it is nice to have just one package.
I have a little question on how the packages were merged. Are there symbol names which appear in more than one of the packages CuArrays/CUDAnative/CUDAdrv but are not identical? If so, how did you merge them? E.g., is the method CUDAnative.synchronize identical to the zero-argument method of CUDAdrv.synchronize? If not, what went into CUDA.synchronize()?
CUDAnative/CUDAdrv
help?> CUDAnative.synchronize
synchronize()
Wait for the device to finish. This is the device side version, and should not be called from the host.
synchronize acts as a synchronization point for child grids in the context of dynamic parallelism.
help?> CUDAdrv.synchronize
synchronize()
Block for the current context's tasks to complete.
────────────────────────────────────────────────────────────────────────────────
synchronize(s::CuStream)
Wait until a stream's tasks are completed.
────────────────────────────────────────────────────────────────────────────────
synchronize(e::CuEvent)
Waits for an event to complete.
CUDA
help?> CUDA.synchronize
synchronize()
Block for the current context's tasks to complete.
────────────────────────────────────────────────────────────────────────────────
synchronize(s::CuStream)
Wait until a stream's tasks are completed.
────────────────────────────────────────────────────────────────────────────────
synchronize(e::CuEvent)
Waits for an event to complete.
I renamed device-side synchronization to device_synchronize. It's a tough call, though. I considered adding a submodule to contain all device functionality so that we could retain the name, e.g. Device.synchronize, but that would be too breaking. In all other cases, I kept the previous name, even though that now mixes host and device functionality in a single package (making it easy to, e.g., crash Julia by inadvertently calling GPU code from the CPU). I hope that we'll soon be able to use contextual dispatch to have those methods error when called from the wrong context.
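To illustrate the resulting split, here is a minimal sketch (assuming device_synchronize is the renamed device-side call described above; parent_kernel and child_kernel are hypothetical names, and the child launch is only sketched as a comment):

using CUDA

# Host side: block until the current context's tasks have completed.
a = CUDA.rand(1024)
b = a .+ 1f0                  # broadcast launches a kernel asynchronously
synchronize()                 # formerly CUDAdrv.synchronize()

# Device side: only meaningful inside a kernel, e.g. to wait for child grids
# launched via dynamic parallelism; formerly CUDAnative.synchronize().
function parent_kernel()
    # @cuda dynamic=true child_kernel()   # hypothetical child kernel launch
    device_synchronize()
    return
end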