AMD support is understated?

I’ve seen a lot of posts that are a little down on support for AMD GPUs in Julia. There’s no doubt that the CUDA stack is more mature, but the OpenCL stack seems pretty good. Between OpenCL.jl, CLBlast.jl, and ArrayFire.jl, you’re pretty much covered for scientific computing. It took a bit of digging to find these libraries, and I think this level of support is worth praising a little more prominently!

Am I missing something? Or has AMD support “arrived” for Julia?

Thanks
Matt

You might be interested in @jpsamaroo’s awesome AMDGPUnative.jl. IIUC, it is still in active development, and the higher-level API in particular is not as polished as the various CUDA packages. I have tried the alternatives you mention above, and a lot of them don’t really seem to be maintained anymore or are quite cumbersome to use. There’s definitely nothing as complete as the CUDA ecosystem yet, but it’s actively being worked on. I personally would also love to see SPIR-V support as a more open alternative.

I’m sure folks with less tunnel vision than me will have more to add, but there are still many gaps between the OpenCL-supporting packages:

  • CLArrays is more or less abandoned because of difficulties around codegen. AFAIK custom OpenCL kernels don’t work natively with ArrayFire?
  • ArrayFire lacks many domain-specific methods. For example, no pooling operations or backwards passes for us deep learning folks.

As @simeonschaub mentioned, the up-and-coming AMDGPUnative stack should help immensely in reaching parity with CUDAnative, CuArrays, etc. Unlike ArrayFire, the AMD and CUDA stacks are now able to share common infrastructure (e.g. https://github.com/JuliaGPU/GPUCompiler.jl) as well!

SPIR-V support appears to be a WIP for bringing up oneAPI (see https://github.com/JuliaGPU/GPUCompiler.jl/pull/10 and https://github.com/JuliaGPU/oneAPI.jl). Unfortunately, both AMD and NVIDIA refuse to support it in their compute stacks…

The only thing I’ve gotten working is ArrayFire, and for some reason only when using X as my graphics manager (which is neither my preferred nor my default, but it works). Benchmarks were good, but I agree that there’s some domain-specific stuff missing. I would love to see some more support for AMD; the state of Julia GPU programming support has made me regret not springing for NVIDIA instead.

It’s great to see all the interest here! We’re getting pretty close to having really good support for programming AMD GPUs; the PR to watch is https://github.com/jpsamaroo/ROCArrays.jl/pull/18, which will bring us very close to matching CuArrays in terms of features (not necessarily in terms of stability!). Once that’s merged, I’ll be moving on to add support in the ecosystem, including packages like Flux and projects like DiffEq/SciML. I’ll also be implementing support for KernelAbstractions.jl, which will supersede GPUifyLoops.jl. All of this is funded work and my full-time job, so it’s not a question of if, but when; I’m aiming for good ecosystem support by the end of the summer.

In the meantime, ArrayFire.jl or directly programming kernels through AMDGPUnative.jl is probably the way to go; CLArrays is abandoned and probably won’t be revived without someone funding the effort. OpenCL.jl works pretty well in my experience, and I’m trying to keep it in moderately working order.

Very cool! Yeah, I’ve been watching ROCArrays attentively.

I’m about to purchase a new machine, and I’d like to go with an AMD card so that, between the new machine and my older laptop with a GTX 1060, I’m able to bring GPU support for both team red and team green into my pet project (shameless plug: https://github.com/mdav2/JuES.jl). Knowing that someone is being paid to work on this solidifies my decision to go with AMD.

Also I should note that for my purposes, I basically just need SGEMM and I’m GTG. So my criteria for “support” might be a bit less than people in ML/other fields.
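
For anyone unfamiliar with the term: SGEMM is the single-precision general matrix multiply, `C = alpha*A*B + beta*C` with `Float32` data. Below is a minimal CPU-only sketch of what that operation computes, checked against the BLAS that ships with Julia. The name `sgemm_reference!` is made up for illustration and is not part of any package mentioned in this thread; a GPU backend would need to provide the equivalent operation on its own device arrays.

```julia
using LinearAlgebra

# Naive reference SGEMM (hypothetical helper): C = alpha*A*B + beta*C in Float32.
function sgemm_reference!(C::Matrix{Float32}, A::Matrix{Float32}, B::Matrix{Float32},
                          alpha::Float32 = 1f0, beta::Float32 = 0f0)
    m, k = size(A)
    k2, n = size(B)
    @assert k == k2 && size(C) == (m, n)
    for j in 1:n, i in 1:m
        acc = 0f0
        for p in 1:k
            acc += A[i, p] * B[p, j]
        end
        C[i, j] = alpha * acc + beta * C[i, j]
    end
    return C
end

A = rand(Float32, 4, 3)
B = rand(Float32, 3, 5)
C_ref = sgemm_reference!(zeros(Float32, 4, 5), A, B)

# The same operation through the CPU BLAS bundled with Julia.
C_blas = zeros(Float32, 4, 5)
BLAS.gemm!('N', 'N', 1f0, A, B, 0f0, C_blas)

@assert C_ref ≈ C_blas
```

In everyday code the same call is usually reached through `A * B` or `mul!(C, A, B)`, which dispatch to a BLAS `gemm` for dense `Float32` matrices on the CPU.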

Have you thought about reaching out to AMD for help (maybe they can lend an engineer or something)? It’s in their best interest to have more support for their cards in an up-and-coming PL. Would Flux be the first modern DL framework to have first-class AMD GPU support?

Thank you for all the great work on ROCArrays and the AMDGPU stack! I would help contribute, but alas am stuck with a card just one generation too old for ROCm (GFX6) :confused:

It’s also great to hear that this work is being funded. Does this mean there are folks looking to use ROCm via Julia in production?

rocBLAS supports SGEMM, and we already have bindings for some rocBLAS methods in ROCArrays. Feel free to submit PRs for any methods you need enabled, even if they’re small and/or incomplete.

I have considered that option (and do speak to one of them unofficially on IRC), although I’m not sure how useful they would be right now (they’re pretty much laser-focused on C++, and probably don’t employ many/any Julia programmers). Most of the hard work is pretty well understood (since Tim blazed a solid trail with CUDAnative+CuArrays), so I’ve just gotta put in the time to make things work.

TensorFlow and (IIRC) PyTorch at least have out-of-tree forks with AMD GPU support, but we’d be among the first.

No worries!

This work is funded by US government grants, probably driven by the fact that the US is building at least one Exascale supercomputer with only AMD GPUs (Frontier). So I’d say that there are going to be plenty of researchers needing this software to do their work. Other than that, I’d say that there are plenty of companies that would be willing to use AMD GPUs, if only software support for them were better.

Fellow TN-based @jpsamaroo, I am willing to try Julia with AMD GPU. Should I start with ROCArrays and its test case folder first? Is a laptop with AMD Radeon™ RX 5600M good to test it out?

Hey there! I would start with running AMDGPUnative’s tests, since ROCArrays isn’t yet well-tested or feature complete. It appears that the RX 5600M is a Navi mobile GPU. Unfortunately, AMD hasn’t yet added Navi support to ROCm; however, it’s worth giving it a try to see what happens. For what it’s worth, I currently use an RX 480 and a Vega 56, which are officially supported and work very well.
