Flux with AMD GPU(s)?

I can order that 6700, even if I doubt I will be able to help much, unless the community is so thin that it would make some difference (happy to).
My doubt is that if Flux is so far not able to select an AMD GPU at all, it will be of no use.

Flux support is basically there; we (probably) just need to wire up NNlib with NNlibROC.jl as a weak dependency (a mechanism new in Julia 1.9, which should be present in RC2) so that loading Flux and AMDGPU together causes the right support code to be loaded. @ToucheSir is working on the equivalent for NNlibCUDA.jl: Re-integrate NNlibCUDA as a package extension by ToucheSir · Pull Request #445 · FluxML/NNlib.jl · GitHub.
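For reference, the Julia 1.9 weak-dependency mechanism mentioned here is declared in a package's Project.toml. A rough sketch of what that wiring could look like (the section names `[weakdeps]`/`[extensions]` are the real Pkg conventions, but the extension name below is illustrative, not necessarily what NNlib will use):

```toml
# Sketch of entries in NNlib.jl's Project.toml (illustrative, not the actual file)

[weakdeps]
# AMDGPU.jl's registered UUID (verify against the General registry)
AMDGPU = "21141c5a-9bdb-4563-92ae-f87d6854732e"

[extensions]
# The extension module loads automatically once both NNlib and AMDGPU
# are present in the session; AMDGPU never becomes a hard dependency.
NNlibAMDGPUExt = "AMDGPU"
```

With something like this in place, `using Flux, AMDGPU` would trigger loading of the GPU support code on demand.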

FYI, @pxl-th and I test on RX 6700s, so we can reproduce any bugs or performance issues you might find (and we’ve already squashed quite a few Navi2-related bugs in just the past 2 months). So far, we haven’t found any show-stopping bugs specific to Navi2, so it’s probably a good purchase if you’re looking for a non-Instinct AMD card.

7 Likes

Ok, I expect to have it next week; let's see how I can help.

2 Likes

To add onto what Julian said, the biggest missing piece for running common DL models on AMDGPU is wrapping MIOpen functions for pooling. Once that’s done, my hope is we’ll have the NNlib CUDA extension working and integrating NNlibROC should be trivial. No guarantee anything will be bug-free, but given I also have a 6700 XT there is plenty of desire to get this tested and working soon.

2 Likes

It doesn’t support conda and is limited to Linux (as it is based on ROCm).
The idea was to offer something which is not achievable in other frameworks / languages.

It might be a herculean effort, but it should also bring the ecosystem to the front, as a leader and not just chasing others.

1 Like

It doesn’t support conda

With Julia there are now artifacts for most of the ROCm stack, including MIOpen, so you can just ]add MIOpen_jll and start using it.
I never manually install any of the ROCm packages and rely only on artifacts (same for CUDA).
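As a concrete illustration (a sketch: `is_available()` and `artifact_dir` are the standard names JLLWrappers generates for every JLL package, but check MIOpen_jll itself to be sure):

```julia
using Pkg
Pkg.add("MIOpen_jll")       # same as `]add MIOpen_jll`; downloads the artifact

using MIOpen_jll
MIOpen_jll.is_available()   # true if a prebuilt binary exists for this platform
MIOpen_jll.artifact_dir     # points into ~/.julia/artifacts/..., no system ROCm needed
```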

2 Likes

Just to say that it has all shifted to the beginning of next year. In the meantime I will learn Flux, so I'll be able to contribute a bit more.

1 Like

Hello everyone,

Apologies for the (necro?)bump, but it seems that this is the most recent discussion there has been on Flux for AMD GPUs via ROCm. I own an RX 6600 and I’ve been following AMDGPU.jl for a while now (thanks to this thread; I currently moved to Julia 1.9 to further give it some exercise), so if I may politely ask: how might I go about setting up a stack to test ROCm with?

I have a couple of projects I would like to try working through purely in Julia, but the only question mark is whether I can use Flux with AMDGPU.jl or not. One specific question: would Flux automatically understand that I’m working with AMD silicon, such that all I need to use is something like @cuda (similar to PyTorch)?

I would love to try to test this as much as possible and I read that @jpsamaroo has been testing Flux on RX 6700s, but I’m unsure about how to set up a usage pipeline for myself. Thank you in advance for your time!

Edit: I have Julia 1.9, AMDGPU.jl, Flux.jl and a ROCm stack installed and working on Arch Linux 6.1.9. Should I also perhaps ask the same question as an issue over at Flux’s GitHub repository?

1 Like

We’re currently at the “integrating NNlibROC” step mentioned above. If you want to follow this progress, check out Add AMDGPU extension by pxl-th · Pull Request #470 · FluxML/NNlib.jl · GitHub and any future linked issues/PRs there.

4 Likes

By default AMDGPU.jl will install and use its own ROCm 5.3 stack (which is compatible with Julia 1.9), so you don’t even need to install it manually. You can verify it with:
AMDGPU.versioninfo()

And I suspect that you need to launch Julia with the HSA_OVERRIDE_GFX_VERSION=10.3.0 environment variable, like so:
HSA_OVERRIDE_GFX_VERSION=10.3.0 julia --threads=auto --project=.
This is because ROCm only officially supports Navi 21 (gfx1030) among the Navi 2 GPUs, but yours is Navi 23 (gfx1032).
That said, all Navi 2 GPUs share the same ISA, so you shouldn't notice a difference (I guess).


And as @ToucheSir said, AMDGPU support for Flux is being actively developed and I hope it won’t take long.

3 Likes

Indeed you are correct:

  • using AMDGPU; AMDGPU.versioninfo() does indeed detect my GPU and CPU correctly, though I can’t know for sure whether it’s using its own ROCm stack or the one I installed (5.4).
  • Using the HSA override HSA_OVERRIDE_GFX_VERSION=10.3.0 is necessary and works perfectly fine even though my GPU is gfx1032. FWIW, I also use this for PyTorch.

I have yet to try training a network on my card (and if I understand correctly, it shouldn’t be too dissimilar to PyTorch’s approach for ROCm), but I have to say I am quite amazed at how quickly not only AMDGPU.jl but also its integration with Flux has progressed, thanks to the hard work and hours put in by you kind people. I don’t think it’s that far off either; I look forward to testing a lot of the upcoming releases and helping out where I can.

2 Likes

AMDGPU.versioninfo could probably stand to show whether artifacts (JLLs) or a system install are being used; hence versioninfo: Indicate if using JLLs or System by jpsamaroo · Pull Request #381 · JuliaGPU/AMDGPU.jl · GitHub. If you see paths that point to /home/myusername/.julia/artifacts/..., you’re using JLL-provided ROCm artifacts; otherwise, you’re using a system install of ROCm.

Once AMDGPU support is merged into NNlib, we’ll need to also look at some way of moving Flux’s models to the GPU (which is trivial to implement, we just need a function to do it).
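A hedged sketch of what such a function could look like (nothing here is Flux's actual API; `roc` is a hypothetical name, and running it assumes AMDGPU.jl is installed with a supported GPU present):

```julia
using Flux, AMDGPU

# Hypothetical helper: recursively replace every numeric array in a model
# with a ROCArray, leaving everything else untouched. Flux.fmap (from
# Functors.jl) walks the model's nested structure for us.
roc(m) = Flux.fmap(x -> x isa AbstractArray{<:Number} ? ROCArray(x) : x, m)

model = Chain(Dense(4 => 2, relu), Dense(2 => 1))
gpu_model = roc(model)                      # weights now live on the AMD GPU
y = gpu_model(ROCArray(rand(Float32, 4)))   # forward pass runs on the GPU
```

This mirrors how Flux's `gpu` function works for CUDA, which is presumably why the author calls it trivial to implement.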

1 Like

Apologies for the very late reply; yes, it seems that I’m using JLL-provided artefacts for AMDGPU, since the paths for most of the ROCm requirements point to ~/.julia/artifacts/....

I say “most” because some of them also report as missing (rocSOLVER, rocALUTION, rocFFT and MIOpen), and from what I’ve seen across the AMDGPU and NNlib repos, I understand that these are the ones currently being worked on. So, in short: everything is working as expected!

2 Likes

MIOpen is available as an artifact, but it is not installed by default.
You can ]add MIOpen_jll to install it, but you probably need to ]dev AMDGPU and execute that command from the .julia/dev/AMDGPU directory to trigger the detection (for now).
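Spelled out with Pkg calls instead of REPL-mode shorthand (a sketch of the workflow described above; the exact detection trigger may change):

```julia
using Pkg
Pkg.develop("AMDGPU")     # `]dev AMDGPU`; clones the repo into ~/.julia/dev/AMDGPU

# Then restart Julia with ~/.julia/dev/AMDGPU as the working directory and run:
Pkg.add("MIOpen_jll")     # `]add MIOpen_jll`, so AMDGPU's detection picks it up
```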


You can track this PR, which will add initial support for AMDGPU to Flux.jl.

3 Likes

Oops, I seem to have missed this earlier comment of yours explaining the same thing. As an aside, it’s awesome to see the recent PR under Flux’s repo directly, and also to see that the changes in pull #470 have been merged! Just fantastic!