Math support for GPU kernels

I was hoping you could provide some insight into how support for mathematical functions (such as sin, cos, and sqrt) is incorporated into GPU kernels.

Within the ROCm installation there is a file at `/opt/rocm-6.0.0/llvm/lib/libomptarget-old-amdgpu-gfx90a.bc` which contains mathematical functions. For AMDGPU support in Julia, is it customary to link this file into the IR module?

My curiosity stems from the presence of the OpenLibm project (openlibm.org). However, I presume that this project is not utilized for mathematical support within GPU kernels, correct?

Yes, we currently always link device libraries when compiling GPU kernels.
You can also see where we define support for math functions here.

Actually, I haven't looked at OpenLibm, but we already use device libraries for other things, so it makes sense to use them for math as well.
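
At the LLVM level, "defining support" for a math function essentially means emitting a call to the corresponding OCML entry point and resolving it later by linking the device library. Here is a generic sketch of that idea (not AMDGPU.jl's actual code; the `__ocml_sin_f64` name follows the ROCm device-library naming convention, and everything else is made up for illustration):

```cpp
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Verifier.h"
#include "llvm/Support/raw_ostream.h"

int main() {
    llvm::LLVMContext ctx;
    llvm::Module mod("kernels", ctx);
    mod.setTargetTriple("amdgcn-amd-amdhsa");

    auto *f64 = llvm::Type::getDoubleTy(ctx);
    auto *fnTy = llvm::FunctionType::get(f64, {f64}, false);

    // Declaration only: the definition is supplied later by linking ocml.bc.
    llvm::FunctionCallee ocmlSin = mod.getOrInsertFunction("__ocml_sin_f64", fnTy);

    // A tiny device function that just forwards to the library routine:
    //   double my_sin(double x) { return __ocml_sin_f64(x); }
    auto *fn = llvm::Function::Create(fnTy, llvm::Function::ExternalLinkage,
                                      "my_sin", &mod);
    llvm::IRBuilder<> b(llvm::BasicBlock::Create(ctx, "entry", fn));
    b.CreateRet(b.CreateCall(ocmlSin, {fn->getArg(0)}));

    llvm::verifyModule(mod, &llvm::errs());
    mod.print(llvm::outs(), nullptr);  // prints IR containing the unresolved call
    return 0;
}
```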

Thank you for the clarification. Could you please point me to the specific code where the device library is read? I searched within AMDGPU.jl but could not find where a file such as `libomptarget` would be accessed. Alternatively, do you package an intermediate representation (IR) version of the libdevice and forgo locating the device library at runtime via ROCM_PATH?

A bit of background: Within my library qdp-jit, I dynamically generate GPU kernels using LLVM. Kernels incorporating mathematical functions are linked to the libdevice bundled with the SDK (CUDA/ROCm). While this approach functions smoothly for CUDA, an issue has arisen with ROCm since version 5.5.1, where the bundled device library ceases to function properly. I am currently exploring how colleagues address this challenge.
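
Concretely, the linking step in question looks roughly like this (a simplified sketch, not the actual qdp-jit code; the path and helper name are placeholders, and `LinkOnlyNeeded` keeps the kernel module from pulling in unused library code):

```cpp
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/Linker/Linker.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/raw_ostream.h"
#include <memory>
#include <string>

// Link one ROCm device library (e.g. ocml.bc) into a JIT-generated kernel module.
bool linkDeviceLib(llvm::Module &kernelMod, const std::string &bcPath) {
    llvm::SMDiagnostic err;
    std::unique_ptr<llvm::Module> lib =
        llvm::parseIRFile(bcPath, err, kernelMod.getContext());
    if (!lib) {
        err.print("linkDeviceLib", llvm::errs());
        return false;
    }
    // LinkOnlyNeeded: only the device-library functions the kernel actually
    // calls are copied over. linkModules returns true on error.
    return !llvm::Linker::linkModules(kernelMod, std::move(lib),
                                      llvm::Linker::Flags::LinkOnlyNeeded);
}
```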

Reading of the device libraries happens here; it is called during one of the compilation stages.
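
Roughly speaking, the set of bitcode files that gets linked for a gfx90a target looks like the list below. The names come from ROCm's `amdgcn/bitcode` directory and vary between releases, so treat this as an illustration rather than the exact list we use:

```cpp
#include <string>
#include <vector>

// Typical device-library set for a gfx90a kernel (names may differ per ROCm version).
std::vector<std::string> deviceLibsFor_gfx90a(const std::string &bitcodeDir) {
    return {
        bitcodeDir + "/ocml.bc",                         // math functions
        bitcodeDir + "/ockl.bc",                         // kernel-language helpers
        bitcodeDir + "/oclc_isa_version_90a.bc",         // target-specific constants
        bitcodeDir + "/oclc_wavefrontsize64_on.bc",      // wavefront-size control
        bitcodeDir + "/oclc_finite_only_off.bc",         // math "control variable" files
        bitcodeDir + "/oclc_daz_opt_off.bc",
        bitcodeDir + "/oclc_unsafe_math_off.bc",
        bitcodeDir + "/oclc_correctly_rounded_sqrt_on.bc",
    };
}
// Each of these is parsed and linked with LinkOnlyNeeded, as in the sketch above.
```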

We do locate ROCm libraries at runtime, although there are some exceptions.
At some point we were able to ship almost all of the ROCm stack as artifacts via BinaryBuilder,
meaning users wouldn't need to install ROCm manually; Julia's package manager would handle the installation.
However, maintaining the build recipes proved too burdensome, so for now we again require users to install ROCm manually.
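
In C++ terms, the runtime lookup amounts to something like the sketch below. The `ROCM_PATH` fallback to `/opt/rocm` and the `amdgcn/bitcode` subdirectory are assumptions about recent ROCm layouts, not AMDGPU.jl's actual logic:

```cpp
#include <cstdlib>
#include <filesystem>
#include <string>

// Find the device-library bitcode directory at runtime.
std::string findDeviceLibDir() {
    const char *env = std::getenv("ROCM_PATH");
    std::filesystem::path rocm = env ? env : "/opt/rocm";
    std::filesystem::path dir = rocm / "amdgcn" / "bitcode";
    // Empty string means not found: ask the user to set ROCM_PATH.
    return std::filesystem::is_directory(dir) ? dir.string() : std::string{};
}
```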

I don't recall having issues with the 5.5 device libraries, but we did have a problem where newer ROCm versions started using LLVM 16+ while Julia was still on LLVM 15, so we couldn't read the newer bitcode with the older LLVM.
To fix that, we downgraded the bitcode files to an older LLVM version and are shipping them as artifacts via BinaryBuilder.