I want to create Julia bindings to onnxruntime; see ONNXRunTime.jl. Things already work on my local machine on both CPU and GPU. However, I don’t quite know how to package it so that the GPU also works anywhere. Here is what I have done so far and the things I am uncertain about.
The maintainers of onnxruntime already provide prebuilt binaries for CPU and GPU. I added an Artifacts.toml file that points to these binaries, and for CPU things work.
For GPU, the onnxruntime binaries expect a lot of shared CUDA libraries such as libcudart.so.11.0, libcublasLt.so.11, .... If these are available on the library search path (like on my local machine), things work. However, this is a hassle to set up, so I would prefer to reuse the CUDA binaries from the CUDA.jl project.
I have two problems with this:
1. How do I ensure the CUDA.jl binaries are installed automatically? I don’t want to depend on the CUDA.jl package, I just want to share its artifacts. Should I copy-paste CUDA.jl/Artifacts.toml for this?
2. How do I make sure libonnxruntime finds the CUDA binaries? Should I edit ENV["LD_LIBRARY_PATH"]?
We don’t have a working end-to-end solution yet. The difficult part is that the usable CUDA version depends on your driver’s capabilities, but the artifact system is ignorant of that, so it cannot know which specific library to load. However, if you want to load onnxruntime manually (i.e. pick a compatible artifact based on CUDA.toolkit_version()), you can make sure the required CUDA libraries are available by calling their respective functions in the CUDA module.
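To make the "pick a compatible artifact" step concrete, here is a minimal sketch of what the selection could look like. It is only an illustration: the artifact names and the function pick_onnxruntime_artifact are hypothetical, not part of ONNXRunTime.jl or CUDA.jl, and the version argument is assumed to come from CUDA.toolkit_version() when CUDA.jl is loaded.

```julia
# Hypothetical table: CUDA major version => artifact name in Artifacts.toml.
# The artifact names are made up for this example.
const GPU_BUILDS = Dict(11 => "onnxruntime_gpu_cuda11")
const CPU_BUILD = "onnxruntime_cpu"

# Return the artifact name to load for a given CUDA toolkit version.
# Falls back to the CPU build when there is no toolkit (`nothing`)
# or no GPU build matching the toolkit's major version.
function pick_onnxruntime_artifact(toolkit::Union{VersionNumber,Nothing})
    toolkit === nothing && return CPU_BUILD
    return get(GPU_BUILDS, Int(toolkit.major), CPU_BUILD)
end

# With CUDA.jl loaded, the argument would be CUDA.toolkit_version().
artifact_name = pick_onnxruntime_artifact(v"11.2")
```

Falling back to the CPU build keeps the package usable on machines without a matching toolkit, at the cost of silently losing GPU support; raising an error instead would be an equally reasonable choice.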
CUDA dlopens libraries when they are needed, similar to how most JLLs work, but lazily (again, as it needs to inspect your driver to know which library to load).
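The lazy dlopen pattern described here can be sketched with the Libdl standard library alone. This is not CUDA.jl's actual implementation; lazy_dlopen and load_with_dependencies are hypothetical helpers, and the paths are whatever the caller resolved from its artifacts. The idea is that on Linux the dynamic loader generally reuses an already-loaded library with a matching soname, so preloading the dependencies avoids having to edit LD_LIBRARY_PATH.

```julia
using Libdl

# Cache of library handles, filled on first use (lazy loading).
const HANDLES = Dict{String,Ptr{Cvoid}}()

# Open `path` once with RTLD_GLOBAL so its symbols are visible to libraries
# loaded later; later calls return the cached handle.
# Throws if the library cannot be opened, in which case nothing is cached.
function lazy_dlopen(path::AbstractString)
    get!(HANDLES, String(path)) do
        Libdl.dlopen(path, Libdl.RTLD_LAZY | Libdl.RTLD_GLOBAL)
    end
end

# Preload every dependency in order, then open the main library.
# `main_path` and `dep_paths` are placeholders supplied by the caller.
function load_with_dependencies(main_path::AbstractString, dep_paths)
    foreach(lazy_dlopen, dep_paths)
    return lazy_dlopen(main_path)
end
```

In the onnxruntime case, dep_paths would hold the CUDA libraries it links against and main_path the libonnxruntime binary from the selected artifact.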
CUDA.jl doesn’t use libcudart, so there’s no function to load it. I guess we could add one though, since the library is already packaged as part of the artifact. Not all CUDA libraries are packaged, however, so are there other libraries you need that are missing (besides libcudart)?