I want to conditionally load an external module in my package, without needing the user to load that module.
Normal use of package extensions is not practical for my use case because I don’t want downstream users to need to install and import the external module themselves. I want my package to install and precompile the dependency for them, but only load it when absolutely necessary.
I have been using @JohnnyChen94’s LazyModules.jl, which is great. However, to avoid world-age issues, it requires Base.invokelatest on every call into the module, which incurs a small overhead and prevents type inference.
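To make the inference cost concrete, here is a minimal sketch that compares the inferred return type of a direct call with one routed through Base.invokelatest. The function names (direct_sin, latest_sin, asserted_sin, inferred) are hypothetical and only for illustration, and sin stands in for a function from the lazily loaded module. The third variant adds a return-type assertion, which does not remove the dynamic call but should let callers see a concrete type again.

```julia
# Direct call: the compiler can see the method and infer the return type.
direct_sin(x::Float64) = sin(x)

# invokelatest: the callee is resolved at run time, so the result is not inferred.
latest_sin(x::Float64) = Base.invokelatest(sin, x)

# Same dynamic call, but with an assertion so callers get a concrete type.
asserted_sin(x::Float64) = Base.invokelatest(sin, x)::Float64

# Helper (hypothetical name): inferred return type for a single Float64 argument.
inferred(f) = only(Base.return_types(f, (Float64,)))

@show inferred(direct_sin) inferred(latest_sin) inferred(asserted_sin)
```

The assertion only helps code downstream of the call; the call itself is still dispatched dynamically.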
Is there any way I can avoid this overhead, and still lazily load modules from within a package? Or, equivalently, is there a way I can trigger an extension to load from within a function inside my package?
Other notes:
Since I don’t expect the function to change after the first call of Base.invokelatest, perhaps there is a way to tell the compiler it is free to inline the latest call? I tried recording the world age manually and then using Base.invoke_in_world, as well as Core._call_in_world, but this sadly did not seem to improve things.
Perhaps this is not possible with Julia?
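For reference, one pattern that might amortize the cost is a function barrier: cross into the latest world once, around a larger chunk of work, rather than on every inner call. The sketch below is only an illustration under assumptions. The names sum_apply and lazy_sum are hypothetical, and an anonymous function created with Core.eval stands in for a function from a lazily loaded module. Whether this helps in the Zygote case is not verified here.

```julia
# Hot loop written as an ordinary generic function. It specializes on typeof(f),
# so the calls to f inside it can be statically dispatched.
function sum_apply(f, xs)
    total = 0.0
    for x in xs
        total += f(x)
    end
    return total
end

function lazy_sum(xs)
    # Stand-in for lazy loading: eval defines a brand-new method at run time,
    # in a world newer than the one lazy_sum itself is running in.
    f = Core.eval(@__MODULE__, :(x -> 2x))
    # Calling f(x) directly here would hit a world-age MethodError.
    # A single dynamic call crosses into the latest world; the loop runs there.
    return Base.invokelatest(sum_apply, f, xs)
end

result = lazy_sum([1.0, 2.0, 3.0])
```

With this layout the dynamic-dispatch cost is paid once per call to lazy_sum instead of once per element.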
Attempt:
Here is my current attempt at lazily loading Zygote.jl to compute gradient operators (but only if this section of the code has been triggered). I do not want to always load Zygote, and I do not want users to have to load it themselves.
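    const ZygoteLoaded = Ref(false)
    # ReentrantLock rather than SpinLock: loading a package is slow and may yield,
    # which a spin lock is not meant to be held across.
    const ZygoteLock = ReentrantLock()
    const ZygoteWorld = Ref(UInt64(0))

    function load_zygote()
        ZygoteLoaded[] && return nothing  # fast path once loaded
        lock(ZygoteLock) do
            ZygoteLoaded[] && return nothing  # re-check under the lock
            @eval import Zygote: gradient
            # Record the world in which `gradient` became available
            ZygoteWorld[] = Base.get_world_counter()
            ZygoteLoaded[] = true
            return nothing
        end
    end

    function generate_diff_operators(operators)
        load_zygote()
        diffs = Function[]
        for op in operators
            # Call `gradient` in the world recorded after Zygote was imported
            diff_op(x, y) = Core._call_in_world(ZygoteWorld[], gradient, op, x, y)
            push!(diffs, diff_op)
        end
        return diffs
    end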
This works, but it still seems to incur an overhead in the Core._call_in_world call, presumably because it can’t actually specialize to the particular definition of gradient. (Base.invokelatest is probably equivalent; I just wanted to see if specifying the exact world would help the compiler specialize.)
For example:
    julia> using BenchmarkTools

    julia> diff_ops = generate_diff_operators([+, -, *]);

    julia> @btime diff_ops[1](1.2, 3.2)
      1.146 μs (28 allocations: 672 bytes)
(This is much slower than loading Zygote at the top level and using it normally.)