In my packages, I’d like to import Python packages (currently with PyCall, but I’m happy to switch if there’s a PythonCall solution) as global variables and use these in my functions, but for performance reasons, I want to delay loading the Python package, and even loading PyCall, until these are actually used. Basically something like:
module Foo
__init__() = global np = lazy_pyimport("numpy")
function foo()
    np.random.default_rng().standard_normal(1)
end
# neither PyCall nor numpy should be loaded until foo() is called
end
It’s possible to hack together something that works on simple cases using various @eval and Base.invokelatest calls (see here), but my solution is hacky and brittle, e.g. it doesn’t work on the above case, where multiple getproperty and function calls are involved. The fundamental problem is world age, since it requires an @eval using PyCall.
This seems hard to do; nevertheless, I feel like there’s a smarter solution than what I have. Does anyone have any ideas, or does anything like this exist? Thanks.
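For readers unfamiliar with the world-age problem mentioned above, here is a minimal sketch using only Base (no PyCall). The function names are hypothetical. A method created by `@eval` inside a running function belongs to a newer world than the caller, so a direct call is expected to throw, while `Base.invokelatest` dispatches in the latest world.

```julia
# Hypothetical names throughout; this only illustrates world age, not PyCall.

function define_then_call_directly()
    f = @eval too_new_direct() = 42    # the method is added in a newer world
    return f()                         # caller still runs in the old world: throws
end

function define_then_invokelatest()
    f = @eval too_new_latest() = 42
    return Base.invokelatest(f)        # dispatches in the latest world
end

# Record whether the direct call threw instead of letting the error escape.
direct_failed = try
    define_then_call_directly()
    false
catch
    true
end

latest_result = define_then_invokelatest()

println("direct call failed: ", direct_failed)
println("invokelatest result: ", latest_result)
```

The same thing happens after `@eval using PyCall` inside a function: every method PyCall defines is too new for the caller, which is why chained `getproperty` and call expressions each run into it.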
With either package already loaded, it is pretty straightforward to initialize Python objects lazily.
To load the package itself lazily, you could see what Plots or MLJ do, they both load dependent packages on the fly.
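As a rough sketch of the first point (lazy initialization once the package is already loaded), one option is a small handle that runs its loader on first use. `LazyValue` and `force!` are hypothetical names, not part of PyCall or PythonCall, and a stand-in loader is used so the sketch runs without Python.

```julia
# `LazyValue` and `force!` are hypothetical helpers for illustration only.
mutable struct LazyValue
    loader::Function     # zero-argument function that produces the real object
    value::Any           # cached result, `nothing` until first use
    loaded::Bool
end
LazyValue(loader::Function) = LazyValue(loader, nothing, false)

# Run the loader on first access, then return the cached value.
function force!(lazy::LazyValue)
    if !lazy.loaded
        lazy.value = lazy.loader()
        lazy.loaded = true
    end
    return lazy.value
end

# Stand-in loader that counts how often it runs.
load_count = Ref(0)
handle = LazyValue(() -> (load_count[] += 1; "expensive object"))

force!(handle)
force!(handle)
println("loader ran ", load_count[], " time(s)")
```

With PyCall already loaded, the loader could presumably be `() -> pyimport("numpy")`, with call sites writing `force!(np)` wherever they would otherwise use `np`. This does nothing to delay loading PyCall itself.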
Thanks for the suggestion. I looked at Plots, and in the context of my example above, I think what Plots is basically doing is:
module Foo
using Requires
function init_numpy() # this is like backend(:pyplot)
    @eval using PyCall
    Base.invokelatest() do
        global np = pyimport(:numpy)
    end
end
foo() = Base.invokelatest(_foo) # this is like plot(...)
@init @require PyCall="438e738f-606a-5dbb-bf0a-cddfbfd45ab0" function _foo()
    np.random.default_rng().standard_normal(1)
end
end
although perhaps a Plots.jl expert can confirm since it was a little tough to follow.
I think this works well for Plots because it has a natural “single” entry point, plot, which can be invokelatest’ed, but in my case I don’t have this, and as you can see it’s much wordier (plus it requires a manual call to init_numpy()). So I’m still curious if there’s a better, more clever solution.
Why do you want to delay importing PyCall? Do you expect that many sessions will never need it?
Yea, particularly in large distributed jobs with 100s of workers, none of which need PyCall, just the master process, and where I find that dlopening Python libraries often seems to screw up various Julia libraries (although I admittedly don’t have an MWE for this, just something I’ve tended to run into over time).
What if you just require the user to call some initialization method from top-level if they want to use np stuff? And that function can do the eval and pyimporting. Then you shouldn’t need any invokelatest outside of the calls in that function.
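A minimal sketch of this explicit-initialization idea, with the stdlib `Dates` standing in for PyCall so that it runs without Python. The module and function names are hypothetical. Because `init_backend()` is called from top level, later top-level calls already run in a newer world and can use the loaded package through ordinary calls.

```julia
module LazyBackendDemo   # hypothetical module name

# Holds the lazily loaded module; `Dates` is only a stand-in for PyCall here.
const backend = Ref{Any}(nothing)

function init_backend()
    @eval import Dates          # analogous to `@eval using PyCall`
    backend[] = @eval Dates     # analogous to storing the result of `pyimport`
    return nothing
end

function current_year()
    backend[] === nothing && error("call init_backend() from top level first")
    dates = backend[]
    return dates.year(dates.today())   # ordinary calls, no invokelatest
end

end # module

LazyBackendDemo.init_backend()             # explicit call from top level
println(LazyBackendDemo.current_year())    # this call runs in a newer world
```

In the PyCall version, the `pyimport` call inside the init function would presumably still need `Base.invokelatest`, as the reply says, but functions like `foo()` would not. The trade-off is the manual initialization call.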
Do I take it that you have a package that you’re using on all nodes? You could split the package in two, a master package and a worker package, and only the master package uses PyCall.