In my packages, I’d like to import Python packages (currently with PyCall, but I’m happy to switch if there’s a PythonCall solution) as global variables and use them in my functions. For performance reasons, though, I want to delay loading the Python package, and even PyCall itself, until they are actually used. Basically something like:
__init__() = global np = lazy_pyimport("numpy")
# neither PyCall nor numpy should be loaded until foo() is called
It’s possible to hack together something which works on simple cases using various Base.invokelatests (see here), but my solution is hacky and brittle; e.g. it doesn’t work on the above case, where there are multiple getpropertys and function calls involved. The fundamental problem is world age, since it requires an @eval using PyCall.
This seems hard to do. Nevertheless, I feel like there’s a smarter solution than what I have. Does anyone have any ideas, or does anything like this already exist? Thanks.
With either package already loaded, it is pretty straightforward to initialize Python objects lazily.
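For what it’s worth, here is a minimal sketch of that kind of lazy initialization. Lazy and force! are hypothetical names, not PyCall API, and the loader below is a stand-in NamedTuple so the snippet runs without Python. With PyCall already loaded you would presumably pass () -> pyimport("numpy") as the loader instead.

```julia
# Hypothetical wrapper: runs `loader` once, on first use, and caches the result.
mutable struct Lazy
    loader::Any
    value::Any
    loaded::Bool
end
Lazy(loader) = Lazy(loader, nothing, false)

# getfield/setfield! are used here because getproperty is overloaded below.
function force!(l::Lazy)
    if !getfield(l, :loaded)
        setfield!(l, :value, getfield(l, :loader)())
        setfield!(l, :loaded, true)
    end
    return getfield(l, :value)
end

# Forward every property access to the loaded object, e.g. `np.sum`.
Base.getproperty(l::Lazy, name::Symbol) = getproperty(force!(l), name)

# Stand-in loader (assumption): counts how often it runs and returns a
# NamedTuple in place of the imported Python module.
const load_count = Ref(0)
fake_np = Lazy() do
    load_count[] += 1
    (double = (x -> 2x),)
end

fake_np.double(21)  # first access triggers the loader
fake_np.double(1)   # later accesses reuse the cached value
```

The loader runs only on the first property access. Note that this does nothing to defer loading PyCall itself, which is the harder part of the question.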
To load the package itself lazily, you could look at what Plots or MLJ do; they both load dependent packages on the fly.
Thanks for the suggestion. I looked at Plots, and in the context of my example above I think what Plots is basically doing is:
function init_numpy() # this is like backend(:pyplot)
    @eval using PyCall
    # @eval runs pyimport in the latest world, after PyCall has been loaded
    global np = @eval pyimport("numpy")
end
foo() = Base.invokelatest(_foo) # this is like plot(...)
@init @require PyCall="438e738f-606a-5dbb-bf0a-cddfbfd45ab0" function _foo()
    # ... body that uses np ...
end
although perhaps a Plots.jl expert can confirm, since it was a little tough to follow.
I think this works well for Plots because it has a natural “single” entry point, plot, which can be invokelatest'ed. In my case I don’t have this, and as you can see it’s much wordier (plus it requires a manual call to init_numpy()). So I’m still curious whether there’s a better, more clever solution.
Why do you want to delay importing PyCall? Do you expect that many sessions will never need it?
Yea, particularly in large distributed jobs with 100s of workers, where only the master process needs PyCall and none of the workers do. I also find that dlopening Python libraries often seems to screw up various Julia libraries (although I admittedly don’t have an MWE for this, it’s just something I’ve tended to run into over time).
What if you just require the user to call some initialization method from top level if they want to use np stuff? That function can do the pyimporting. Then you shouldn’t need any invokelatests outside of calls in that function.
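A rough sketch of why that works, using only Base so it runs without Python. Demo, init, backend_sum and foo are made-up names, and the @eval that defines backend_sum is just a stand-in for @eval using PyCall plus the pyimport. The point is that once init() has returned to top level, later top-level calls run in a newer world and see the new definitions directly.

```julia
module Demo

# Stand-in for `@eval using PyCall`: adds a new definition at run time.
function init()
    @eval backend_sum(xs) = sum(xs)
    return nothing
end

# Uses the late-defined function directly, with no invokelatest.
foo(xs) = backend_sum(xs)

end # module

Demo.init()           # called from top level, once
Demo.foo([1, 2, 3])   # a later top-level call sees backend_sum
```

What this does not cover is calling foo from inside the same function that called init(); that call would still be running in the old world, which is where invokelatest comes back in.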
Do I take it that you have a package that you’re using on all nodes? You could split the package in two, a master package and a worker package, and only the master package uses PyCall.