Lazy global pyimports

In my packages, I’d like to import Python packages (currently with PyCall, but happy to switch if there's a PythonCall solution) as global variables and use these in my functions. For performance reasons, though, I want to delay loading the Python package, and even loading PyCall itself, until they are actually used. Basically something like:

module Foo

__init__() = global np = lazy_pyimport("numpy")

function foo()
    np.random.default_rng().standard_normal(1)
end

# neither PyCall nor numpy should be loaded until foo() is called

end

It's possible to hack together something that works in simple cases using various @evals and Base.invokelatests (see here), but my solution is hacky and brittle, e.g. it doesn’t work in the above case, where there are multiple getpropertys and function calls involved. The fundamental problem is world age: loading the package requires an @eval using PyCall, so the methods it defines are newer than any code that is already running.

This seems hard to do; nevertheless, I feel like there’s a smarter solution than what I have. Does anyone have any ideas, or does anything like this already exist? Thanks.

With either package already loaded it is pretty straightforward to initialize python objects lazily.
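A minimal sketch of what I mean, assuming the Python package (PyCall here) is already loaded. `Lazy` and `force!` are hypothetical names, not part of PyCall, and the loader below is a stand-in NamedTuple so the snippet runs without Python; in real use it would be something like `() -> pyimport("numpy")`.

```julia
# Hypothetical wrapper: runs `load()` on first property access and caches the result.
mutable struct Lazy{F}
    load::F
    value::Any
    loaded::Bool
end
Lazy(load) = Lazy(load, nothing, false)

# Use getfield/setfield! internally so the wrapper's own field names never
# shadow attributes of the wrapped object (e.g. numpy's `load`).
function force!(l::Lazy)
    if !getfield(l, :loaded)
        setfield!(l, :value, getfield(l, :load)())
        setfield!(l, :loaded, true)
    end
    return getfield(l, :value)
end

# Forward every property access to the wrapped object, loading it if needed.
Base.getproperty(l::Lazy, name::Symbol) = getproperty(force!(l), name)

# Stand-in loader (assumption): counts how often it runs.
load_count = Ref(0)
np = Lazy(() -> (load_count[] += 1; (name = "stand-in module", version = "0.0")))

np.name      # first access triggers the load
np.version   # second access reuses the cached object
```

The result is type-unstable (`value::Any`), which probably doesn't matter next to the cost of a Python call. It does nothing about loading PyCall itself lazily.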

To load the package itself lazily, you could see what Plots or MLJ do; they both load dependent packages on the fly.

Thanks for the suggestion. I looked at Plots, and in the context of my example above I think what Plots is basically doing is:

module Foo

using Requires

function init_numpy() # this is like backend(:pyplot)
    @eval using PyCall
    Base.invokelatest() do
        global np = pyimport(:numpy)
    end
end

foo() = Base.invokelatest(_foo) # this is like plot(...)

@init @require PyCall="438e738f-606a-5dbb-bf0a-cddfbfd45ab0" function _foo()
    np.random.default_rng().standard_normal(1)
end

end

Perhaps a Plots.jl expert can confirm, since it was a little tough to follow.

I think this works well for Plots because it has a natural “single” entrypoint, plot, which can be invokelatest’ed. In my case I don’t have one, and as you can see it's much wordier (plus it requires a manual call to init_numpy()). So I’m still curious whether there’s a cleverer solution.

Why do you want to delay importing PyCall? Do you expect that many sessions will never need it?

Yeah, particularly in large distributed jobs with 100s of workers, where only the master process needs PyCall and none of the workers do. I also find that dlopening Python libraries often seems to screw up various Julia libraries (although I admittedly don't have an MWE for this, it's just something I’ve tended to run into over time).

What if you just require the user to call some initialization method from top-level if they want to use np stuff? And that function can do the eval and pyimporting. Then you shouldn’t need any invokelatests outside of calls in that function.
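Roughly like this. It's only a sketch: I'm using the Dates stdlib as a stand-in for PyCall so it runs anywhere, and `LazyDemo`, `init_backend`, and `current_year` are made-up names. In your case `init_backend` would do `@eval using PyCall` and then assign `np`.

```julia
module LazyDemo

# Must be called from top level (REPL or script), not from inside another
# function, so that the world age advances before the backend is used.
function init_backend()
    @eval using Dates   # stand-in for `@eval using PyCall`
    return nothing
end

# No invokelatest needed here, as long as init_backend() was called from
# top level before this function is first called.
current_year() = Dates.year(Dates.today())

end # module

LazyDemo.init_backend()   # explicit, one-time opt-in at top level
LazyDemo.current_year()
```

The catch is the same as with init_numpy() above: if init_backend() and current_year() are both called from inside one running function, you are back to the world-age problem and need invokelatest again.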

Do I take it that you have a package that you’re using on all nodes? You could split the package in two, a master package and a worker package, and only the master package uses PyCall.