I need to call legacy Python 2 code from Julia, but I also do all my plotting with PyPlot/matplotlib and so want a modern Python 3 matplotlib. I want to do all of this from a single Julia session so I can do exploratory work in a notebook.
Are there any smart ways to go about this? It seems like PyCall can only be linked with one Python version at a time in a given session (which seems totally reasonable). I have been using execnet to call Python 2 from Python 3, but it's still pretty clunky. Any other suggestions? Thanks.
I wonder if you can use Distributed to create a second instance and configure its PyCall differently from the master instance. Barring that, I would probably create two Julia instances and have them communicate over TCP, i.e. a client/server model where the client uses one version of Python and the server uses the other.
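A minimal sketch of the Distributed half of that idea, showing only the communication pattern. `legacy_compute` is a hypothetical stand-in for the Python 2 call, and nothing here touches PyCall itself:

```julia
using Distributed

# Start one extra local Julia process; addprocs returns a vector of worker ids.
worker_id, = addprocs(1)

# Hypothetical stand-in for the legacy call. It is defined on every process
# here for simplicity; in practice its body would use the worker's PyCall.
@everywhere legacy_compute(x) = x .^ 2

# Run the function on the worker and copy the result back to the main process.
result = remotecall_fetch(legacy_compute, worker_id, [1, 2, 3])

# Shut the worker down once it is no longer needed.
rmprocs(worker_id)
```

Arguments and return values are serialized between the two processes, so plain Julia arrays and structs move across without extra glue code.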
Thanks, that’s a great idea! Moving objects around in Julia is much nicer than with the clunkier execnet, so it's an overall win.
A downside is that PyCall has to be built/precompiled each time we launch, which adds ~10 seconds to startup. It would certainly be great if two built versions could be stored separately, but for now it's a fine trade-off for me. Here’s a first attempt that basically works:
using Distributed
using PyCall
using Pkg
id_py2worker, = addprocs(1, restrict=true)
# launch our Python 2 worker and build PyCall with Python 2
@everywhere id_py2worker begin
    ENV["PYTHON"] = "python2"
    using Pkg
    Pkg.build("PyCall")
    using PyCall
end
# in background, rebuild PyCall back to the original version (the py2worker has already
# loaded Python 2, so that will stick)
remotecall((orig_python)->begin
    ENV["PYTHON"] = orig_python
    Pkg.build("PyCall")
end, id_py2worker, PyCall.python)
One problem is that if Revise is already loaded on the main process, building back the original Python will cause the Python 2 workers to update and in fact segfault. I can’t figure out how to stop that from happening. (This issue could be one solution)
It should be possible to clone a copy of PyCall, install it as a new package with a different name (e.g. PyCall3), and configure it with a different version of Python. Then you can import both PyCall and PyCall3 in the same Julia process.
Thanks. This would be even better, but is there any programmatic way to set something like this up? Or would anyone else using my code have to also do it by hand (which sounds not entirely trivial)?
The easiest thing is probably for you to post a fork of PyCall as “PyCall2” or whatever and tell your users to add it. The hardest thing to automate, of course, is the process of setting up Python itself. (The Conda package only lets you install either Python 2 or Python 3 at one time. Of course, you could create a Conda2 fork that defaults to Python 2, and make your PyCall2 fork depend on Conda2.)
I think you can also create a sysimage with Python 2 and pass it to a Julia subprocess via the --sysimage flag. This would handle the case where you need to use packages that depend on PyCall configured with Python 2.
I don’t have to fork anything or edit PyCall’s source to rename anything.
Users don’t have to do anything beyond standard Julia package installation.
No unnecessary recompiles get triggered.
I think basically this achieves that. What it sets up for you is that you have the latest version of PyCall in your main environment built for Python 3, and in a separate environment it installs an older version of PyCall and builds it for Python 2. Then it spawns a subprocess Julia running in this other environment and communicates using remote calls. Thanks to pull/32651 (so you do need to be on master, for now), both versions can be precompiled so no recompilation is triggered as you run the two environments.
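The subprocess part of that setup can be sketched with Distributed alone. This is only an illustration under stated assumptions, not necessarily how the package does it: `env_dir` is a hypothetical temporary stand-in for the environment holding the Python 2 build of PyCall, and nothing here installs or builds PyCall.

```julia
using Distributed

# Hypothetical stand-in for the separate Python 2 environment.
# An empty Project.toml is enough to make the directory a valid project.
env_dir = mktempdir()
touch(joinpath(env_dir, "Project.toml"))

# Launch a worker whose active project is the separate environment,
# so packages loaded there resolve against that environment's versions.
worker_id, = addprocs(1; exeflags = "--project=$env_dir")

# Ask the worker which project file it is actually using.
worker_project = remotecall_fetch(Base.active_project, worker_id)

rmprocs(worker_id)
```

Because the main process and the worker each resolve packages against their own project, each side can load a different PyCall version without rebuilding the other.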
Not planning to register this for now, but happy if anyone uses / contributes / critiques this solution.