I am enjoying how fast & expressive it feels compared to some other languages I have used before. I still rely a lot on Python for certain libraries & workflows. I have been exploring ways to combine the two but I am not sure what the best practices are when it comes to performance, package management & keeping things clean when switching back & forth.
Is PyCall still the go-to approach for most people, or are there newer, more efficient ways to integrate Python code into Julia projects? I would also like to know how others organize their projects when both languages are involved: do you keep separate environments or mix them within the same workflow?
I came across a Ruby On Rails course that mentioned how people often bridge different tools in a single project & I want to know about how Julia folks think about this balance.
Without commenting on some of the bigger-picture questions posed here: specifically with respect to Python interop, I think PythonCall.jl (and its paired Python package juliacall, which allows calling Julia from Python) is probably a better choice now than PyCall.jl.
Is PyCall still the go-to approach for most people or are there newer
I would say no. PythonCall.jl is now, if not the mainstream, then the future. PyCall is still used by some Python package wrappers, so you might end up using both at the same time without even knowing it. See the docs on whether that is still a problem.
previously used PyCall.jl but switched to PythonCall.jl (on my recommendation).
You need to decide which is your main language to call from. If it is Python, you install the associated juliacall Python package and do something like:
from juliacall import Main as jl
That way you can, e.g., use Django, or whatever Python uses for GUIs or even SQL, and call Julia for some of the work. (Julia has GUI, web and SQL capabilities too, though maybe less developed, if you want Julia as the main language.)
Note there is something similar for Ruby [on Rails]:
gem install jl4rb
I don’t know how good it is, whether it is in-process like the other solutions for Python, or whether calling in both directions is possible. I knew of this years ago, but I rarely, if ever, see Ruby discussed for interop with Julia. It might still just work.
One thing of note is that Python dicts are by now ordered (which I and they consider convenient), while Julia’s built-in Dict isn’t ordered (to be faster, or so as not to rule out such algorithms). An ordered version is available in OrderedCollections.jl, and I’ve tried to get it to be the new default in Julia…
Anything you build in pure Python will be way slower than in Julia (NumPy, Numba etc. compensate, but not fully).
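To make the ordering point concrete, here is a minimal Python sketch (the key names are made up for illustration). Since Python 3.7, insertion order is a language guarantee for dict, so iteration follows the order in which keys were added; Julia’s built-in Dict makes no such promise, and OrderedDict from OrderedCollections.jl is the closest equivalent there.

```python
# Plain dicts keep insertion order (a language guarantee since Python 3.7).
settings = {}
settings["zeta"] = 1
settings["alpha"] = 2
settings["mid"] = 3

# Iterating a dict yields keys in the order they were inserted,
# not in sorted or hash order.
keys_in_order = list(settings)

# Deleting a key and adding it again moves it to the end.
del settings["zeta"]
settings["zeta"] = 4
keys_after_readd = list(settings)
```

If code ported from Python relies on this iteration order, that is worth keeping in mind when choosing between Dict and OrderedDict on the Julia side.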
What do you have in mind, can you even provide a link to it?
I still use PyCall because I find it easier and some packages I use rely on it. It installs one Python version for all of your packages per Julia version in .julia/conda.
If you use PyPlot for plotting, be aware that this can cause problems when you are using multi-threading in Julia. Workaround: do the plotting in a separate process using DistributedNext.jl.
There was a version conflict between Julia and PyPlot recently, which can be solved by executing:
using CondaPkg
CondaPkg.add("libstdcxx", version="<14.0")
That’s your preference, and it’s OK. PythonCall is easier to set up(?); PyCall is not too hard, but not automatic. As for the latter point, that some packages rely on it isn’t an argument against using PythonCall (too), is it?
You mean it’s easier after installation? The code you write, or is it just about what you’re used to? I really want to point people to the right solution. I think PythonCall is the future, but I didn’t want to go into all the pros and cons of each.
One difference is that PythonCall maintains one Python environment per Julia project, PyCall one for all Julia projects of one major Julia version. What is better really depends on your specific needs.
I don’t use Python-Julia interop much, so I’m not very good at it, but I prefer how PythonCall handles, or rather lets us handle, the types linking both sides. That might be a matter of opinion, though; I just found more automatic conversions to be more of a burden than a feature.
Using PythonCall to call from Julia into Python generally works as you would expect, even with multithreading on the Julia side. One thing to be aware of is that the inverse does not work. I.e., if you create threads on the Python side that call into Julia via JuliaCall, your program will hang. This is because you can currently only call into Julia from the Python thread that imported juliacall, which is the one that called jl_init. See issue #578 in JuliaPy/PythonCall.jl ("juliacall: Julia GC triggered from a python thread causes hangs") for more info. Beyond that, you can actually call back and forth between Python and Julia as long as you make sure to manage the GIL correctly.
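One way to live with the single-thread restriction on the Python side is to route every Julia call through one dedicated worker thread. The following is only a sketch of that routing pattern, not something taken from the PythonCall docs: `_fake_julia_call` is a hypothetical stand-in for a real call such as `jl.seval(...)`, and `call_julia` and `run_from_many_threads` are made-up helper names. I have not verified that this avoids every hang described in the issue (the GC can still be triggered elsewhere), so treat it as a starting point.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# A pool with exactly one worker: every submitted task runs on the same thread.
_julia_thread = ThreadPoolExecutor(max_workers=1, thread_name_prefix="julia")


def _fake_julia_call(x):
    # Hypothetical stand-in for a real call into Julia (e.g. jl.seval(...)).
    # It also returns the id of the thread it ran on, so routing can be checked.
    return x * 2, threading.get_ident()


def call_julia(x):
    """Hand the work to the single Julia-owning thread and wait for the result."""
    return _julia_thread.submit(_fake_julia_call, x).result()


def run_from_many_threads(n=4):
    """Call call_julia from n separate Python threads and collect the results."""
    results = [None] * n

    def worker(i):
        results[i] = call_julia(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = run_from_many_threads()
values = [value for value, _ in results]
owner_threads = {ident for _, ident in results}  # ends up with a single thread id
```

In a real setup, `from juliacall import Main as jl` would presumably also have to run inside that worker thread (for example through the `initializer` argument of `ThreadPoolExecutor`), since the import is what initializes Julia.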
For projects that started as Python and call into Julia, one other caveat is that it’s rather difficult to deploy them together as a WSGI service in the same process. Both uwsgi and gunicorn interact with signals and threading in a way that doesn’t play nicely with Julia, and that isn’t trivial to patch. It’s best to split Julia into its own process and communicate between Python and Julia using sockets and shared memory. ProtoBuf.jl and InterProcessCommunication.jl are the MVPs here. If you are trying to deploy a Julia project that needs to call Python, presumably you don’t need to split Python into its own process and can just rely on PythonCall.
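For the socket half of that split, a rough sketch of what the Python side could look like is below. Everything specific in it is an assumption for illustration: the 4-byte length prefix, the message fields (`op`, `args`, `result`), and JSON as the payload format, which merely stands in for Protocol Buffers so the example runs with the standard library alone. `socket.socketpair()` plays the role of the connection to the Julia process; a real Julia service would have to implement the same framing on its end.

```python
import json
import socket
import struct


def send_msg(sock, payload):
    """Send one message: a 4-byte big-endian length, then the JSON body."""
    body = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack(">I", len(body)) + body)


def _recv_exact(sock, n):
    """Read exactly n bytes; recv() may return fewer than requested."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Receive one length-prefixed JSON message and decode it."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))


# Demo: a connected socket pair stands in for the Python <-> Julia link.
python_side, julia_side = socket.socketpair()

send_msg(python_side, {"op": "sum", "args": [1, 2, 3]})  # request from Python
request = recv_msg(julia_side)                           # what Julia would read
send_msg(julia_side, {"result": sum(request["args"])})   # simulated Julia reply
reply = recv_msg(python_side)

python_side.close()
julia_side.close()
```

The length prefix is there because a stream socket has no message boundaries of its own; whichever serialization you pick, both processes need some agreed way to tell where one message ends.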
It’s interesting to hear that comment about multithreading. We tried to use PythonCall with the Azure Python SDK to communicate with the Azure Service Bus from a Julia app. But when the app tried to send or receive messages from another thread (basically Threads.@spawn on some function that calls into the SDK), there were segfaults. We couldn’t figure out what the problem was, so we put the PythonCall stuff in a separate process via the Distributed stdlib and sent the workloads there via RemoteChannel, so the main process could run the web server while the other process handled communication with the message bus. It was quite hard to maintain, especially since we needed a custom Python installation. Since then we have changed to a different solution using dapr. I’d still love to hear how others handle use cases like this and what some best practices are, e.g. when you have a multithreaded app and need to call into different Python libraries from different threads.
If you experience segfaults when using multithreaded Julia, those are probably benign signals used by Julia’s garbage collector in multithreaded mode, but they’re caught by Python, which doesn’t know what to do with them and crashes. One way to check whether this is indeed the case is to disable the garbage collector (warning: don’t do this in production! Of course this can lead to very large memory leaks and crash the entire system!) and see if you still get the same segfaults under the same conditions. If not, then this is the culprit.
Benign segfault signals are used by the garbage collector to stop the world in multithreaded mode. Julia knows how to handle them, Python doesn’t unless someone tells it how to (and I don’t think anyone has done it).