PythonCall & JuliaCall
I’m very pleased to finally announce these packages on here. They have existed for quite some time, but I’m now happy to encourage more users to try them out.
PythonCall is a Julia package to interoperate with Python, so for example you can do:
using PythonCall
plt = pyimport("matplotlib.pyplot")
plt.plot(randn(100))
It can be installed with pkg> add PythonCall.
JuliaCall is a corresponding Python package to interoperate with Julia, so for example:
from juliacall import Main as jl
import numpy as np
jl.seval("using Plots")
jl.plot(np.random.randn(100))
It can be installed with pip install juliacall.
In what follows, I’ll introduce some of the cool features of these packages. Most of you will be aware of the similar packages PyCall and PyJulia, which have existed much longer, so there will inevitably be some comparison with these.
Extensible multimedia support
PythonCall knows how to display anything that IPython knows how to display (via _repr_mimebundle_ and friends), and PyCall can too. Unlike PyCall, it can also display Matplotlib figures, and what’s more you can add more display rules.
Hence in Pluto you can do
plt.plot(rand(100))
plt.gcf()
and the plot (returned by gcf) is displayed.
On the other side, JuliaCall knows how to display Julia objects in IPython.
Flexible and extensible conversions
PythonCall has a very flexible function pyconvert(T, x) which converts the Python object x to a Julia object of type T. Unlike PyCall (which similarly has convert(T, x)), this function can take into account both the Python type(x) and the Julia type T, which means a richer set of conversions is possible.
For example, in PyCall if you call convert(Vector{UInt8}, x) where x is a Python list of int, it will fail because it is expecting a bytes. In PythonCall pyconvert(Vector{UInt8}, x) will succeed either way: it has rules for list and bytes and selects the most specific conversion rule applicable to the inputs.
You can even do something like pyconvert(Vector{<:Real}, x) and automatically get back a Vector{Int} or Vector{Float64} or whatever is appropriate for the items in x.
This system is extensible, so packages can add more rules for different (T, type(x)) pairs.
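To make the idea of rule selection concrete, here is a toy model in plain Python. It is not PythonCall's implementation (which is written in Julia and dispatches on Julia types as well); the names RULES, add_rule, convert and the "byte_vector" target label are all invented for this illustration. It only shows how a table of rules keyed on (source type, target) can pick the most specific applicable rule.

```python
# Toy model of rule-based conversion. NOT PythonCall's real machinery:
# every name here is hypothetical and exists only for illustration.

RULES = {}  # maps (source Python type, target label) -> converter function


def add_rule(source_type, target, func):
    """Register a converter for one (source type, target) pair."""
    RULES[(source_type, target)] = func


def convert(target, x):
    """Convert x using the rule for the most specific matching source type."""
    # type(x).__mro__ lists classes from most to least specific,
    # so the first hit is the most specific applicable rule.
    for klass in type(x).__mro__:
        rule = RULES.get((klass, target))
        if rule is not None:
            return rule(x)
    raise TypeError(f"no rule to convert {type(x).__name__} to {target}")


# Two rules for the same target, one per source type.
add_rule(bytes, "byte_vector", lambda b: bytearray(b))
add_rule(list, "byte_vector", lambda items: bytearray(items))

from_list = convert("byte_vector", [1, 2, 3])
from_bytes = convert("byte_vector", b"\x01\x02\x03")
print(from_list == from_bytes)  # True
```

A package wanting to support a new source type would, in this toy model, just call add_rule again; that mirrors the extensibility described above without claiming anything about the real registration API.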
Predictable syntax
PyCall has some behaviours that make it hard to predict.
For example x[0] does not do what you think! Firstly, it gives a deprecation warning because get(x, 0) is the proper PyCall syntax for indexing. Secondly, it actually gets the item at index -1, which is supposed to be a convenience to compensate for Python indexing being 0-based and Julia being 1-based. But if for example x is a dict then this is not what you want (I guess that’s why it is deprecated).
In PythonCall, x[0] just gets the item at index 0 no matter what x is.
For another example, PyCall eagerly converts results to Julia objects. This means that sys.path.append("/some/path") will not work because sys.path is immediately converted to a Vector. You might try push!(sys.path, "/some/path") but since the Vector is a copy of sys.path it does not actually mutate the original sys.path. To overcome this, PyCall has the syntax sys."path"."append"("/some/path") to prevent this eager conversion.
In PythonCall, sys.path.append("/some/path") does exactly what you intend. This is because most operations on Python objects return Python objects instead of converting them. If you actually need to convert anything to Julia you can use pyconvert.
As a side-effect, operations in PythonCall are type-inferrable (they mostly return Py) whereas operations in PyCall are not (they can return anything) so PyCall code can be type-unstable if you are not careful.
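The sys.path pitfall comes down to copy versus reference semantics, which can be shown in plain Python without either package. In this sketch eager_convert is a made-up helper that models eager conversion as a copy, and search_path is a stand-in for sys.path; neither is part of PyCall or PythonCall.

```python
def eager_convert(container):
    """Model eager conversion as copying the data into a new list."""
    return list(container)


search_path = ["/usr/lib/python"]  # hypothetical stand-in for sys.path

# Eager conversion: we mutate a copy, so the original never sees the change.
converted = eager_convert(search_path)
converted.append("/some/path")
length_after_copy_edit = len(search_path)  # still 1

# No conversion: we hold a reference, so the change lands on the original.
handle = search_path
handle.append("/some/path")
length_after_reference_edit = len(search_path)  # now 2
```

The second half is the behaviour the post describes for PythonCall: the result of an attribute access stays a handle to the Python object, so mutating it mutates the real thing.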
Non-copying conversions
By default, any mutable objects passed between Python and Julia are converted without copying any data; that is, they lazily wrap the original object. This makes conversion super fast for large containers, and means that if the converted container is mutated, then the changes also appear on the original object.
A small number of immutable types (such as booleans, numbers, strings and tuples) are converted to the native types, e.g. a Julia Int64 becomes a Python int.
In the Julia-to-Python direction, Julia objects are wrapped as a juliacall.AnyValue. Some objects are wrapped to a subtype of this. For example any AbstractVector is wrapped as a juliacall.VectorValue which satisfies the sequence interface and behaves pretty much like a list:
x = [1,2,3] # a Julia vector
y = Py(x) # wrap as a juliacall.VectorValue
y.append(4) # mutating y also mutates x
println(x) # [1, 2, 3, 4]
If you actually want a list you can do pylist([1,2,3]).
In the Python-to-Julia direction, mutable Python objects are typically left as Python objects. Again, some objects are wrapped differently:
x = pylist([1,2,3]) # a Python list
y = pyconvert(AbstractArray, x) # wrap as a PyList{Int}
push!(y, 4) # mutating y also mutates x
println(x) # [1, 2, 3, 4]
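For readers wondering what "wrapping without copying" looks like mechanically, here is a minimal pure-Python sketch. WrappedSequence is a hypothetical class written for this post, not the actual juliacall.VectorValue or PyList; it just shows a wrapper that satisfies the mutable sequence interface while forwarding every operation to the object it wraps.

```python
from collections.abc import MutableSequence


class WrappedSequence(MutableSequence):
    """Hypothetical wrapper: holds a reference to the backing object, no copy."""

    def __init__(self, backing):
        self._backing = backing  # keep a reference only

    def __len__(self):
        return len(self._backing)

    def __getitem__(self, index):
        return self._backing[index]

    def __setitem__(self, index, value):
        self._backing[index] = value

    def __delitem__(self, index):
        del self._backing[index]

    def insert(self, index, value):
        self._backing.insert(index, value)


x = [1, 2, 3]            # the original container
y = WrappedSequence(x)   # constant-time wrap, however large x is
y.append(4)              # append comes from the MutableSequence mixin
y[0] = 10
print(x)                 # [10, 2, 3, 4]
```

Because the wrapper only stores a reference, creating it costs the same for three items or three million, and every mutation through y is visible on x.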
Array conversion
Particularly of note is that if x is a strided Julia array then it will be wrapped in Python to a juliacall.ArrayValue which satisfies the buffer protocol and Numpy array interface. This means that a Vector{UInt8} can be passed to any function expecting a bytes-like object, and a Vector{Float64} can be passed to any function expecting a Numpy-array-like object. In particular numpy.array(x) will convert it to an actual Numpy array.
In the other direction, if x is a bytes or numpy.ndarray (or anything satisfying the buffer protocol or array interface) then PyArray(x) gives an AbstractArray view of the data.
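The buffer protocol is what makes these zero-copy views possible. The following standard-library sketch shows the protocol itself, with array.array and bytearray standing in for the memory owner; it does not use juliacall.ArrayValue or PyArray, and the variable names are only illustrative.

```python
import array

# An object that owns a contiguous block of float64 values.
data = array.array("d", [1.0, 2.0, 3.0])

# memoryview asks for the buffer and exposes it without copying.
view = memoryview(data)
print(view.format, view.itemsize, len(view))  # d 8 3

# Writing through the view changes the owner's memory.
view[0] = 10.0
print(data[0])  # 10.0

# The same works for byte buffers, which is why a bytes-like
# object can be handed to anything that accepts the protocol.
raw = bytearray(b"abc")
byte_view = memoryview(raw)
byte_view[0] = ord("x")
print(raw)  # bytearray(b'xbc')
```

Any consumer that accepts the buffer protocol sees the element format, item size and shape through this interface, which is how an array from one language can be read in place by the other.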
Tabular data
If x is a Julia table (in the Tables.jl sense) then pytable(x) will convert it to a Pandas dataframe. You can ask for other output formats, such as dict of list.
If x is a Python table (for now only Pandas dataframes are supported) then PyTable(x) wraps it as a Julia table.
Isolated dependencies
All the Python dependencies for PythonCall are (by default) managed by CondaPkg, which I have announced separately.
If your project needs Numpy you can simply do
pkg> conda add numpy
before loading PythonCall. Then a Conda environment is created containing Numpy. This environment is specific to your Julia project, so dependencies are totally isolated between projects.
This also creates a CondaPkg.toml file recording the dependencies (analogous to Project.toml) so if you save it to your package, then any users of the package also get these dependencies installed.
JuliaCall similarly uses a new package JuliaPkg to manage its dependencies. If you are using a Python virtual environment or Conda environment, then a Julia project specific to that is used, again keeping dependencies totally isolated.
In all cases, Python or Julia are automatically installed if needed, meaning that packages depending on PythonCall or JuliaCall can be used with zero set-up. They are installed to an environment-specific location, so that removing the environment also removes any dependencies.
Use different Pythons without rebuilding
PyCall currently hard-codes the path to libpython in its build step. This means that if you need to use different versions of Python, then you need to rebuild PyCall each time you switch.
PythonCall has no build step. You can start multiple Julia sessions in multiple projects each requiring a different version of Python, and PythonCall will work fine in all of them.
JuliaCall is teeny
It pretty much consists of this one file, just 137 lines long. This is because most of the implementation is in PythonCall, and all JuliaCall needs to do is find Julia and get it to import PythonCall.
The relevance of this is that you get a very consistent experience between PythonCall and JuliaCall. All the conversions work the same in both directions from either package. Since JuliaCall is bundled into PythonCall, any Python package can do import juliacall and it will work properly regardless of whether it is running in Python itself or from PythonCall in Julia.