Pyjulia, Django, and segfaults

I work at a very small (3 devs) startup. Our backend is Python / Django. We have some computation-intensive code for demand estimation which we’ve written in Julia, and we want to integrate that code into our Django backend and expose it via an endpoint.

So, in our Django project we have an app demand_estimation which lives alongside the usual suspects of db, util, docs, etc. Inside the demand_estimation directory there is:

  • DemandEstimation.jl, which defines module DemandEstimation and contains the core logic;
  • demand_estimation.py, which wraps the Julia-side core logic with pyjulia (and defines endpoints);
  • __init__.py, which looks like:
# set up Julia sans precompilation (necessary for conda python)
from julia.api import Julia

Julia(compiled_modules=False)
# load julia demand estimation code
from julia import Pkg

Pkg.activate(".")  # django backend environment
from julia import Main

Main.include("demand_estimation/DemandEstimation.jl")

from julia.Main import DemandEstimation as DE  # noqa

Then in demand_estimation.py we have from . import DE and make calls like DE.important_function(args). This works well…in tests.

However, when we actually try to run a debug/test/development server with Django’s runserver command, there are segfaults left and right. In particular, we’ve traced the segfaults to the following code in demand_estimation.py:

from . import DE # DE is DemandEstimation julia module defined in the __init__.py

def call_something_in_julia(arg):
    func_to_call = DE.important_func # this line causes a segfault
    return func_to_call(arg)

To reiterate, a Django test that hits call_something_in_julia will pass without error; hitting a running server’s endpoint that in turn calls call_something_in_julia causes a segfault.

Any idea on what might be causing this segfault, how to mitigate it, and/or why it happens only when running a server and not in tests? We’re pretty much at wit’s end and considering just abandoning the backend Julia, which is a shame for lots of reasons, not least because I greatly prefer writing Julia over Python!

Responding to my own post in case someone searches for a similar problem. It seems we are running into trouble because libjulia is not thread-safe. Tentatively, running the Django development server with --nothreading --noreload eliminates the segfaults. Presumably the Django tests are single-threaded, which would explain why they pass.

Hi, we have a similar app set-up and are also seeing nebulous (periodic) segfaults: did you figure out how to avoid them in production? (We are using gunicorn in production).

The switch to --nothreading seemed to fix the segfaults at the time. Shortly after this thread we switched to running the Julia code in its own microservice, so I’m not up to date on Django + Gunicorn + PyJulia compatibility. Sorry!

Ok thank you @evanfields! This thread did get me on the right track: our issue also seems to be that PyJulia is not thread-safe (and we are using Celery with Django to run some tasks in parallel that use PyJulia).

Of course it depends on your architecture, but in the world of microservices it could make sense to implement the important Julia functions as a standalone server and communicate with it internally. This way you avoid direct Python/Julia interaction entirely.

Just speculating, but would it help to use locks on Python-side so that only one thread at a time calls Julia?

@Skoffer Yes I think that setting up our Julia processes as a microservice is the way to go. We are deploying in Kubernetes, so I guess the way to go would be to set up an API in a Julia container for the python container to call?

@lungben Yes I think that approach will also work. I’m thinking of multi-threading on the Julia side of our app instead of on the python side as we are now.

Yes, you can go with Pages.jl or with vanilla HTTP.jl. For a simple internal API it wouldn’t make much difference I suppose.
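
On the Python side, calling such an internal Julia API could be as small as the sketch below. Everything specific here is an assumption for illustration and not from this thread: the service address, the /estimate route, and the JSON payload shape. It uses only the standard library; the Julia server would have to accept and return JSON on that route.

```python
import json
import urllib.request

# Hypothetical in-cluster address of the Julia service (an assumption).
JULIA_SERVICE_URL = "http://demand-estimation:8000"


def build_estimate_request(payload, base_url=JULIA_SERVICE_URL):
    """Build a JSON POST request for the hypothetical /estimate route."""
    return urllib.request.Request(
        url=f"{base_url}/estimate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def estimate_demand(payload, base_url=JULIA_SERVICE_URL, timeout=30.0):
    """Send payload to the Julia service and return its decoded JSON reply."""
    request = build_estimate_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A Django view or Celery task would then call estimate_demand(...) instead of going through PyJulia, so no Python thread ever touches libjulia.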