Embedding Julia in the Java Virtual Machine

Is it easier to embed Julia into the Java Virtual Machine (JVM) or easier to embed the JVM into Julia?

Embedding the JVM into Julia

I’ve been working with @avik on JavaCall.jl for the better part of this year. To make JavaCall work, there have been several hoops to jump through, notably the JULIA_COPY_STACKS environment variable, which modifies Julia’s Task implementation. On Linux, I’ve found that setting JULIA_COPY_STACKS=1 works quite smoothly. However, the requirement is inconsistent across platforms: Windows doesn’t appear to need JULIA_COPY_STACKS=1 for single-threaded operation. The main difficulty is that the JVM depends on signal handling for basic operations, and the stack semantics of Julia’s Tasks interact badly with Java’s signal handling. The JVM is a bit challenging to embed, and Julia’s Task semantics make it even trickier.

Embedding Julia into the JVM

Embedding Julia in Java has been tried a number of times with varying success. For example, @jbytecode’s JuliaCaller connects to Julia via TCP, which is a great, clean solution, but sharing memory becomes difficult.

@rssdev10’s Julia4J uses SWIG to build a Java Native Interface (JNI) integration with Julia, which promises efficiency, but this requires a compiler on the user end for a specific JVM.

@cnuernber has recently built Java Native Access bindings from Clojure (a Lisp dialect that runs on the JVM) to Julia, meaning that libjulia can now be dynamically linked into the JVM. No compiler needed.
Using Julia from the JVM (Clojure)

While there were some early hiccups, it turns out that Julia has enough options to allow it to play nicely with others, as we found in Consistent crash attempting to embed Julia in Java via dynamically loading libjulia.so · Issue #36092 · JuliaLang/julia · GitHub. The key in this instance was turning off Julia’s signal handling:

jl_options.handle_signals = JL_OPTIONS_HANDLE_SIGNALS_OFF

Given libjulia-clj’s cross-platform success in its Travis tests, this path is looking quite promising.

Once linked together, Julia should then be able to leverage JavaCall to call back into the JVM. JavaCall, for example, has an alternate initialization routine in JavaCall.JNI.init_current_vm.

I don’t mean to disparage the prior methods. Each has its own advantages and disadvantages. Network-based integration via TCP should be quite robust and scalable. The Java Native Interface comes with low overhead. The Java Native Access approach is a user-friendly middle ground.

Overall, the approach of embedding Julia into the Java Virtual Machine using Java Native Access looks quite promising. This bodes well for future cooperation between Julia, Java, and other JVM-based languages like Clojure, Scala, and Kotlin. Embedding Julia into the JVM appears to be the easier path forward.

great summary, thank you!

Hi, actually, it doesn’t require a compiler on the user side. What is required is a dynamic library built with libjulia, the SWIG-generated code, and at least some compatible JDK. If a set of these OS-specific libraries is embedded into a jar file, the end user needs nothing except that jar. But to prepare it, we need build agents for at least 3 operating systems (Linux, Windows, and Mac on AMD64) with a current Julia, OpenJDK, and Oracle JDK. So I prepared recommendations on how to build it locally, but I am not able to provide a common build.

At the same time, what I tried to implement is a wrapper that is compatible with the JSR 223 specification. The purpose is not only to have a way to activate Julia from Java, but rather to have a common interface that can be reused from anywhere. Previously, I integrated JRuby into KNIME - GitHub - rssdev10/ruby4knime: Ruby scripting extension for knime.org. The specifics of KNIME are that every node has its own execution container, and they can be active in parallel. That last aspect might be an issue if we have only one VM interface back from Julia to Java.
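To illustrate what JSR 223 compatibility buys a caller, here is a minimal sketch using only the standard javax.script API. The class and method names are made up for illustration, and no Julia engine is assumed to be installed: the lookup simply returns null when nothing is registered under the requested name.

```java
import java.util.ArrayList;
import java.util.List;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper class; only the javax.script calls are standard API.
final class Jsr223Sketch {
    /** Lists every name under which a JSR 223 engine is currently registered. */
    static List<String> registeredEngineNames() {
        List<String> names = new ArrayList<>();
        for (ScriptEngineFactory factory : new ScriptEngineManager().getEngineFactories()) {
            names.addAll(factory.getNames());
        }
        return names;
    }

    /**
     * Evaluates a script through the generic javax.script interface.
     * Returns null when no engine is registered under engineName.
     */
    static Object evalIfAvailable(String engineName, String script) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(engineName);
        if (engine == null) {
            return null; // nothing on the classpath provides this engine
        }
        try {
            return engine.eval(script);
        } catch (ScriptException e) {
            throw new IllegalStateException("Script failed in engine " + engineName, e);
        }
    }
}
```

A caller would write something like Jsr223Sketch.evalIfAvailable("julia", "1 + 1"), where the engine name "julia" is an assumption about how a wrapper might register itself. The calling code stays the same whichever mechanism (JNI, JNA, sockets, or a child process) sits behind the engine.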

We’re getting into semantics here about what a “user” is. At the moment, though, someone would still need to follow your build instructions in order to use those bindings. If there are no prebuilt binaries, then the user might have to be the builder.

I wonder if https://binarybuilder.org/ could provide those build agents if we target specifically OpenJDK 8.

I think most macOS and Linux PCs run OpenJDK. But I’m not 100% sure that a binary built against OpenJDK 8 is compatible with OpenJDK 11. That part would be good to check.
Windows users are on Oracle Java SE. Regarding my code, if I remember right, I automated the build with CMake on Mac, but didn’t check it properly on Linux systems.

Regarding Julia use cases, the main question for me is whether it is possible to run these Java containers in parallel inside the same Java VM while still being sure of data isolation and crash safety.

What do you make of “basic” process-to-process communication (as in starting an external Julia process from Java and talking to it via the process input/output streams)?
I was poking around with this idea and put some code together to do that (and support scripting via JSR 223) with decent performance before I discovered the solutions you mentioned (and then realized someone had recently gone that way already here).
Are there any obvious caveats to this solution? There’s clearly no memory sharing (the same goes for the socket-based solution), but depending on the use case this is only a minor impediment. The rest of it is rather straightforward and only requires the presence of Julia on the host, with no bindings or wrappers involved.
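For readers who want to see the shape of the process-based approach, here is a minimal sketch, not the code mentioned above. It assumes a child process that answers each input line with exactly one output line. The class name, the JULIA_COMMAND constant, and the one-line Julia loop it contains are illustrative assumptions, including the assumption that julia is on the PATH.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
final class ProcessBridgeSketch {
    // Assumption: `julia` is on the PATH. The -e program is a made-up line protocol:
    // evaluate each stdin line and print the result as a single stdout line.
    static final String[] JULIA_COMMAND = {
        "julia", "--startup-file=no", "-e",
        "for line in eachline(stdin); println(eval(Meta.parse(line))); flush(stdout); end"
    };

    /** Starts the command, sends one line to its stdin, and returns the first line of its stdout. */
    static String roundTrip(String[] command, String request) {
        try {
            Process child = new ProcessBuilder(command)
                    .redirectError(ProcessBuilder.Redirect.INHERIT) // let child errors show up on our stderr
                    .start();
            try (BufferedWriter toChild = new BufferedWriter(
                         new OutputStreamWriter(child.getOutputStream(), StandardCharsets.UTF_8));
                 BufferedReader fromChild = new BufferedReader(
                         new InputStreamReader(child.getInputStream(), StandardCharsets.UTF_8))) {
                toChild.write(request);
                toChild.newLine();
                toChild.flush();            // the child cannot answer until the line is actually sent
                String reply = fromChild.readLine();
                toChild.close();            // closing stdin signals end of input so the child can exit
                child.waitFor();
                return reply;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the child process", e);
        }
    }
}
```

Calling ProcessBridgeSketch.roundTrip(ProcessBridgeSketch.JULIA_COMMAND, "1 + 1") would start a fresh Julia process per request, which is slow. A real bridge would keep one process alive and reuse its streams. Everything crossing the boundary is serialized as text here, which is the cost of having no shared memory.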

In general, I think shared-memory linking is probably a no-go, since Java generally doesn’t play nicely with others. My main hesitancy with any Julia-Java interop is that it involves Java, but I understand that it is sometimes necessary.

I’d rather rephrase that (and agree): interop between the JVM and other languages (here, say, LLVM-based ones) is indeed complicated, arguably much tougher than it is between languages within the same family/VM.

I’ve been watching GraalVM for some time now, and their proof-of-concept interop with Python looks pretty decent (down to memory sharing). I’m probably going to try fiddling with that as well to see whether it would play nicely with Julia.

PyCall.jl for Python? Why a VM?

For a command-line interface, see
A Java Matlab/Julia Interface - Stack Overflow

One stable memory API is the java.nio.Buffer API:
https://docs.oracle.com/javase/7/docs/api/java/nio/Buffer.html
It has a corresponding Java Native Interface support:
https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/functions.html#nio_support

The above is mainly limited to 31-bit addressing (2 GB)…
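The limit comes from buffer capacities and indices being Java ints. A small sketch of the Java side follows. The class and method names are illustrative only. Native code (libjulia, for instance) would obtain the underlying address through the JNI function GetDirectBufferAddress, which is not shown here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

// Hypothetical helper class; only the java.nio calls are standard API.
final class DirectBufferSketch {
    /** Largest number of doubles a single buffer can hold, because capacity is an int. */
    static int maxDoublesPerBuffer() {
        return Integer.MAX_VALUE / Double.BYTES;
    }

    /** Allocates off-heap memory for count doubles, laid out in the platform's byte order. */
    static ByteBuffer allocateDoubles(int count) {
        // multiplyExact throws ArithmeticException if the byte size overflows an int
        int bytes = Math.multiplyExact(count, Double.BYTES);
        return ByteBuffer.allocateDirect(bytes).order(ByteOrder.nativeOrder());
    }

    /** Writes 0.0, 1.0, 2.0, ... into the buffer through a double-typed view. */
    static void fillWithIndices(ByteBuffer buffer) {
        DoubleBuffer view = buffer.asDoubleBuffer();
        for (int i = 0; i < view.capacity(); i++) {
            view.put(i, (double) i);
        }
    }

    /** Sums the doubles stored in the buffer. */
    static double sum(ByteBuffer buffer) {
        DoubleBuffer view = buffer.asDoubleBuffer();
        double total = 0.0;
        for (int i = 0; i < view.capacity(); i++) {
            total += view.get(i);
        }
        return total;
    }
}
```

Using native byte order matters for sharing: the native side reads the same bytes without swapping. Arrays larger than one buffer would have to be split across several buffers under this API.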

There are several incubating Java APIs in jdk.incubator.foreign, part of Project Panama, that improve this situation greatly.

How about sulong?

https://github.com/oracle/graal/tree/master/sulong