RustCall.jl?

In order for something like PyO3 to be built for Julia, is there anything missing in Julia itself?

Let’s be more specific. What functionality of PyO3 are you looking for? Where does jlrs fall short?

I’m not sure. I’m writing from the perspective of someone who would maybe like to wrap some Rust crates in Julia, but has no previous experience with this task. I keep seeing PyO3 mentioned as the reason why many crates are already wrapped in Python, and the lack of similar tools in Julia as the reason they are not wrapped in Julia. It could well be that jlrs is already almost there and we just need to use it more. I will probably only find out once I go ahead and try it myself.

The ability to pass Arrow objects to and from jlrs would be helpful, similar to arrow::pyarrow. It would require Arrow.jl to implement the Arrow C FFI interface, however.

This is certainly a dumb question, but I thought the whole point of Arrow was that it is a common serialized format, so that tables can be passed across processes without a copy. Shouldn’t you just be able to reinterpret the underlying data as a Vector{UInt8} and pass it to Rust, where it could be reinterpreted as an Arrow table?


I think in other languages you would pass a reference to an ArrowArrayStreamReader because you might not have the whole dataset in memory. It would use the arrow ffi interface. Unless there is some other way to do this in Julia that I’m not aware of.

In PyO3, you have proc macros plus access to the CPython API, which makes it easy to expose your domain-specific Rust code as a Python module written entirely in Rust. For example, the following Rust code:

use pyo3::prelude::*;

/// Formats the sum of two numbers as string.
#[pyfunction]
fn sum_as_string(a: usize, b: usize) -> PyResult<String> {
    Ok((a + b).to_string())
}

/// A Python module implemented in Rust. The name of this function must match
/// the `lib.name` setting in the `Cargo.toml`, else Python will not be able to
/// import the module.
#[pymodule]
fn string_sum(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(sum_as_string, m)?)?;
    Ok(())
}

will end up as a Python module that can be imported like so:

$ python
>>> import string_sum
>>> string_sum.sum_as_string(5, 20)
'25'

In jlrs, this is done with ccall:

use jlrs::prelude::*;

// This function will be provided to Julia as a pointer, so its name can be mangled.
unsafe extern "C" fn call_me(arg: bool) -> isize {
    if arg {
        1
    } else {
        -1
    }
}

let mut frame = StackFrame::new();
let mut julia = unsafe { RuntimeBuilder::new().start().unwrap() };
let mut julia = julia.instance(&mut frame);

julia
    .scope(|mut frame| unsafe {
        // Cast the function to a void pointer
        let call_me_val = Value::new(&mut frame, call_me as *mut std::ffi::c_void);

        // Value::eval_string can be used to create new functions.
        let func = Value::eval_string(
            &mut frame,
            "myfunc(callme::Ptr{Cvoid})::Int = ccall(callme, Int, (Bool,), true)",
        )
        .into_jlrs_result()?;

        // Call the function and unbox the result.
        let result = func
            .call1(&mut frame, call_me_val)
            .into_jlrs_result()?
            .unbox::<isize>()?;

        assert_eq!(result, 1);

        Ok(())
    })
    .unwrap();

There is a macro that abstracts it:

julia_module! {
    become callme_init_fn;
    fn call_me(arg: bool) -> isize;
}

but my understanding is that you are still effectively using ccall.

I’ve only read the docs and it seems like there’s a julia_module macro that abstracts a lot of things: https://docs.rs/jlrs/latest/jlrs/prelude/macro.julia_module.html

I may have to actually use it on a project to see how it works in practice. Reading through the docs again after a while, it seems like @Taaitaaiger has put an impressive amount of work into the jlrs project. I’m eager to try this on a project in the future.

wrt potential issues, from the docs:

It can be rather tricky to figure out how data is passed from Julia to Rust when ccalling a function written in Rust. Primitive and isbits types are passed by value, managed types provided directly by jlrs are guaranteed to be boxed, all other types might be passed by value or be boxed.

In order to avoid figuring out how such data is passed you can work with TypedValue, which ensures the data is boxed by using Any in the signature of the generated ccall invocation, but restricts the type of the data in the generated function to the type constructed from the TypedValue’s type parameter.
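The difference between "passed by value" and "boxed" can be sketched without jlrs at all. The snippet below is only an illustration of the two calling conventions under stated assumptions: `Point`, `norm_sq_by_value`, and `scale_in_place` are hypothetical names, and the struct is assumed to mirror a Julia struct with two Float64 fields.

```rust
/// Plain-data struct with a C-compatible layout. A Julia isbits struct with
/// the same two Float64 fields would be passed to Rust by value.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// By-value variant: the callee receives its own copy of the data.
pub extern "C" fn norm_sq_by_value(p: Point) -> f64 {
    p.x * p.x + p.y * p.y
}

/// By-pointer variant: the callee receives an address and can mutate the
/// caller's data, which is roughly the situation when the argument is boxed.
///
/// # Safety
/// `p` must be non-null, aligned, and valid for reads and writes.
pub unsafe extern "C" fn scale_in_place(p: *mut Point, factor: f64) {
    let point = unsafe { &mut *p };
    point.x *= factor;
    point.y *= factor;
}
```

If the Rust signature expects one convention and Julia uses the other, the call reads garbage, which is the ambiguity TypedValue is described as removing.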

And from @Taaitaaiger’s comment on reddit:

The challenges arise when you start dealing with more complex types. There’s nothing inherently wrong with calling ccall with an array argument, for example, but if you want to access it you’ll need to know its layout to access information like its rank, element type, and the elements themselves. There’s also quite a bit of manual work involved, particularly writing a function in Julia for each function that you want to expose.
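One way to avoid depending on Julia's internal array layout is to have the Julia wrapper pass the data pointer and the dimensions explicitly. The sketch below assumes a Matrix{Float64} (column-major) and a hypothetical function name `column_sum`; it is not how jlrs handles arrays.

```rust
use std::slice;

/// Sums column `col` (0-based) of a column-major `nrows x ncols` matrix.
/// Returns NaN when the pointer is null or `col` is out of range.
///
/// # Safety
/// `data` must point to `nrows * ncols` initialized, aligned f64 values.
pub unsafe extern "C" fn column_sum(
    data: *const f64,
    nrows: usize,
    ncols: usize,
    col: usize,
) -> f64 {
    if data.is_null() || col >= ncols {
        return f64::NAN;
    }
    // Rank and element type are fixed by the signature, so only the
    // dimensions have to travel across the boundary.
    let elems = unsafe { slice::from_raw_parts(data, nrows * ncols) };
    elems[col * nrows..(col + 1) * nrows].iter().sum()
}
```

The cost is the manual work the quote mentions: each exposed function still needs a small Julia method that extracts the pointer and sizes before the ccall.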

My understanding is that Julia doesn’t expose the necessary functionality via a C ABI and hence you have to use ccall in Julia to access features only exposed in “Julia land”. This isn’t the case in CPython.

So it’s not really a functionality of pyo3 that’s missing in jlrs, but rather functionality of CPython that is missing in Julia.


The challenges arise when you start dealing with more complex types. There’s nothing inherently wrong with calling ccall with an array argument, for example, but if you want to access it you’ll need to know its layout to access information like its rank, element type, and the elements themselves.

I think this is what supporting the FromJulia and IntoJulia traits for Arrow objects would help with. It would provide the schema to describe the complex data structures without having to maintain that support in jlrs. The jlrs package would only have to copy the schema via the C interface to Arrow.jl and maintain ownership of the pointer to the underlying struct data between Julia and Rust.


My understanding is that Julia doesn’t expose the necessary functionality via a C ABI and hence you have to use ccall in Julia to access features only exposed in “Julia land”. This isn’t the case in CPython.

It’s not just that it isn’t exposed: Julia’s C API is unstable, so no library that needs to call back into Julia can assume any kind of forward compatibility. It also only exposes what Julia itself needs, with relatively little regard for external consumers. For example, there’s no equivalent of PyO3’s add_function as far as I’m aware. The init function generated by the julia_module macro returns an array whose elements expose enough information to let Julia generate implementations of functions that ccall the exported functions.
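The idea of an init function that returns descriptors from which Julia generates ccall wrappers can be sketched roughly as follows. This is a simplified illustration, not jlrs's actual representation: `ExportedFn`, `exports`, and `julia_wrapper` are hypothetical names, and here the wrapper source is produced as a string on the Rust side purely to show what information is needed.

```rust
use std::ffi::c_void;

pub extern "C" fn call_me(arg: bool) -> isize {
    if arg { 1 } else { -1 }
}

/// Hypothetical descriptor for one exported function.
pub struct ExportedFn {
    pub name: &'static str,
    pub ptr: *const c_void,
    pub ret: &'static str,             // Julia return type
    pub args: &'static [&'static str], // Julia argument types
}

/// Stand-in for the generated init function.
pub fn exports() -> Vec<ExportedFn> {
    vec![ExportedFn {
        name: "call_me",
        ptr: call_me as *const c_void,
        ret: "Int",
        args: &["Bool"],
    }]
}

/// Renders the Julia method that would ccall the exported function.
pub fn julia_wrapper(f: &ExportedFn) -> String {
    let names: Vec<String> = (1..=f.args.len()).map(|i| format!("a{}", i)).collect();
    let params: Vec<String> = names
        .iter()
        .zip(f.args)
        .map(|(n, t)| format!("{}::{}", n, t))
        .collect();
    // A trailing comma after each type keeps one-element tuples valid Julia.
    let types: String = f.args.iter().map(|t| format!("{},", t)).collect();
    let mut call = vec![
        format!("Ptr{{Cvoid}}({:p})", f.ptr),
        f.ret.to_string(),
        format!("({})", types),
    ];
    call.extend(names.iter().cloned());
    format!("{}({})::{} = ccall({})", f.name, params.join(", "), f.ret, call.join(", "))
}
```

For `call_me` this renders a one-line Julia method of the same shape as the eval_string example earlier in the thread, with the function pointer's address filled in.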

TypedValue mostly exists to aid in this code generation process. For example, if an exported function uses a TypedValue as an argument type, that argument of the generated function is restricted to the inner type, while the ccall invocation uses Any. It was added because I noticed that many mutable types were passed by value to the ccalled function instead of by pointer.


I’m intrigued! Feel free to reach out to discuss possibilities, but I have to admit I’m unfamiliar with Arrow.jl.

The C API for Julia Array is defined here in julia.h:

A few convenience elements are defined via C macros here:

There’s also a new API for generic memory being introduced for Julia 1.11:

I’m not clear why ccall is particularly problematic here. ccall when compiled is quite efficient. Looking at the RustFFT example, the Julia boilerplate seems quite minimal:

Most of the work is done in Rust:


So for me, there is one question of “How do we make it easy to make Julia packages that wrap Rust code?” And there is lots of good discussion here about that. It is a prerequisite to a second question:

“How do we allow an end user to directly call a Rust library they found without writing any .rs files in a similar way to how PythonCall allows me to just pull in a Python library and call its functions?”

I see two big blockers:

  1. Python has a runtime that can keep track of objects between calls. Rust doesn’t have a similar runtime that could idle between calls. Instead, we’d need each function call to consume something from Julia, do its work, and then return something for Julia to hold onto. This means you may need to serialize complex intermediate objects.
  2. With regard to syntax and type system richness/complexity the ordering is Python < Julia < Rust. I don’t know enough Rust to enumerate the edge cases, but writing Julia code that can be turned into method chains in Rust doesn’t seem trivial.

My basic experiments lead me to believe that, within the context of a single function, it might be possible to transpile a subset of Julia into Rust. It would have to start with a subset of types that have manually written translations.

Below is a bit of a silly example, but it captures some of the issues:

@use simsearch::SimSearch
@rust_fn function fuzzy_search(haystack::Vector{String}, needle::String)::Vector{UInt32}
    engine::SimSearch{UInt32} = SimSearch::new()
    i::Int = 1
    for s in haystack
        engine.insert(i, s)
        i += 1
    end
    engine.search(needle)
end

use simsearch::SimSearch;

pub fn fuzzy_search(haystack: Vec<&str>, needle: &str) -> Vec<u32> {
    let mut engine: SimSearch<u32> = SimSearch::new();
    let mut i = 1;
    for s in haystack {
        engine.insert(i, s);
        i += 1;
    }

    engine.search(needle)
}

I grabbed this example right from simsearch’s readme because I want to emulate what it would be like to be a Julian that doesn’t know Rust very well, but wants to grab a function to do fuzzy matching.

A lot of the basic syntax is almost copy and paste. Notable differences are

  1. Accessing elements of a module with :: is not Julian. I think it’s better to be more Rust-like so that grabbing from rust docs is easy… but I’m not certain.
  2. Rust marks the first assignment with let but not later ones, a distinction that Julia’s syntax does not capture.
  3. This doesn’t touch more complex things like match or other Rust syntax that Julia doesn’t implement.
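The "subset of types with manual translations" could start as a plain lookup table. The following is a minimal sketch under that assumption; `julia_type_to_rust` is a hypothetical helper, and the mapping only covers a handful of primitives plus Vector{T}.

```rust
/// Translates a small subset of Julia type names into Rust type names.
/// Returns None for anything outside the hand-written table.
pub fn julia_type_to_rust(ty: &str) -> Option<String> {
    let ty = ty.trim();
    // Vector{T} maps to Vec<T>, recursing on the element type.
    if let Some(inner) = ty.strip_prefix("Vector{").and_then(|r| r.strip_suffix('}')) {
        return julia_type_to_rust(inner).map(|t| format!("Vec<{}>", t));
    }
    let rust = match ty {
        "Bool" => "bool",
        "Int8" => "i8",
        "Int16" => "i16",
        "Int32" => "i32",
        "Int64" => "i64",
        "UInt8" => "u8",
        "UInt16" => "u16",
        "UInt32" => "u32",
        "UInt64" => "u64",
        "Int" => "isize",
        "UInt" => "usize",
        "Float32" => "f32",
        "Float64" => "f64",
        "String" => "String",
        _ => return None,
    };
    Some(rust.to_string())
}
```

Note that even this table hides a decision: the example above turns Vector{String} into Vec<&str>, so ownership and borrowing would need their own rules on top of the name mapping.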

I’m not really sure what you mean by this. You do not need a runtime to have persistent data structures. In Rust you can create static variables.
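As a minimal sketch of that point, a static guarded by a Mutex keeps state alive between separate calls from Julia for as long as the library stays loaded. The names `SAMPLES`, `push_sample`, and `sample_mean` are hypothetical.

```rust
use std::sync::Mutex;

// Lives for the lifetime of the loaded library, not of any single call.
static SAMPLES: Mutex<Vec<f64>> = Mutex::new(Vec::new());

/// Stores one value and returns how many values are held so far.
pub extern "C" fn push_sample(x: f64) -> usize {
    // Recover from a poisoned lock instead of panicking across the FFI boundary.
    let mut samples = SAMPLES.lock().unwrap_or_else(|e| e.into_inner());
    samples.push(x);
    samples.len()
}

/// Returns the mean of all stored values, or NaN when none are stored.
pub extern "C" fn sample_mean() -> f64 {
    let samples = SAMPLES.lock().unwrap_or_else(|e| e.into_inner());
    if samples.is_empty() {
        return f64::NAN;
    }
    samples.iter().sum::<f64>() / samples.len() as f64
}
```

The drawback is that this gives one global instance per library, so it does not by itself cover the case of many independent objects.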

If a Rust method returns a complicated type with internal pointers to other complicated types to Julia, is there a generic way to pass it back to a new Rust method and reinterpret it back to the same type?
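One common answer is the opaque-handle pattern: Rust keeps ownership of the layout and hands Julia only a pointer, which Julia passes back unchanged. The sketch below is an assumption about how this could look, with hypothetical names (`Index`, `index_new`, and so on); on the Julia side the handle would be held as a Ptr{Cvoid}, presumably with a finalizer that calls the free function.

```rust
/// Stand-in for a complicated type with internal heap allocations.
pub struct Index {
    ids: Vec<u32>,
}

/// Allocates an Index on the heap and leaks it as a raw handle.
pub extern "C" fn index_new() -> *mut Index {
    Box::into_raw(Box::new(Index { ids: Vec::new() }))
}

/// Inserts an id and returns the new number of entries (0 for a null handle).
///
/// # Safety
/// `handle` must be null or come from `index_new` and not yet be freed.
pub unsafe extern "C" fn index_insert(handle: *mut Index, id: u32) -> usize {
    if handle.is_null() {
        return 0;
    }
    let index = unsafe { &mut *handle };
    index.ids.push(id);
    index.ids.len()
}

/// # Safety
/// Same requirements as `index_insert`.
pub unsafe extern "C" fn index_contains(handle: *const Index, id: u32) -> bool {
    !handle.is_null() && unsafe { &*handle }.ids.contains(&id)
}

/// Reclaims ownership and drops the Index.
///
/// # Safety
/// `handle` must come from `index_new` and must not be used afterwards.
pub unsafe extern "C" fn index_free(handle: *mut Index) {
    if !handle.is_null() {
        drop(unsafe { Box::from_raw(handle) });
    }
}
```

Julia never reinterprets the bytes behind the handle, so internal pointers stay valid; the type information lives in which Rust function the handle is passed to.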

The way I understand PythonCall, it is running an “interactive” Python process and just submitting calls into that continuous process, so the data remains on the “Python side” until you pass it back to Julia via a conversion. I naively thought that a structure as close to that as possible would be the easiest way to develop an all-Julia Rust management framework.

I think what you are describing for Julia is similar to what currently exists in R through rextendr::rust_function: you pass a string containing the Rust function, and it creates a crate in a temporary directory with a generated source file and Cargo.toml, compiles it to a library, and then generates the R wrapper function that calls into the library (the equivalent of a ccall). It automates a process similar to the one you would follow if you wrote a crate using julia_module!, compiled it into a library with extern "C" functions, and wrapped it as a JLL package.
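The scaffolding step of that workflow is small. Here is a rough sketch of it using only the standard library; `scaffold_crate` is a hypothetical helper, the manifest contents are an assumption about a reasonable minimum, and the subsequent `cargo build` and wrapper generation are left out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes a minimal cdylib crate containing `rust_source` under `root/name`
/// and returns the crate directory.
pub fn scaffold_crate(root: &Path, name: &str, rust_source: &str) -> io::Result<PathBuf> {
    let dir = root.join(name);
    fs::create_dir_all(dir.join("src"))?;
    // A cdylib produces a shared library that a ccall can load.
    let manifest = format!(
        "[package]\nname = \"{}\"\nversion = \"0.1.0\"\nedition = \"2021\"\n\n[lib]\ncrate-type = [\"cdylib\"]\n",
        name
    );
    fs::write(dir.join("Cargo.toml"), manifest)?;
    fs::write(dir.join("src").join("lib.rs"), rust_source)?;
    Ok(dir)
}
```

A Julia macro along these lines would do the same thing with mktempdir and write, then run cargo and emit the ccall wrapper.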


Could you get this with evcxr (evcxr/evcxr on GitHub)? It has various frontends (a REPL, a Jupyter kernel), but the backend could be used here as a sort of “Rust runtime.”

(evcxr = EValuation ConteXt for Rust)

It lets you dynamically declare dependencies too, just like PythonCall.jl.
