Package for Rust-like borrow checker in Julia?

P.S., API question for y’all. Since in Julia, symbols can be reassigned to new objects regardless of the type or mutability, would it be better to refer to the immutables here as const?

Because in Julia if I write

const x = 1

then it perhaps has closer semantics to a rust

let x = 1

within the scope of a function (ignoring the fact that Julia doesn’t yet have local const), because I am not allowed to do

x += 1

Unless I do a new let.

So maybe instead of

@own x = 1
@own_mut x = 1

It would be better to have

@own const x = 1
@own x = 1

As more intuitive format for Julia people. And perhaps similar for @ref.

What do you think?

“Changing” a node in a tree usually only requires to rebuild the path to the node while sharing all sub-trees that are untouched. This only works, if sub-trees are never mutated as they can end up being shared by many “copies”. I.e., you might want to use them as persistent data structures as they are also used in Haskell or Clojure. When cleverly implemented, persistent data structures still have some overhead and sometimes worse memory locality then mutable data structures, but are much better than “copy on write” (as you hinted at above).
As a start, there seem to be some implementations in Julia, e.g., PureFun.jl

1 Like

Probably not. It’s very logical to recognize that an un-reassignable const x is the most direct analogue to immutable variables in Rust (and conversely, other variables are like mutable Rust variables), but again, Julia doesn’t have the variable and references model that Rust has. You’re already trying to use the immutability/mutability of Julia types to emulate Rust’s model, it will only further confuse people if you evoke properties of Julia variables. You wouldn’t appropriate let for Rust variable declarations when it already means something very different in Julia.

2 Likes

Not sure. If I put my Julia hat on, I kinda find @own const x = 1; @own x = 1 to be easier to understand in meaning than @own x = 1; @own_mut x = 1. When I see const, I have the right feeling that “this symbol can’t be reassigned”, even though there are some subtle differences.

Like, of course both aren’t full replicas (impossible, I am well aware). But isn’t it still better than the current version?

Basically what we really don’t want is for a user to think they can write

@own x = 1
x = x + 2

This actually works with the current API. The new x would just be a regular Int (we can’t override assignment in Julia). But we really really want to discourage it, since it breaks this simulated borrow checker.

Now maybe we’ll find a way to prevent that with macros or code escape analysis, but right now, all we can do is provide gentle guidance.

I feel like

@own const x = 1
x = x + 2

Is much better at sticking out as “illegal” code that a user would assume won’t work, so they wouldn’t try.

This sounds super cool. I have no idea what this would look like though. Maybe I just need some coffee.

Take my type, Node{T}: DynamicExpressions.jl/src/Node.jl at b3660ae13a1ec435656f14dfeb2d24c73f8c3958 · SymbolicML/DynamicExpressions.jl · GitHub

This represents a symbolic expression using a lookup table from integer → operator. Every node can be mutated (with trees inserted, deleted, fused, etc), which is needed to support SymbolicRegression.jl. Can I use this idea at all? If so, how?

Personally, I’m finding this whole topic quite confusing. It seems challenging to attach ownership to a syntactic variable a la rust, because variables don’t really mean anything in Julia. The sequence y = 1; y = 2 isn’t mutating y, but rather it’s deciding that you have a better use for the name y and re-purposing the identifier. As far as Julia is concerned, y itself will probably never even exist, let alone be set twice. In fact y isn’t a reference — it’s just a name!

Similarly, in A = [1]; B = A, A and B don’t really alias one another in the typical sense — they are eachother. They’re two names for the same thing.

That doesn’t mean that I object to the whole idea of borrow-checking, of course, it just seems like it’d be a more natural fit to attach a borrow-checker to the object itself as opposed to whatever name a programmer happened to have given it. It feels like a very ScopedValues sort of operation — you effectively want to set a mutex within a particular scope to allow for mutations, yeah?

4 Likes

Yes, this is exactly the challenge I’m trying to address. Julia’s variable semantics make it impossible to directly replicate Rust’s model, which is why I’ve focused on wrapping values in types that can enforce ownership/borrowing rules.

Which is exactly why race conditions can be so tricky to catch in Julia. Having explicit ownership tracking helps prevent these issues by making shared mutable state more obvious.

This is what’s already done in the GitHub package I put up - there are four types: Owned, OwnedMut, Borrowed, and BorrowedMut. They are built to “simulate” Rust’s lifetime and ownership model. The macros are just shorthands for interacting with these. The discussion about const is just about trying to find the best semantics to nudge the user to write code in a certain way.

I should note generally I’m not looking to evangelize borrow checkers or debate their use. Perhaps my posts come off as me trying to “sell them”; if so, that is simply a side of effect of me liking them this much, but I’m sincerely not trying to force them on anyone. I’ve found them incredibly useful for my own work doing asynchronous processing on large structures, which is why I started this thread to brainstorm and research ways to get similar behavior in Julia. And if we get a package together that others can use, then all the better!

2 Likes

Yeah, I get that. I’m really not pushing back, I’m just confused. The fact that Julia’s assignments are just names is pretty closely connected to how Julia doesn’t expose its own native pointers/references at all and how it passes by object. Sure, a macro (or really, I think you’d want to use a whole-block macro) could change what assignment means, but that’s pretty fundamental to understanding what code does. Which is probably why I’m confused, especially once you start talking about immutable references in a language that doesn’t expose anything about its own references in the first place.

2 Likes

I think you might be missing the point a bit. Just because Julia handles memory management at a lower level doesn’t mean we can’t build higher-level abstractions around ownership and mutability for correctness checking. We’re not trying to modify Julia’s underlying reference system - we’re creating a wrapper/tracking layer on top for development-time safety checks.

The fact that Julia doesn’t expose native references is irrelevant - we’re not trying to modify how Julia works internally, we’re trying to catch bugs in user code by tracking ownership patterns during development. In fact, I’m planning to design the API so that if you set borrow_checker = "disable" via Preferences.jl, the macros would evaporate into normal Julia calls - since they’re just wrappers around regular Julia code anyway. So there would be zero performance overhead in production.

1 Like

I think Pluto is some precedent for the “don’t rebind names” thing. Given that lots of folks use Pluto without issues, it seems like that is a strictness that can be layered on to regular Julia code without too many issues (assuming you are writing the code to that requirement). Then once you don’t rebind names, it seems like you can try to do some reasoning on top (again like Pluto does for code re-evaluation, but here it’s to check some usages).

3 Likes

But you are changing what assignment means — and then asking about what assignment should mean (which is where I jumped in). That’s why I think you want this to be a whole-block macro.

But I also think the single-scope lexical part of it should be the easier part — I think it could even be implemented at macro-eval time without wrappers or worrying about # TODO: Add other interfaces at all. The harder part is tracking objects being mutated simultaneously in concurrent tasks — and that’s where my feeble brain has a hard time imagining how the lexical tracking here gets you that.

1 Like

Yeah this is totally something we should consider - I assume you mean a block like the following?

@borrow_check function my_safe_fnc(x)
    #= ... =#
end

Then we could actually verify if regular assignment or “safe” assignment was used.


Side note: I really like using

@own const x = 1

to indicate owned immutables, because LanguageServer.jl seems to accurately flag reassignment! (admittedly, it’s for a different reason, but it seems practically useful)

I’ve rewritten the API to be as follows:

# Create a new owned immutable variable
@own const x = value  

# Create a new owned mutable variable
@own y = value  

# Transfer ownership from one variable to another, 
# creating an immutable variable
@move const x2 = x  

# Transfer ownership from one variable to another, 
# creating a mutable destination
@move y2 = y  

@lifetime lt begin
    # Create an immutable reference to owned variable `x2` 
    # and assign it to `rx` within the given lifetime scope `lt`
    @ref const rx = x2 in lt  
    @ref const rx2 = x2 in lt  # Multiple immutable refs are fine!
    
    # Create a mutable reference to owned mutable variable `y2` 
    # and assign it to `ry` within the given lifetime scope `lt`
    @ref ry = y2 in lt  # Only one mutable at a time

    # Can mutate `ry` here, but NOT `y2`!
end

# Now we can mutate `y2` again
1 Like

@mbauman I wrote up a a toy example to help illustrate the use of ownership checking for thread safety. First, regular code:

increment_counter!(ref::Ref) = (ref[] += 1)

function create_thread_race()
    # (Oops, I forgot to make this Atomic!)
    shared_counter = Ref(0)

    Threads.@threads for _ in 1:10000
        increment_counter!(shared_counter)
    end

    println("Final counter value: $(shared_counter[])\nExpected value: 10000")
end

If you run this, you get some variation of:

julia> create_thread_race()
Final counter value: 8306
Expected value: 10000

This is bad because it’s a silent error. Unless we test specifically for this value, or are able to spot such errors by eye 100% of the time, we might miss these thread race conditions.

If you had used an ownership system to write this, it would help inform you of the design flaw. Here’s a contrived example (tons of ways to cheat the system right now!!) –

using BorrowChecker

function bc_create_thread_race()
    # (Oops, I forgot to make this Atomic!)
    @own shared_counter = Ref(0)

    Threads.@threads for _ in 1:10000
        increment_counter!(@take shared_counter)
    end

    println("Final counter value: $(shared_counter)\nExpected value: 10000")
end

Running this will give you the following issue:

julia> bc_create_thread_race()
ERROR: TaskFailedException
    nested task error: Cannot use shared_counter: value has been moved
    Stacktrace:

This error occurs because we have already unwrapped shared_counter in another thread, so the variable is marked moved and can’t be read anymore. This pattern forces you to rethink the design of your code to one that doesn’t rely on shared mutable variables. For example, here’s one way that satisfies the borrowing rules by splitting up work beforehand, and aggregating results at the end:

function counter(thread_count::Integer)
    @own local_counter = 0
    for _ in 1:thread_count
        @set local_counter = local_counter + 1
    end
    @take local_counter
end

function bc_correct_counter()
    @own const num_threads = 4
    @own const total_count = 10000
    @own const count_per_thread = total_count ÷ num_threads
    @own tasks = Task[]
    for t_id in 1:num_threads
        @own const thread_count = count_per_thread + (t_id == 1) * (total_count % num_threads)
        @own const t = Threads.@spawn counter($(@take thread_count))
        push!(tasks, @take(t))
    end
    return sum(map(fetch, @take(tasks)))
end

Which gives us the correct result

julia> bc_correct_counter()
10000

There are still some limitations to note. For example, Threads.@spawn doesn’t actually require us to @take the thread_count, so you could “cheat” really easily here. But committing to the ownership system might help in enforcing better patterns.

It would be really nice to have automatic @take and borrows here to make it less verbose. It also seems crucial to have a way of enforcing that the user, via escape analysis (like the page @Mason pointed to), to @take or @ref any variables going into the Threads.@spawn! Likewise, for calling external functions it would be nice to enforce such things.

I mean it would be even better to do the whole thing automatically but I have a feeling this might not be possible.

1 Like

Does anyone know if there a mechanism to override values getting passed to Threads.@spawn / Threads.@threads? Maybe I just have to write a BoundaryChecker.Threads.@spawn that parses and wraps variables in @take? Though it seems a bit fragile this way.

Edit: will just manually check .threadid == Threads.threadid() each time anything is accessed.

1 Like

I think @own should just outright disallow Julia reassignment via the outer macro call. Your borrow-checker is trying to use wrappers of objects as a proxy for references to variables because there’s no way to reference a variable within the same scope in Julia, let alone borrow. You want users to interact with those wrappers, not with Julia variables; the outer macro can transform reassignment syntax to operations on said wrappers. I think the more your syntax deviates from Julia variable semantics’, the better.

Another thing is that you can’t “turn off” the borrow-checking macros for production. I mentioned this much earlier but it should be more clear now that if you removed the macro calls (or perhaps made an outer macro that only ignores inner macro keywords), you’re losing the wrappers’ Rust semantics, and Julia syntax cannot reproduce equivalent behavior, even minus the Rust safety. These macros, and macros generally, change code behavior, so it cannot be a static analysis switch.

This I like, especially because it’s frequent advice in Julia. It seems possible to have an optional static check for shared mutables without proper locking, so I’d prefer that over wrapping all my objects.

1 Like

That’s for performance (helping the optimizer generate better code), not correctness. Also, IMO it’s unnecessary, an interface like UnsafeAssume.jl is more general.

1 Like

We think alike! I wrote up this feature in literally 1 hour ago ago :smiley:

julia> @own a = [1, 2, 3]
OwnedMut{Vector{Int64}}([1, 2, 3])

julia> b = a  # This is illegal - should use @move
OwnedMut{Vector{Int64}}([1, 2, 3])

julia> @take(b)  # The symbol will be validated by many operations
ERROR: Variable `b` holds an object that was reassigned from `a`.
Regular variable reassignment is not allowed with BorrowChecker. Use `@move` to transfer ownership or `@set` to modify values.

Similar checks for @move, @set, and @ref. This is the new type produced by an @own:

mutable struct OwnedMut{T}
    value::T
    moved::Bool
    immutable_borrows::Int
    mutable_borrows::Int  # (can probably just be a Bool)
    const symbol::Symbol
    const threadid::Int

    function OwnedMut(value::T, moved::Bool=false, symbol::Symbol=:anonymous) where {T}
        return new{T}(value, moved, symbol, Threads.threadid())
    end
end

You will see it also now tracks the thread it was created on. This is so it can emulate the behavior of Rust in forcing ownership change (or passing reference) to threads.

I think we could via Preferences.jl. I do something similar for DispatchDoctor.jl. The user writes dispatch_doctor_mode = "disable" and all the macros become a no-op. (Or, more generally, you would leave it as a no-op and only enable it locally or for tests).

With, e.g., a LocalPreferences.toml file set to

[MyPackage]
borrow_checker = "disable"

the function

function bc_correct_counter()
    @own const num_threads = 4
    @own const total_count = 10000
    @own const count_per_thread = total_count ÷ num_threads
    @own tasks = Task[]
    for t_id in 1:num_threads
        @own const thread_count = count_per_thread + (t_id == 1) * (total_count % num_threads)
        @own const t = Threads.@spawn counter($(@take thread_count))
        push!(tasks, @take(t))
    end
    return sum(map(fetch, @take(tasks)))
end

would expand to

function bc_correct_counter()
    num_threads = 4
    total_count = 10000
    count_per_thread = total_count ÷ num_threads
    tasks = Task[]
    for t_id in 1:num_threads
        thread_count = count_per_thread + (t_id == 1) * (total_count % num_threads)
        t = Threads.@spawn counter($(thread_count))
        push!(tasks, t)
    end
    return sum(map(fetch, tasks))
end

losing any performance overhead of the original.

I think this is why it’s important that the macros wrap regular Julia code that would be valid otherwise - literally so that they can just turn into such code when the borrow checker is turned off.

1 Like

P.S., for those who speak Rust, the equivalences are roughly as follows:

  • Borrowed{T} ~ &T
  • BorrowedMut{T} ~ &mut T

So if you want to allow immutable references in a function, you could write

function readonly_op(a::Union{A,Borrowed{A}}) where {A<:Vector}
    sum(i -> a[i]^2, eachindex(a))
end

One thing I’m happy with is that getproperty will return a wrapper of the underlying object, and try to get the semantics right. So, just like Rust, accessing a field of an immutable reference is another immutable reference - even if the underlying object is mutable!

julia> mutable struct A
           a::Float64
       end;

julia> @own x = [A(1.0) for _ in 1:10]; typeof(x)
OwnedMut{Vector{A}}

julia> @lifetime lt let
           @ref const rx1 = x in lt
           @ref const rx2 = x in lt  # can have multiple immutable refs!
           
           @info "types" typeof(rx1) typeof(rx1[2]) typeof(rx1[2].a)
       end
┌ Info: types
│   typeof(rx1) = Borrowed{Vector{A}, OwnedMut{Vector{A}}}
│   typeof(rx1[2]) = Borrowed{A, OwnedMut{Vector{A}}}
└   typeof((rx1[2]).a) = Borrowed{Float64, OwnedMut{Vector{A}}}

This will block you from trying to mutate through it:

julia> @lifetime lt let
           @ref const rx1 = x in lt
           @ref const rx2 = x in lt  # can have multiple immutable refs!
           
           rx1[2].a = 2.0
       end
ERROR: Cannot write to immutable reference

Users manually duplicate annotated types to always support the wrappers? How would that work for methods that aren’t borrow-checked? For example, eachindex(::A) probably exists, eachindex(::Borrowed{A}) probably does not. Dispatch doesn’t support forwarding an arbitrary callable to a field of a wrapper type (like (f::Any)(x) = f(x.y) ), and even if it could, you don’t want that for calls of borrow-checked methods. Earlier you mentioned arguments would be unwrapped before calls to avoid this problem, limiting the borrow-checking to the caller scope in the macro call.