[ANN] BorrowChecker.jl: a borrow checker for Julia

I see. I’m not sure this would be feasible because Julia does not have syntax for per-object mutability, borrowing, and lifetimes.

In theory you could figure these out and generate them by analysing the LLVM IR, but then you wouldn’t be able to get errors related to violating the declared expectations. Stuff like “modified an immutable.” So it would have far less practical value.

In your example, the macro could actually make it work with:

@own x = [1, 2, 3]
@lifetime lt begin
    @ref ~lt y = x
    @bc length(x)
end

Because making y a reference to x would not result in a move. Then calling length on a reference of x is also valid, because you can have multiple immutable references active!

In other words, it’s a bit ambiguous which way to “rustify” the code. Which is why I think the additional syntax is fundamentally needed.

1 Like

Right yeah exactly - I’d still keep @lifetime and @ref etc., my point was that by having a top-level macro one could move some/all of the run-time checks currently in BorrowChecker.jl into syntactic/type-domain checks.

I think doing

x = Moved()

Is not the optimal strategy, because Julia can capture variables in closures and threads. Assigning x to a new variable would therefore not void those captured variables.

Like:

@own x = [1,2,3]
t = Threads.@spawn (sleep(1); length(x))
@own y = x
fetch(t)

If you were to rely on Julia assignment to indicate a move, you wouldn’t get an error here. But since a move lives in value space (with is_moved(x) = getfield(x, :moved)), it can successfully throw an error when the thread tries to use x after having been moved!

This is fundamentally why I made moves a dynamic property rather than type domain.

1 Like

IIRC from the brainstorming thread, there were also associated usability issues. Local const makes valid Julia expressions and was considered for distinguishing let and let mut in a top-level BorrowChecker macro, but attempting to evaluate local const throws a syntax error. So the macro’s “off” state can’t actually be a no-op, it has to undo currently invalid Julia syntax that could eventually become valid Julia syntax we don’t want to undo anymore. Even if we instead used custom macro-keywords to avoid the uncertain state of local const, this heavily complicates the macro implementation for no real savings in typing.

A top-level macro also has the potential to do the same thing (well, maybe not type-domain checks because it’s all expressions at that point) for the current state of line-wise macros, it doesn’t demand those be absent. Given the difference between Julia and Rust syntax, those line-wise macros might actually be critical.

1 Like

Under my hypothesised new macro, your code would get transformed to something very approximately like

x = Owned([1,2,3])
t = Threads.@spawn (sleep(1); rusty_call(length, x))
x = Moved() # due to length(x) a line up
y = Owned(rusty_take(x))
fetch(t)

and in particular the assignment to y would rightly fail because x was moved.

Anyway I was just hypothesising to see if you had considered a similar approach, I’ll stop derailing your thread now :slightly_smiling_face:

This still wouldn’t work, because length doesn’t result in a move; it only needs a reference! (Same as .len() in rust)

A lot of functions in Base have overloads to work with owned/borrowed variables in a way that will automatically do such things without needing the @bc or @take calls.

No not to worry, these questions are great for clarifying design decisions!! The current design is still experimental and am very open to suggestions for improvement :slightly_smiling_face:

Not sure if you intended this or not (I’ve never used Rust and hardly understand ownership semantics), but do realize that rusty_call(length, x) would likely fail here because it almost certainly happens after x = Moved(). Closures capture variables, not values, so reassignments in the enclosing scope are reflected in the closure (which is why closures so often have to resort to boxing).

Yeah that’s what I meant by “very approximately” - the actual version would use let or something to ensure the original x was what was captured.

1 Like

I’m still not sure I see how this could ever work just with a code block macro though. Is the idea length would result in a move? Or you would manually check for a @spawn operation, and if there is one, then re-assign x to a Moved() type?

For example, what if the length and @spawn occurs within some function? Like this:

function f(x)
    Threads.@spawn begin
        sleep(1)
        open("out.txt", "w") do io
            println(io, length(x))
        end
    end
end

x = [1,2,3]
t = f(x)
y = x
push!(y, 4)
fetch(t)  #= just to see errors =#

With the current BorrowChecker, you would modify the code like this:

@own x = [1,2,3]
@own t = @bc f(x)
@own :mut y = x
push!(y, 4)
fetch(@take!(t))

which correctly gives an error:

nested task error: Cannot use `x`: value has been moved

actually even if there was no move, it would give an error, because the lifetime had automatically ended when f(x) finishes!

julia> @own x = [1,2,3]
Owned{Vector{Int64}}([1, 2, 3], :x)

julia> t = @bc f(x)
Task (runnable, started) @0x0000000118e2a270

julia> fetch(t)
...
    nested task error: Cannot use `x`: value's lifetime has expired

In other words, you cannot capture a reference inside a thread or closure unless the lifetime lasts longer than the thread! :smiley:

So the proper way to do this is:

@own x = [1, 2, 3]
@lifetime lt begin
    @ref ~lt r = x
    @own t = f(r)
    
    @clone :mut y = x
    push!(y, 4)

    # need to fetch _within_ the lifetime!
    fetch(@take!(t))
end

This code is finally valid, and runs without error.

If we tried to fetch after that lifetime, we’d hit another

end
fetch(@take!(t))

#    nested task error: Cannot use `r`: value's lifetime has expired

error. Now that’s some safe code!!

Also, if we tried to create a mutable reference, even the clone is smart enough to throw an error:

julia> @own :mut x = [1, 2, 3]
       @lifetime lt begin
           @ref ~lt :mut r = x
           @own t = f(r)
           @clone :mut y = x
           push!(y, 4)
           fetch(@take!(t))
       end
ERROR: Cannot access `x` while mutably borrowed

Hell ya!

Though in thinking through this, I’m realising that variable capture is still not safe to use in this context with a borrow checker!

It should really be blocked for owned variables, and mutable references. When you capture an owned variable in a closure (which is done internally when you call Threads.@spawn), it should be marked as moved. Similarly, you should not be able to pass a BorrowedMut variable, because you are not allowed to have two mutable references active!

I’ve posted a feature request for some kind of mechanism in Julia here: Feature request: hook or mechanism to intercept closure capture for specific types · Issue #58090 · JuliaLang/julia · GitHub

You may be interested in OhMyThreads.jl’s implementation of a closure box blocker: OhMyThreads.jl/src/implementation.jl at 2affc744e5ca213c74e6fda99ea8a5ad6cef0edf · JuliaFolds2/OhMyThreads.jl · GitHub

This only blocks boxed captures, though. Sounds like you need to block or mark moved any captured variable.

2 Likes

Cool! Is the idea here that you would wrap every closure with a function that checks internal types for potential Core.Box?

I think wrapping closure functions with some sort of checker would be one decent option. Though not sure how to access such internal info? How do you actually see what variables get captured? I tried fieldtypes(typeof(f)) but no such luck.

This seems to work?

julia> make_closure(x) = () -> x
make_closure (generic function with 1 method)

julia> f = make_closure(1.0)
#1 (generic function with 1 method)

julia> f.x
1.0

julia> fieldtypes(typeof(f))
(Float64,)

OhMyThreads.jl provides higher-order primitives like tmap(f, x), a multithreaded map. Internally, this calls throw_if_boxed_captures(f) to verify that the user didn’t mess up and pass a closure with a boxed capture, since that would likely result in data races in addition to being a performance footgun. The closures created by the package itself when spawning tasks are already written to be box free and don’t need this check (if I understand correctly, I’m just a rando perusing code on the internet).

I’m not sure if/how this or a similar function could be useful for BorrowChecker.jl, but it seemed related to your concerns.

2 Likes

fieldnames is what gets you the names of the internal fields, which currently match the captured variables, but I’m partial to dump’s printout:

julia> let a=1, b=2, c=3
         b = 4
         x -> a*b+x
       end |> dump
#11 (function of type var"#11#12"{Int64})
  a: Int64 1
  b: Core.Box
    contents: Int64 4
1 Like

Many thanks @danielwe and @Benny! This is great. I’ve got a PR here: Create `@cc`, a closure check macro by MilesCranmer · Pull Request #14 · MilesCranmer/BorrowChecker.jl · GitHub that defines a @cc macro for doing “closure checks” and validating no owned or mutable borrows are ever captured in a closure.

julia> let
           @own :mut x = [1, 2, 3]
           f = @cc (a, b) -> sum(x) + a + b
       end
ERROR: The closure function captured a variable `x::OwnedMut{Vector{Int64}}`.
This is disallowed because variable captures of owned and mutable references
breaks the rules of the borrow checker. Only immutable references are allowed.
To fix this, you should use `@ref` to create an immutable reference to the
variable before capturing.

It’s still not ideal, because it does require wrapping every closure with a @cc, but until there’s a hook added to the Julia lowering, I think this is the only option.


P.S., I also added a better description of the ownership rules, to help those who are new to borrow checkers

2 Likes

Just added a Mutex feature for safely working with shared data!

This might be the most practical feature added so far. It lets you do multi-threaded writes in a very safe way. (The rules are inspired from Rust’s std::sync::Mutex)

So, right off the bat, since a Mutex object is safe to share an unlimited number of times (it’s basically an immutable reference that can generate mutable references under a lock), you no longer need the verbose @own macro!

julia> m = Mutex([1, 2, 3])
Mutex{Vector{Int64}}([1, 2, 3])

julia> m2 = m  # this is discouraged with `Owned` objects, but here, it's safe

You can use regular Julia assignment syntax with this. You can capture it in a closure, put it a million times into a vector, etc.

This is because, to actually edit the protected object, you need to first hold the lock:

julia> lock(m);

julia> @ref_into :mut data = m[]
BorrowedMut{Vector{Int64},OwnedMut{Vector{Int64}}}([1, 2, 3, 4], :data)

julia> push!(data, 4);

julia> unlock(m);

julia> m
Mutex{Vector{Int64}}([1, 2, 3, 4])

The beauty of this is that the locking-unlocking determines the lifetime. Once you release the lock, all the references you created to the guarded value expire! You can’t touch the data anymore through those references:

julia> data
[expired reference]

julia> data[1]
ERROR: Cannot use `data`: value's lifetime has expired

In other words, it has self-destructing references! This makes it very safe to use.

Even if you are good about making a lock around your mutable container, there is always a chance that it becomes unprotected from the lock at some point (maybe gets stored in some closure or other mutable container, etc.). The Borrowed objects are nice because they can act like a normal object, but have their access to data revoked at any time once the lifetime expires! And the new Mutex ensures the lifetime only lasts as long as the lock is held.

Of course, this isn’t doing this all at a compiler level (maybe one day :-)), but I think it’s quite a nice model for safe access.

For a multi-threaded example, showing an arbitrary mutable struct:

julia> mutable struct A
           a::Int
       end

julia> m = Mutex(A(0))
Mutex{A}(A(0))

julia> @sync for i in 1:100
           Threads.@spawn begin
               Base.@lock m begin
                   @ref_into :mut r = m[]
                   r.a += 1
               end
           end
       end

julia> m
Mutex{A}(A(100))

All those references will expire (useful since they get stored in the closure generated by Threads.@spawn! We also can’t access the inner value without a lock:

julia> m[]
ERROR: LockNotHeldError: Current task does not hold the lock for this Mutex.
Access via `m[]` is only allowed when the lock is held by the current task.

@Mason this might also be the right way to integrate it in Bumper.jl. Once the bump allocation arena is exited, you could end the lifetime, and that will kills off any references to the array storage that might be hanging around.

3 Likes