Designing a borrow checker for Julia

Hi everyone! I’m working on BorrowChecker.jl, a package that brings a layer of Rust-like ownership and borrowing semantics to Julia to improve memory safety of user code. Before making an official announcement or releasing version 0.1.0, I’m interested in getting some more feedback on the design.

First, what is ownership?

In Julia, objects exist independently of the variables that refer to them. When you write x = [1, 2, 3] in Julia, the actual object lives in memory completely independently of the symbol, and you can refer to it from as many symbols as you want:

x = [1, 2, 3]
y = x
println(length(x))  # Works fine!

Rust works quite differently. The equivalent code is invalid in Rust:

let x = vec![1, 2, 3];
let y = x;
println!("{}", x.len());
// error[E0382]: borrow of moved value: `x`

Why? Because in Rust, objects are owned by variables. When you write let y = x, the ownership of vec![1, 2, 3] is moved to y. Now x is no longer allowed to access it.

In Rust, you would fix this with:

let y = x.clone();  // Make a copy
// OR
let y = &x;  // Borrow x (which freezes the state)

The purpose of this “ownership” paradigm is to improve code safety. Especially in complex, multithreaded codebases, it is really easy to shoot yourself in the foot by reading an object while something else can still modify it. Rust’s ownership model lets the compiler prove the memory safety of your code.
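To make the hazard concrete, here is a small plain-Julia sketch with no BorrowChecker involved. The `RunLog` type and `record!` function are made up for illustration; the point is only that storing a mutable container without copying it leaves two names pointing at the same object:

```julia
# Hypothetical example: a struct that stores references to the caller's vectors.
struct RunLog
    history::Vector{Vector{Int}}
end

# Stores `state` itself, not a copy, so the log aliases the caller's buffer.
record!(runlog::RunLog, state::Vector{Int}) = push!(runlog.history, state)

runlog = RunLog(Vector{Int}[])
buffer = [1, 2, 3]

record!(runlog, buffer)        # the log now aliases `buffer`
buffer[1] = 99                 # reusing the buffer silently rewrites the log entry
println(runlog.history[1])     # prints [99, 2, 3]

record!(runlog, copy(buffer))  # an explicit copy breaks the aliasing
buffer[2] = 0
println(runlog.history[2])     # prints [99, 2, 3]
```

An ownership layer aims to turn the first pattern into an error instead of a silent change.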

BorrowChecker.jl is a layer that demonstrates these same concepts in Julia. The previous example, rewritten with BorrowChecker syntax, would now hit an error:

julia> @own x = [1, 2, 3];

julia> @own y = x;

julia> println(length(x))
ERROR: Cannot use x: value has been moved

To fix this, you would write:

@clone y = x  # Copies

# OR

@lifetime lt begin
    @ref ~lt y = x  # Borrow x within lt (which freezes the state)
end

While BorrowChecker.jl can’t provide the same compile-time guarantees as Rust, it can help catch memory safety issues during development and testing.
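For intuition about what a runtime check like this involves, here is a toy sketch of move tracking. It is not BorrowChecker.jl's implementation; `ToyOwned`, `toy_get`, and `toy_move!` are hypothetical names, and the real package does considerably more:

```julia
# Toy wrapper, NOT the package's implementation: remembers whether the value was moved.
mutable struct ToyOwned{T}
    value::T
    moved::Bool
end
ToyOwned(value) = ToyOwned(value, false)

# Read access: throws if the value has already been moved.
function toy_get(o::ToyOwned)
    o.moved && error("Cannot use value: it has been moved")
    return o.value
end

# Transfer ownership: mark the old wrapper as moved and return a fresh one.
function toy_move!(o::ToyOwned)
    value = toy_get(o)
    o.moved = true
    return ToyOwned(value)
end

x = ToyOwned([1, 2, 3])
y = toy_move!(x)

println(length(toy_get(y)))   # prints 3
try
    toy_get(x)                # x was moved, so this throws
catch err
    println("ERROR: ", err.msg)
end
```

Because the check happens when the value is accessed, a misuse is only caught if the offending line actually runs, which is why the checks are most useful during development and testing.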

I’ve started dogfooding BorrowChecker on a branch of SymbolicRegression.jl and it has ALREADY resulted in me patching a potential bug due to re-using a mutable container. So I feel like it will be pretty useful.

Ok, now let’s look at the proposed syntax in detail:

Proposed Syntax

Core Ownership Macros

  • @own [:mut] x = value - Create a new owned value
    • Creates an immutable wrapper Owned{typeof(value)}
    • With :mut, creates a mutable OwnedMut{typeof(value)}
  • @move [:mut] new = old - Transfer ownership between variables
  • @clone [:mut] new = old - Create a deep copy without moving
  • @take! var - Unwrap the value, marking original as moved
  • @take var - Unwrap the value, without moving original (does a deepcopy)

References and Lifetimes

  • @lifetime lt begin ... end - Create a reference lifetime scope
  • @ref ~lt [:mut] var = value - Create a reference within lifetime
    • Creates Borrowed{T} or BorrowedMut{T}
    • Use OrBorrowed{T} and OrBorrowedMut{T} in function signatures

Assignment and Loops

  • @set x = value - Assign to existing mutable owned variable
  • @own [:mut] for var in iter - Loop with ownership of elements
  • @ref ~lt [:mut] for var in iter - Loop with references to elements

Experimental Features

  • BorrowChecker.Experimental.@managed begin ... end - Automatic ownership transfer
    • Uses Cassette.jl for recursive ownership handling
    • (Currently experimental due to Cassette.jl limitations with SIMD and other compiler features)
    • Help wanted! If you’re interested in fixing Cassette.jl, this feature could be made stable. Otherwise, we likely need to remove this feature for Julia 1.12.

Examples

Basic Ownership

Let’s look at the basic ownership system. When you create an owned value, it’s immutable by default:

@own x = [1, 2, 3]
push!(x, 4)  # ERROR: Cannot write to immutable

For mutable values, use the :mut flag:

@own :mut data = [1, 2, 3]
push!(data, 4)  # Works! data is mutable

Note that various Base functions, such as push! and getindex, have been overloaded to respect these write-access settings.

The @own macro creates an Owned{T} or OwnedMut{T} object. Most functions will not be written to accept these, so you can use @take (copying) or @take! (moving) to extract the owned value:

# Functions that expect regular Julia types:
push_twice!(x::Vector{Int}) = (push!(x, 4); push!(x, 5); x)

@own x = [1, 2, 3]
@own y = push_twice!(@take!(x))  # Moves ownership of x

push!(x, 4)  # ERROR: Cannot use x: value has been moved

However, for recursively immutable types (like tuples of integers), @take! is smart enough to know that the original can’t change, and thus it won’t mark it moved:

@own point = (1, 2)
sum1 = write_to_file(@take!(point))  # point is still usable
sum2 = write_to_file(@take!(point))  # Works again!

This is the same behavior as in Rust (cf. the Copy trait).
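As a rough picture of what "recursively immutable" means, Base's `isbits` can serve as a stand-in. This is an assumption made for illustration: the package's actual check may differ, and `is_copy_like` and `Velocity` are hypothetical names.

```julia
# Assumption: `isbits` approximates "recursively immutable, safe to hand out freely".
is_copy_like(x) = isbits(x)

# An immutable struct with only plain-data fields.
struct Velocity
    dx::Float64
    dy::Float64
end

println(is_copy_like((1, 2)))              # prints true
println(is_copy_like(Velocity(1.0, 2.0)))  # prints true
println(is_copy_like([1, 2, 3]))           # prints false: arrays are mutable
println(is_copy_like((1, [2, 3])))         # prints false: contains a mutable array
```

Note that `isbits` is stricter than immutability: a `String`, for instance, cannot be mutated but is not `isbits`, so a real check would need to treat such types separately.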

There is also the @take(...) macro which never marks the original as moved, and performs a deepcopy when needed:

@own :mut data = [1, 2, 3]
@own total = sum_vector(@take(data))  # Creates a copy
push!(data, 4)  # Original still usable

Note also that, to improve safety, the macro stores the symbol that was used for the variable. This helps catch habits like:

julia> @own x = [1, 2, 3];

julia> y = x;  # Unsafe! Should use @clone, @move, or @own

julia> @take(y)
ERROR: Variable `y` holds an object that was reassigned from `x`.

This won’t catch all misuses but it can help prevent some.
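The symbol check can be pictured with a toy pair of macros. This is only a sketch under assumed names (`ToyTracked`, `@toy_bind`, `@toy_use`, `toy_check`), not how the package is actually written:

```julia
# Toy wrapper that remembers the variable name it was bound to.
struct ToyTracked{T}
    value::T
    symbol::Symbol
end

# Compare the stored symbol with the name used at the access site.
function toy_check(t::ToyTracked, used_as::Symbol)
    t.symbol == used_as ||
        error("Variable `$used_as` holds an object that was reassigned from `$(t.symbol)`.")
    return t.value
end

# `@toy_bind x = value` wraps the value and records the symbol `:x`.
macro toy_bind(ex)
    Meta.isexpr(ex, :(=)) || error("expected `@toy_bind name = value`")
    name, value = ex.args
    return esc(:($name = ToyTracked($value, $(QuoteNode(name)))))
end

# `@toy_use x` unwraps the value, passing along the name seen by the macro.
macro toy_use(name)
    return esc(:(toy_check($name, $(QuoteNode(name)))))
end

@toy_bind x = [1, 2, 3]
y = x                      # plain assignment bypasses the macro

println(@toy_use(x))       # prints [1, 2, 3]
try
    @toy_use(y)            # stored symbol is :x, but the access site says :y
catch err
    println("ERROR: ", err.msg)
end
```

The check relies on the macro seeing the variable name at the access site, so aliases that are passed through ordinary function arguments would not be detected by this sketch.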

References and Lifetimes

References let you temporarily borrow values. This is useful for passing values to functions without moving them. These are created within an explicit @lifetime block:

@own :mut data = [1, 2, 3]

@lifetime lt begin
    @ref ~lt r = data
    @ref ~lt r2 = data  # Can create multiple _immutable_ references!
    @test r == [1, 2, 3]

    # While ref exists, data can't be modified:
    data[1] = 4 # ERROR: Cannot write original while immutably borrowed
end

# After lifetime ends, we can modify again!
data[1] = 4

Just like in Rust, while you can create multiple immutable references, you can only have one mutable reference at a time:

@own :mut data = [1, 2, 3]

@lifetime lt begin
    @ref ~lt :mut r = data
    @ref ~lt :mut r2 = data  # ERROR: Cannot create mutable reference: value is already mutably borrowed
    @ref ~lt r2 = data  # ERROR: Cannot create immutable reference: value is mutably borrowed

    # Can modify via mutable reference:
    r[1] = 4
end
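The rule of "many immutable references or one mutable reference" comes down to a small amount of bookkeeping. The sketch below is a toy model with hypothetical names (`ToyBorrowState`, `borrow!`, `borrow_mut!`, `release_all!`, `owner_can_write`), not the package's internals:

```julia
# Toy borrow bookkeeping for a single owned value.
mutable struct ToyBorrowState
    immutable_borrows::Int
    mutably_borrowed::Bool
end
ToyBorrowState() = ToyBorrowState(0, false)

# Any number of immutable borrows may coexist, but not alongside a mutable one.
function borrow!(s::ToyBorrowState)
    s.mutably_borrowed && error("Cannot create immutable reference: value is mutably borrowed")
    s.immutable_borrows += 1
    return s
end

# A mutable borrow must be the only active reference.
function borrow_mut!(s::ToyBorrowState)
    s.mutably_borrowed && error("Cannot create mutable reference: value is already mutably borrowed")
    s.immutable_borrows > 0 && error("Cannot create mutable reference: value is immutably borrowed")
    s.mutably_borrowed = true
    return s
end

# End of a lifetime: every reference created inside it is released.
function release_all!(s::ToyBorrowState)
    s.immutable_borrows = 0
    s.mutably_borrowed = false
    return s
end

# The owner may only write while no references are active.
owner_can_write(s::ToyBorrowState) = s.immutable_borrows == 0 && !s.mutably_borrowed

state = ToyBorrowState()
borrow!(state)
borrow!(state)                    # multiple immutable borrows are fine
println(owner_can_write(state))   # prints false
release_all!(state)               # the lifetime ends
println(owner_can_write(state))   # prints true
borrow_mut!(state)                # a single mutable borrow is fine
```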

When you need to pass an immutable reference to a value into a function, you would modify the signature to accept a Borrowed{T} type. This is similar to the &T syntax in Rust. Likewise, BorrowedMut{T} corresponds to &mut T.

There are also the OrBorrowed{T} (basically Union{T,Borrowed{<:T}}) and OrBorrowedMut{T} aliases for easily extending a signature. Let’s say you have some function:

struct Bar{T}
    x::Vector{T}
end

function foo(bar::Bar{T}) where {T}
    sum(bar.x)
end

Now, you’d like to modify this so that it can accept references to Bar objects from other functions. Since foo doesn’t need to mutate bar, allowing immutable references as arguments is a nice way to ensure your code is parallel-safe. We can modify this as follows:

function foo(bar::OrBorrowed{Bar{T}}) where {T}
    sum(bar.x)
end

Now, we can modify our calling code (which might be multithreaded) to be something like:

@own :mut bar = Bar([1, 2, 3])

@lifetime lt begin
    @ref ~lt r1 = bar
    @ref ~lt r2 = bar
    
    @own tasks = [
        Threads.@spawn(foo(r1)),
        Threads.@spawn(foo(r2))
    ]
    @show map(fetch, @take!(tasks))
end

# After lifetime ends, we can modify `bar` again

The immutable references ensure that (a) the original object cannot be modified during the lifetime, and (b) there are no mutable references active, giving us confidence in accessing the same data from multiple threads.
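To see why a Union alias like the OrBorrowed{T} described above lets one method serve both plain values and references, here is a self-contained toy version. `ToyBorrowed` and `ToyOrBorrowed` are stand-ins invented for this sketch; the real types carry lifetime and ownership information that is omitted here:

```julia
# Toy stand-in for an immutable reference wrapper.
struct ToyBorrowed{T}
    value::T
end

# Either a plain T, or a borrowed wrapper around (a subtype of) T.
const ToyOrBorrowed{T} = Union{T,ToyBorrowed{<:T}}

# Forward property access to the wrapped value so `bar.x` works on a reference.
function Base.getproperty(b::ToyBorrowed, name::Symbol)
    return getproperty(getfield(b, :value), name)
end

struct Bar{T}
    x::Vector{T}
end

# One method accepts both `Bar` and a borrowed `Bar`.
foo(bar::ToyOrBorrowed{Bar{T}}) where {T} = sum(bar.x)

bar = Bar([1, 2, 3])
println(foo(bar))                # prints 6
println(foo(ToyBorrowed(bar)))   # prints 6
```

Widening the signature this way keeps existing callers working, since a plain `Bar` still matches the first branch of the Union.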

Automatic Ownership

Lastly, there’s also the @managed macro that uses Cassette.jl overdubbing to automatically move owned values. This is highly experimental and with Cassette.jl unmaintained, I will probably remove this. But just so you can see it:

This block can be used to perform borrow checking automatically. It basically transforms all functions, everywhere, to perform @take! on any function call that takes Owned{T} or OwnedMut{T} arguments (or any properties of such arguments).

struct Particle
    position::Vector{Float64}
    velocity::Vector{Float64}
end

function update!(p::Particle)
    p.position .+= p.velocity
    return p
end

With @managed, you don’t need to manually move ownership:

julia> using BorrowChecker.Experimental: @managed

julia> @own :mut p = Particle([0.0, 0.0], [1.0, 1.0])
       @managed begin
           update!(p)  # p::OwnedMut{Particle} is automatically unwrapped
       end;

julia> p
[moved]

I think this direction is interesting but the tools are probably too unstable at the moment so I recommend using the manual API. But perhaps in the future this sort of thing could be combined with the improving code escape analysis (cc @Mason) to automatically generate the @lifetime blocks, which is similar to what Rust’s compiler does. Perhaps the OrBorrowed{T} arguments could even be grafted onto arguments automatically somehow.

Disabling the System

Something that has been useful for DispatchDoctor.jl was allowing the debugging layer to be disabled during production usage. We do something similar here:

module MyLibrary
    using BorrowChecker: disable_by_default!
    disable_by_default!(@__MODULE__)

    #= rest of library afterwards =#
end

To then enable the borrow checker in a test, you would add the following option to your test/Project.toml file:

[preferences.MyLibrary]
borrow_checker = true

This is the reason why all of the above syntax just wraps regular ol’ Julia code, so that when you write

@own x = [1, 2, 3]
@clone :mut y = x

@own output = foo!(@take!(y))

@lifetime lt begin
    @ref ~lt r = x
    println("sum:", sum(r))
end

and disable the borrow checker, the macros will evaporate to

x = [1, 2, 3]
y = maybe_deepcopy(x)  # [only deepcopies if x has any mutable component]

output = foo!(y)

let
    r = x
    println("sum:", sum(r))
end

This leaves out the overhead of the runtime borrow checker.
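The "evaporating" behavior can be sketched with a macro that consults a flag at expansion time. This is a toy illustration under assumed names (`TOY_CHECKER_ENABLED`, `ToyWrapper`, `@toy_own`); the actual package reads its setting through Preferences.jl rather than a hard-coded constant:

```julia
# Toy compile-time switch, standing in for a preference loaded at precompile time.
const TOY_CHECKER_ENABLED = false

# Toy stand-in for the owned-value wrapper.
struct ToyWrapper{T}
    value::T
end

# `@toy_own x = expr` expands to a plain assignment when the checker is disabled,
# and to a wrapped assignment when it is enabled.
macro toy_own(ex)
    Meta.isexpr(ex, :(=)) || error("expected `@toy_own name = value`")
    name, value = ex.args
    if TOY_CHECKER_ENABLED
        return esc(:($name = ToyWrapper($value)))
    else
        return esc(:($name = $value))
    end
end

@toy_own x = [1, 2, 3]
println(x isa Vector{Int})   # prints true: no wrapper was created
```

Because the branch is taken while the macro expands, the disabled path generates exactly the code you would have written by hand, with no runtime check left behind.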

Feedback Requested

I’d particularly appreciate feedback on what people think about the macro syntax. The package is currently in development on GitHub at MilesCranmer/BorrowChecker.jl. I plan to make an official announcement once I’ve incorporated community feedback and polished the interface.

Looking forward to hearing people’s thoughts!



(true origin story)


Congratulations! BorrowChecker.jl is featured on Hacker News.


Here’s what adapting to it looks like in practice:

For example:

with other necessary changes like

 function HallOfFame(
-     options::AbstractOptions, dataset::Dataset{T,L}
+     options::OrBorrowed{AbstractOptions}, dataset::OrBorrowed{Dataset{T,L}}
 ) where {T<:DATA_TYPE,L<:LOSS_TYPE}

or

 function strip_metadata(
     ex::AbstractExpression,
-    options::AbstractOptions,
+    options::OrBorrowed{AbstractOptions},
-    dataset::Dataset{T,L},
+    dataset::OrBorrowed{Dataset{T,L}},
 ) where {T,L}

to let these functions take immutable references to the passed values.


I’m wondering if this OrBorrowed is good enough or if I should repurpose @ref for types too, like

    options::@ref(AbstractOptions),

or for mutable

    ex::@ref(:mut, AbstractExpression),

but not sure. I feel like this might be making it look more complicated than it is.


Congratulations!

Out of curiosity, will BorrowChecker impact runtime performance?


For identical code, it will worsen performance due to the extra wrappers and tracking of object lifetimes and owners. How much depends on the specific code. However:

  1. You should normally use disable_by_default!(@__MODULE__) in a package. Then you only need to have the borrow checker on during testing (via Preferences.jl).
  2. Using BorrowChecker might allow you to avoid a copy where you didn’t actually need one, because it gives you confidence a variable won’t be mutated. In this case, it would indirectly improve performance via code changes.

But for identical code it won’t improve performance, unless we set up automatic finalisations at the cleanup! stage of a lifetime, which would let us skip the GC sometimes. But I doubt it would have much of a benefit.

Maybe @Mason can give some tips on EscapeAnalysis. There could potentially be other ways to skip the GC sometimes. (Like how Rust does automatic memory management based only on escape analysis => object lifetimes)


Out of curiosity, and as a relative newcomer to Julia, does Julia have a significant memory leak issue that isn’t remedied by garbage collection?

Julia is a fairly normal GCed language here. Having a GC means (absent language bugs) that the memory leaks that exist in your program are your fault. However, programming languages generally don’t stop you from writing programs that use more memory than you intended to, and programs that hold on to more and more memory over time are a fairly common logic bug.


In other words, Oscar_Smith is distinguishing syntactic and semantic garbage. Syntactic garbage is unreachable (from some point in your program via any references); this is what garbage collectors deal with. Semantic garbage includes syntactic garbage, but it also includes reachable memory that the program just won’t access for the rest of its lifetime; garbage collectors don’t deal with this because identifying it is undecidable. Even when you can prove you won’t ever touch a reference again in a segment of code, an interactive language like Julia can’t assume you won’t execute more code that does in the same session. It is thus “your fault”, or rather your responsibility to make sure references end with local scopes or to remove obsolete objects from persistent global variables.
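A typical instance of semantic garbage in Julia is a global cache that only ever grows. The sketch below uses made-up names (`RESULT_CACHE`, `expensive`) to show memory that stays reachable, and therefore uncollectable, until the program drops the references itself:

```julia
# Hypothetical cache: every entry stays reachable through the global, so the GC
# can never reclaim it, even if the program never reads it again.
const RESULT_CACHE = Dict{Int,Vector{Float64}}()

# Memoized computation: results are stored in the global cache on first use.
function expensive(n::Int)
    return get!(RESULT_CACHE, n) do
        zeros(n)
    end
end

for n in 1:1000
    expensive(n)
end
println(length(RESULT_CACHE))   # prints 1000

empty!(RESULT_CACHE)            # dropping the references makes the entries collectable
println(length(RESULT_CACHE))   # prints 0
```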

Languages designed to be garbage-collected generally use precise garbage collectors; that is, they are intended to always identify live references and unreachable memory. If a collection frees reachable memory or fails to free unreachable memory when checked, then it’s a bug. Since you asked this on a topic about Rust’s borrow checker and more generally ownership, it’s worth pointing out that safe Rust isn’t designed to free all unreachable memory, or in their words:

Preventing memory leaks entirely is not one of Rust’s guarantees, meaning memory leaks are memory safe in Rust.

Granted, that distinction often surprises people, but it’s true that safe Rust promises to stop you from reading and writing memory at the wrong places. In practice, Rust’s automatic memory management is also solid enough that it’d be a shock to accidentally cause an unreachable memory leak. Off the top of my head, the “easiest” way is a cycle of (strong) references, but Rust’s ownership makes it hard to just stumble into it. There are also ways to accomplish some goals at negligible cost to avoid dealing with reference cycles, e.g. a linked data structure implemented as a vector referencing all the nodes that store the vector’s indices rather than fully independent nodes that reference each other.
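The index-based layout mentioned above can be sketched as follows. It is written in Julia for consistency with the rest of the thread, even though the motivation applies mainly to Rust, and `Node`, `IndexList`, `build_list`, and `collect_values` are hypothetical names:

```julia
# Nodes refer to each other by index into one vector instead of by reference.
struct Node
    value::Int
    next::Int   # index of the next node in `nodes`, or 0 for "no next node"
end

struct IndexList
    nodes::Vector{Node}
end

# Build a singly linked list where each node points at the following index.
function build_list(values::Vector{Int})
    n = length(values)
    nodes = [Node(values[i], i < n ? i + 1 : 0) for i in 1:n]
    return IndexList(nodes)
end

# Walk the list by following indices until the 0 sentinel is reached.
function collect_values(list::IndexList)
    out = Int[]
    i = isempty(list.nodes) ? 0 : 1
    while i != 0
        node = list.nodes[i]
        push!(out, node.value)
        i = node.next
    end
    return out
end

list = build_list([10, 20, 30])
println(collect_values(list))   # prints [10, 20, 30]
```

Since the only owner of the nodes is the vector, dropping the vector frees everything at once, and links between nodes never form a cycle of references.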
