[ANN] Automatic borrow checking at the compiler level, with BorrowChecker.Auto

image

Presenting… BorrowChecker.Auto!

Happy to share BorrowChecker.jl’s Auto layer, which comes with a macro BorrowChecker.@auto, that forms an automatic best-effort Rust-like borrow checker that runs directly on Julia’s SSA-form IR without the need for extra syntax.

For example:

julia> import BorrowChecker

julia> BorrowChecker.@auto function f()
           x = [1, 2, 3]
           y = x
           push!(x, 4)
           return y
       end
f (generic function with 1 method)

julia> f()  # errors
ERROR: BorrowCheckError for specialization Tuple{typeof(f)}

  method: f() @ Main REPL[7]:1

  [1] stmt#7: cannot perform write: value is aliased by another live binding      at REPL[7]:4
        2         x = [1, 2, 3]
        3         y = x
      > 4         push!(x, 4)
        5         return y
        6     end

      stmt: Main.push!(%5, 4)

The goal of this macro is a drop-in tripwire for existing Julia code: flag surprising aliasing / escape patterns during development, without changing program behavior when valid.

@auto is designed to catch two big classes of hard-to-debug issues, loosely inspired by Rust’s ownership and lifetime model:

  • Aliasing violations :prohibited:: : mutating a value while another live binding may observe that mutation.
  • Escapes / “moves” :prohibited:: storing a mutable value somewhere that outlives the current scope (e.g. a global cache / a field / a container), then continuing to reference it locally.

Avoiding these issues sometimes requires refactoring or redesign, but this typically leads to safer, more robust code.

One nice difference here is that in Rust, you would need to explicitly indicate mut to declare something is mutable, whereas here, mutability is inferred! This is done by analysing the SSA-form IR operations (for example, a Core.setfield call in the SSA-form IR would trigger this). The library of operations it need to register is surprisingly small because of working at this abstraction level (this level of analysis was inspired by Mooncake which does something similar for AD).

When BorrowChecker.Auto cannot determine what a call does, it will be conservative (and may throw false positives) and might assume mutation/escapes.

How it works

When you write:

BorrowChecker.@auto function f(args...)
    # ...
end

the macro rewrites the function so that:

  1. On entry, it runs a borrow check for the current method specialization and caches the result.
  2. The checker asks Julia for the function’s typed compiler IR (the lowered form the compiler, after initial cleanup but before optimization).
  3. It walks that IR and tracks two key things:
    • Which bindings may refer to the same mutable object (aliasing).
    • Which operations write to a tracked object and/or cause it to escape (which gets treated like a move).
  4. When it sees an operation that would be illegal under Rust-like rules (e.g. “write while aliased”, or “use after escape”), it throws a BorrowCheckError with a source-level-ish diagnostic.

Aliasing Detection

BorrowChecker.jl’s @auto macro can detect when values are modified through aliased bindings, and throw an error:

julia> import BorrowChecker

julia> BorrowChecker.@auto function f()
           x = [1, 2, 3]
           y = x
           push!(x, 4)
           return y
       end
f (generic function with 1 method)

julia> f()  # errors

This will generate a helpful error pointing out the location of the borrow check violation, and the statement that violated the rule:

ERROR: BorrowCheckError for specialization Tuple{typeof(f)}

  method: f() @ Main REPL[7]:1

  [1] stmt#7: cannot perform write: value is aliased by another live binding      at REPL[7]:4
        2         x = [1, 2, 3]
        3         y = x
      > 4         push!(x, 4)
        5         return y
        6     end

      stmt: Main.push!(%5, 4)

To fix it, simply copy the value, which will avoid the error:

julia> BorrowChecker.@auto function f()
           x = [1, 2, 3]
           y = copy(x)
           push!(x, 4)
           return y
       end
f (generic function with 1 method)

julia> f()
3-element Vector{Int64}:
 1
 2
 3

You could equivalently just return the x value, which would end the lifetime of y early, and so permit x to be mutated again:

julia> BorrowChecker.@auto function f()
           x = [1, 2, 3]
           y = x            # no copy, but its ok, since we dont use y again!
           push!(x, 4)
           return x
       end
f (generic function with 1 method)

julia> f()
3-element Vector{Int64}:
 1
 2
 3

Escape Detection

Much like Rust’s ownership model, BorrowChecker.jl’s @auto macro attempts to infer when values escape their scope (moved/consumed) and throw an error if they are used afterwards.

julia> const CACHE = Dict()
Dict{Any, Any}()

julia> cache(x) = (CACHE[x] = 1; nothing)
foo (generic function with 1 method)

julia> BorrowChecker.@auto function bar()
           x = [1, 2]
           cache(x)
           return x
       end
bar (generic function with 1 method)

julia> bar()  # errors

This generates the following error:

ERROR: BorrowCheckError for specialization Tuple{typeof(bar)}

  method: bar() @ Main REPL[13]:1

  [1] stmt#6: value escapes/consumed by unknown call; it (or an alias) is used later      at REPL[13]:3
        1     BorrowChecker.@auto function bar()
        2         x = [1, 2]
      > 3         cache(x)
        4         return x
        5     end

      stmt: Main.cache(%5)

Why is this an error? Because x was stored as a key in the cache, but is mutable externally. Furthermore, it is returned by bar! This is a violation of borrowing rules. Once the value gets stored in the cache, its ownership is transferred to the cache, and is no longer accessible by bar. So this code would illegal.

How can we fix it? We have two options. The first is we can copy the value before storing it:

julia> BorrowChecker.@auto function bar()
           x = [1, 2]
           cache(copy(x))
           return x
       end
bar (generic function with 1 method)

julia> bar()  # ok

We no longer have access to the object created by copy(x), so the borrow check passes.
Alternatively, we can use immutable objects, which are safe to pass around:

julia> BorrowChecker.@auto function bar()
           x = (1, 2)
           cache(x)
           return x
       end
bar (generic function with 1 method)

julia> bar()  # ok

No copies needed for immutables.


This macro is still highly experimental. There will be false positives and false negatives. Please share those when you find them! The more examples in the test suite, the better.

Interested to hear what people think.

64 Likes

Very cool! Impressive that you can hack the compiler to do this kind of stuff (both impressive that you have those skills, and that the compiler enables it).

Could you go full Rust mode and put @auto on an entire module?

4 Likes

That is the best logo ever.

18 Likes

One option would be to do something like DispatchDoctor.jl and propagate the macro through all functions in the library. Or (ii) we could do something like AllocCheck.jl and recursively analyze all functions in the stack and assert no borrow violations anywhere.

I did try (ii) but it seemed slow when I did it. I think I just need to figure out how to cache things better though.

Also if anybody has any ideas for how to make the borrow check evaporate after it passes, that would be useful. I’m not sure if there’s a way to actually incorporate the check into the compiler passes or not (GPUCompiler.jl?).

1 Like

Not sure if this is sensible or it is too simplistic, but maybe you could add a version of it which applies the macro only if the code is in a test environment or the code is run interactively. It doesn’t cover all use cases but this may be useful?

Yeah, we could do that. Another option is something via Preferences.jl, similar to what DispatchDoctor.jl does.

Though I benchmarked it today and it is only around 70 ns for the cache lookup. For the kinds of functions people would realistically annotate, that might be fine. You would not bother annotating something like kernel(x) = sin(x), but for longer, branch-heavy functions the overhead is likely negligible.

julia> BC.@auto f(x) = x
f (generic function with 1 method)

julia> @btime f(1.0)
  69.970 ns (0 allocations: 0 bytes)

We could probably get it down even more with per-thread caching to avoid the cache lock.

I’m surprised it can detect this tbh. Not shabby!

julia> import BorrowChecker as BC

julia> BC.@auto function make_updater(x)
           f = () -> sum(x)   # closure captures x
           push!(x, 4)        # mutate x
           return f
       end
make_updater (generic function with 1 method)

julia> f = make_updater([1, 2, 3])
ERROR: BorrowCheckError for specialization Tuple{typeof(make_updater), Vector{Int64}}

  method: make_updater(x) @ Main REPL[2]:1

  [1] stmt#14: cannot perform write: value is aliased by another live binding      at REPL[2]:3
        1     BC.@auto function make_updater(x)
        2         f = () -> sum(x)   # closure captures x
      > 3         push!(x, 4)        # mutate x
        4         return f
        5     end

      stmt: Main.push!(_2, 4)

4 Likes

This is very cool stuff!

I have been pondering something similar but different - a macro that implements a Rust-like DSL, which would be more like a statically-typed language. Your macro is a very neat solution and it’s kinda amazing you can do this just from Julia’s IR.

In your initial example, how does it know that push! will modify x? Is it recursing into the applicable code for push! and finding some core function calls that must imply mutation (like writing to a Memory?

Presumably that example would have passed fine if you did sum(x) instead, since that’s only a read?

Is @auto recursive in the sense that any functions called by the annotated function are also borrow-checked?

What exactly is the ownership model? Does it all just reduce down to saying that structs and Memory own their fields/elements, and since basically everything is built from those, it implies that vectors and dicts and things also own their elements?

2 Likes

Thanks!

Curiously this is also the first thing I thought of! See the second half of the README.md GitHub - MilesCranmer/BorrowChecker.jl: A borrow checker for Julia. Though it does work, it ended up feeling quite invasive and heavy:

using BorrowChecker: @own, @lifetime, @ref

@own :mut data = [1, 2, 3]

@lifetime lt begin
    @ref ~lt r = data
    @ref ~lt r2 = data  # Can create multiple _immutable_ references!
    @test r == [1, 2, 3]

    # While ref exists, data can't be modified:
    data[1] = 4 # ERROR: Cannot write original while immutably borrowed
end

# After lifetime ends, we can modify again!
data[1] = 4

Though I have to say the immutable reference feature is pretty nifty:

mutable struct A
    x::Int
end

@own :mut a = A(0)
for _ in 1:10
    a.x += 1
end
# Move it to an immutable:
@own a_imm = a

a_imm.x += 1
# ERROR: Cannot write to immutable

a.x += 1
# ERROR: Cannot use a: value has been moved

The way this works is that getproperty creates a LazyAccessor{T} objects which delays reading/writing data until the last moment, which is when it actually checks the object’s permissions. During the read, it creates a temporary lifetime and generates either immutable or mutable references. (Which might void certain read/write behavior in the object).

Anyways, it never felt ergonomic enough for regular use. Even Rust can infer lifetimes automatically so it felt painful to write out these @lifetime blocks all over my code.

Luckily, now the tide has turned and BorrowChecker.@auto has a feature that even Rust lacks - inferred mutability! In Rust, you always need to explicitly declare &mut on an argument, here, it just gets inferred by the analysis pass.

1 Like

It analyzes the SSA-form IR for every function and infers three effects:

  1. Does the function modify its arguments, and, if so, which one?
  2. Are any of the return values aliased to any of the arguments, and, if so, which one?
  3. Can any arguments escape (e.g., being stored in a global cache), and, if so, which one?

From these, it figures out what bindings ought to be readable and/or writeable at each point in the code. It applies the same rules as Rust for this: among the live bindings to an object, you can have multiple readers OR one writer, but never both.

Basically it looks like this. Say I want to run the borrow checker on f. So we examine its IR:

julia> using InteractiveUtils: @code_ircode

julia> BorrowChecker.@auto function f()
           x = [1, 2, 3]
           y = x
           push!(x, 4)
           return y
       end
f (generic function with 1 method)

julia> @code_ircode optimize_until="compact 1" f()
 1 ─      nothing::Core.Const((f,))                                              │
 │        nothing::Core.Const(Tuple{typeof(f)})                                  │
 │          dynamic BorrowChecker.Auto.__bc_assert_safe__(Tuple{typeof(f)})::Any │
 │   %4 =   dynamic Base.vect(1, 2, 3)::Vector{Int64}                            │╻ macro expansion
 │   %5 =   dynamic BorrowChecker.Auto.__bc_bind__(%4)::Vector{Int64}            ││
 │   %6 =   dynamic BorrowChecker.Auto.__bc_bind__(%5)::Vector{Int64}            ││
 │          dynamic Main.push!(%5, 4)::Any                                       ││
 └──      return %6                                                              ││
  => Vector{Int64}

For each dynamic call, we have to infer the effects (does it mutate? does it allow escapes? does it return an alias?). Let’s look at Main.push!:

julia> @code_ircode optimize_until="compact 1" push!([1, 2, 3], 4)
1288 1 ─      nothing::Core.Const(Int64)                                                          │
     └──      nothing::Core.Const(true)                                                           │
     2 ─      nothing::Nothing                                                                    │
1289 3 ─ %4 =   dynamic Base._push!(_2, _3)::Vector{Int64}                                        │
     └──      return %4                                                                           │
      => Vector{Int64}

Ok, recurse more

julia> @code_ircode optimize_until="compact 1" Base._push!([1, 2, 3], 4)
1292 1 ─        dynamic Base._growend!(_2, 1)::Any                                                │
1293 │   %2 =   dynamic Base.length(_2)::Int64                                                    │
     │          dynamic Base.__safe_setindex!(_2, _3, %2)::Any                                    │
1294 └──      return _2                                                                           │
      => Vector{Int64}

We’ve got three functions here: _growend!, length, and __safe_setindex!. We have to inspect each one!

julia> x = [1, 2, 3];

julia> @code_ircode optimize_until="compact 1" Base._growend!(x, 1)
1121 1 ─ %1  =   dynamic Base.Int(_3)::Int64                                                      │
1122 │   %2  =   dynamic (%1 >= 0)::Bool                                                          │
     └──       goto #6 if not %2                                                                  │
     2 ─       nothing::Nothing                                                                   │
1123 3 ─ %5  =   dynamic Base.getproperty(_2, :ref)::MemoryRef{Int64}                             │
1124 │   %6  =   dynamic Base.getproperty(%5, :mem)::Memory{Int64}                                │
1125 │   %7  =   dynamic Base.length(%6)::Int64                                                   │
1126 │   %8  =   dynamic Base.length(_2)::Int64                                                   │
1127 │   %9  =   dynamic (%8 + %1)::Int64                                                         │
1128 │   %10 =   builtin Base.memoryrefoffset(%5)::Int64                                          │
1129 │   %11 =   dynamic (%10 + %9)::Int64                                                        │
     │   %12 =   dynamic (%11 - 1)::Int64                                                         │
1130 │   %13 =   dynamic (%7 < %12)::Bool                                                         │
     └──       goto #5 if not %13                                                                 │
1131 4 ─       nothing::Core.Const(Vector{Int64})                                                 │
     │         nothing::Core.Const(Int64)                                                         │
     │         nothing::Core.Const(Int64)                                                         │
     │         nothing::Core.Const(Int64)                                                         │
     │         nothing::Core.Const(Int64)                                                         │
     │         nothing::Core.Const(Int64)                                                         │
     │         nothing::Core.Const(Memory{Int64})                                                 │
     │         nothing::Core.Const(MemoryRef{Int64})                                              │
     │         nothing::Core.Const(Base.var"#_growend!##0#_growend!##1"{Vector{Int64}, Int64, Int64, Int64, Int64, Int64, Memory{Int64}, MemoryRef{Int64}})
     │   %24 = %new(Base.var"#_growend!##0#_growend!##1"{Vector{Int64}, Int64, Int64, Int64, Int64, Int64, Memory{Int64}, MemoryRef{Int64}}, _2, %12, %10, %9, %8, %7, %6, %5)::Base.var"#_growend!##0#_growend!##1"{Vector{Int64}, Int64, Int64, Int64, Int64, Int64, Memory{Int64}, MemoryRef{Int64}}
     └──         dynamic (%24)()::MemoryRef{Int64}                                                │
1159 5 ┄ %26 =   builtin Core.tuple(%9)::Tuple{Int64}                                             │
     │           builtin Base.setfield!(_2, :size, %26)::Tuple{Int64}                             │
1160 └──       return nothing                                                                     │
1122 6 ─ %29 =   dynamic Base.ArgumentError("grow requires delta >= 0")::Core.Const(ArgumentError("grow requires delta >= 0"))
     │           builtin Base.throw(%29)::Union{}                                                 │
     └──       unreachable                                                                        │
      => Nothing

Ok. Now we see some functions that are mutating! This one:

     │           builtin Base.setfield!(_2, :size, %26)::Tuple{Int64}                             │

If we look at our registered list of functions: BorrowChecker.jl/src/auto/defs.jl at d1c650c33f9c7d96a3385eae736b793c8f7f9079 · MilesCranmer/BorrowChecker.jl · GitHub, we can compare against these. Base.getproperty returns an alias to the object. And later on we see that it applies Base.setfield!, also in our registered list, which we declared as mutating!

That means we can infer that _growend! mutates its first argument. Therefore, _push! mutates its first argument. Therefore, push! mutates its first argument. Therefore, x is being mutated in the original function f!

However, if we look at the function:

julia> BorrowChecker.@auto function f()
           x = [1, 2, 3]
           y = x
           push!(x, 4)
           return y
       end

we will see that y is still “live” during the time we mutating x, and they refer to the same object. Therefore, that’s a BorrowCheckError!

That’s basically how it works. (Though liveness it also figures out at a lower level, not on the Julia syntax.)

Since it analyzes at the SSA-form IR stage, the library of registered mutating/aliasing functions is quite small, making our job easier.

9 Likes

They have their effects (mutating/aliasing/escaping) inferred all the way down. Functions are walked recursively and their effects are cached.

However, those functions are not themselves analyzed for ownership violations, as that would be too expensive. I turned this on originally but it blew up from doing borrow checker analysis on basically half of Base. Which we could certainly do… but maybe not by default.

At the moment I’m debating the right API for this, as it would be nice to only have to annotate the entrypoint and have your entire module checked. What do you think about something like this?

BorrowChecker.@auto recurse=false foo() = bar()  # default - only top-level checked
BorrowChecker.@auto recurse=true foo() = bar()   # only same-root module callees
BorrowChecker.@auto recurse=:all foo() = bar()   # recurse into everything (Base too)
4 Likes

Absolutely fantastic!

1 Like

Thank you for the explanation.

Wow is that a complete list of builtins? It’s short!

What happens in poorly inferred code if you encounter a function call with runtime dynamic dispatch? Do you assume that it takes ownership of all of its args? Or analyse all the possibly applicable methods?

Linking it all back to the original code, particularly the original variable names, seems tricky?

It would be nice if we could see an annotated version of the function code, annotated with information from this analysis (showing which vars alias which, which are mutable, etc).

This all feels like it would nicely complement JET analysis. If the two static analysis tools were combined I wonder if you could do more.

How about something like scope=:disabled, scope=:function (default), scope=:module (containing module), scope=:user (all non-stdlib), scope=:all?

2 Likes

This is something the user can configure but the default is to assume the conservative option, which is that it takes ownership.

1 Like

Working at the SSA-form IR level makes this much easier (and yes that’s the full set of registered functions!). It’s a similar story for the AD package Mooncake.jl which ends up being simpler in design than Enzyme.jl, since the former operates on SSA-form IR while the latter operates on the more complex LLVM IR.

Also would be interested in this. (PRs from anyone welcome!). I don’t know how JETLS.jl works but that seems like the right route for forwarding this information.

1 Like

Also the speed looks much better now, probably near the limits of what we can do until Julia lets us inject a custom compiler pass (or until BorrowChecker.@auto becomes part of Core.Compiler - which might make sense given EscapeAnalysis is already there)

julia> @btime f(1.0)
  23.008 ns (0 allocations: 0 bytes)
1.0
1 Like

Just for future reference. LLVM IR is a SSA like IR. The extra complexities come because it has to generate code that interacts with the Julia runtime (GC, dispatch etc) and that requires a lower level integration than mooncake, which can care a little less about those things.

3 Likes

If you see a significant difference in performance from living outside of the Compiler that’s probably a bug/(holding it wrong) because the compiler isn’t particularly special in this regard.

1 Like

Got it! What is the precise name for the SSA-form IR I’ve been referring to? Is it simply “Julia IR” or something?

1 Like