Blacklisting a type from being captured in functions

In support of BorrowChecker.jl I am trying to figure out if there’s a way I can throw an error if an instance of a particular type gets captured in a function.

For example, if I have a type

struct NoCapture{T}
    x::T
end

and then write

var = NoCapture(Ref(1))

is there a way I can prevent the following?

f() = var.x[]

Basically I’m wondering if there’s any hook I can specify in the Julia internals to prevent a particular type from being captured in a function.

The aim is to support the goals of BorrowChecker.jl. Preventing function capturing on some of the package’s types would help improve memory safety.

No, variable capturing is a syntactic transform that happens long before any types of variables are known.


What about via Cassette.jl? Is there some internal hook that gets called on captured variables? Then I could just override that for my types.

Or would the EscapeAnalysis pass (described in the Julia documentation) have some metadata that marks a variable as “captured from outside the scope”?

That’s impossible. Variable capturing is not distinct from local variables being present in nested local scopes. You can’t opt in or out of it depending on the type of the assigned instance at runtime: a variable is either always captured or always a new local. This is decided at parse-time, so by the time the compiler runs it is too late to do anything about it. What you can do is assert that the captured variable’s value is not an instance of the forbidden type, and error at runtime if it is. If the compiler can infer that the captured variable will always have that type, you could also emit a static warning.
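A minimal sketch of such a runtime assertion, assuming that closures are structs whose fields hold the captured variables (an implementation detail that may change between Julia versions). The helper name assert_no_forbidden_captures is hypothetical and is not part of BorrowChecker.jl:

```julia
struct NoCapture{T}
    x::T
end

# Hypothetical helper: walk the fields of a closure object and error if any
# captured value is a NoCapture. Relies on closure internals (assumption).
function assert_no_forbidden_captures(f)
    for name in fieldnames(typeof(f))
        value = getfield(f, name)
        # Captured variables that get reassigned are wrapped in a Core.Box.
        if value isa Core.Box
            isdefined(value, :contents) || continue
            value = value.contents
        end
        if value isa NoCapture
            error("closure captures `$name`, which is a NoCapture")
        end
    end
    return f
end

function make_closures()
    var = NoCapture(Ref(1))
    n = 41
    bad = () -> var.x[]   # captures a NoCapture
    ok = () -> n + 1      # captures only an Int
    return bad, ok
end

bad, ok = make_closures()
assert_no_forbidden_captures(ok)      # returns `ok` unchanged
# assert_no_forbidden_captures(bad)   # throws an ErrorException
```

This only sees variables captured from an enclosing local scope. A global such as var in f() = var.x[] is not stored in the function object, so the check would not notice it.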

I don’t think this is true?

Example:

function f(x)
    (() -> (x = x + 1; nothing))()
    return x
end

This is type unstable:

julia> @code_warntype f(1)
MethodInstance for f(::Int64)
  from f(x) @ Main REPL[1]:1
Arguments
  #self#::Core.Const(Main.f)
  x@_2::Int64
Locals
  #1::var"#1#2"
  x@_4::Union{}
  x@_5::Union{Int64, Core.Box}
Body::Any
1 ─       (x@_5 = x@_2)
│   %2  = x@_5::Int64
│         (x@_5 = Core.Box(%2))
│   %4  = Main.:(var"#1#2")::Core.Const(var"#1#2")
│   %5  = x@_5::Core.Box
│         (#1 = %new(%4, %5))
│   %7  = #1::var"#1#2"
│         (%7)()
│   %9  = x@_5::Core.Box
│   %10 = Core.isdefined(%9, :contents)::Bool
└──       goto #3 if not %10
2 ─       goto #4
3 ─       Core.NewvarNode(:(x@_4))
└──       x@_4
4 ┄ %15 = x@_5::Core.Box
│   %16 = Core.getfield(%15, :contents)::Any
└──       return %16

whereas code without the closure:

julia> function f(x)
           x = x + 1
           return x
       end
f (generic function with 1 method)

is perfectly stable:

julia> @code_warntype f(1)
MethodInstance for f(::Int64)
  from f(x) @ Main REPL[4]:1
Arguments
  #self#::Core.Const(Main.f)
  x@_2::Int64
Locals
  x@_3::Int64
Body::Int64
1 ─      (x@_3 = x@_2)
│   %2 = x@_3::Int64
│        (x@_3 = %2 + 1)
│   %4 = x@_3::Int64
└──      return %4

So clearly the compiler is doing something differently here. I just want to latch onto that internal detail. (At this point I just want a proof of concept; I don’t care how hacky it is.)

Is there a way I can forcefully edit methods in Core?

Looking at the lowered code here:

julia> function f()
           @bind x = 1
           # x::Bound
           function g()
               x = x + 1
           end
           g()
           return x
       end
f (generic function with 1 method)

julia> @code_warntype f()
MethodInstance for f()
  from f() @ Main REPL[13]:1
Arguments
  #self#::Core.Const(Main.f)
Locals
  g::var"#g#3"
  x@_3::Core.Box
  x@_4::Union{}
Body::Any
1 ─       (x@_3 = Core.Box())
│   %2  = Main.Val(false)::Core.Const(Val{false}())
│   %3  = (BorrowChecker.SemanticsModule.bind)(1, $(QuoteNode(1)), :x, %2)::Core.PartialStruct(Bound{Int64}, Any[Core.Const(1), Bool, Int64, Core.Const(:x)])
│   %4  = x@_3::Core.Box
│         Core.setfield!(%4, :contents, %3)
│   %6  = Main.:(var"#g#3")::Core.Const(var"#g#3")
│   %7  = x@_3::Core.Box
│         (g = %new(%6, %7))
│   %9  = g::var"#g#3"
│         (%9)()
│   %11 = x@_3::Core.Box
│   %12 = Core.isdefined(%11, :contents)::Bool
└──       goto #3 if not %12
2 ─       goto #4
3 ─       Core.NewvarNode(:(x@_4))
└──       x@_4
4 ┄ %17 = x@_3::Core.Box
│   %18 = Core.getfield(%17, :contents)::Any
└──       return %18

I feel like I might actually be able to pull this off by overriding this Core.setfield! method for Core.Box:

julia> Core.setfield!(::Core.Box, ::Symbol, ::Bound) = error("Not allowed!")
ERROR: cannot add methods to a builtin function

It looks like the Julia IR is using this method to store the variable inside the closure.
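For readers unfamiliar with Core.Box, here is a small sketch that pokes at it by hand. Core.Box is an internal type, so treat everything here as an observation about current Julia versions rather than a stable API:

```julia
# An empty box, like `x@_3 = Core.Box()` in the typed code above.
box = Core.Box()
was_defined_before = isdefined(box, :contents)   # false: nothing stored yet

# This is the same builtin call the IR uses to store a captured variable.
setfield!(box, :contents, 1)
first_value = box.contents

# The single field is untyped, which is why reads from it infer as Any.
contents_type = fieldtype(Core.Box, :contents)

# Because the field is untyped, any value can replace the old one.
setfield!(box, :contents, "now a string")
```

Since every boxed capture goes through setfield!(box, :contents, value), intercepting that one call is enough to see each value as it is stored.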

Like maybe I could do this via Cassette.jl or something?

:mega: WHERE THERE’S A WILL, THERE’S A WAY!!

Check this out. BorrowChecker.jl now uses Cassette.jl to overdub Core.setfield!(::Core.Box, :contents, ...) so that if you try to capture any Bound variable in a closure, an error will be thrown:

julia> using BorrowChecker

julia> function f()
           @bind x = 1
           function g()
               x = x + 1
           end
           g()
           return x
       end
f (generic function with 1 method)

julia> BorrowChecker.@managed begin
           f()
       end
ERROR: You are not allowed to capture bound variable `x` inside a closure.
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35

Works like this:

function Cassette.overdub(ctx::ManagedCtx, f, args...)
    if f == Core.setfield! &&
        length(args) == 3 &&
        args[1] isa Core.Box &&
        args[2] == :contents &&
        args[3] isa Union{Bound,BoundMut}

        symbol = args[3].symbol
        error("You are not allowed to capture bound variable `$(symbol)` inside a closure.")
    end
    #= Other overdubbing =#
end

(And before anybody says “this is hacky, don’t do this, etc.”, yes, I know. BorrowChecker.jl is not going to be standard Julia code. That’s ok though, because it’s just a debugging tool!)


Judging from your example, I don’t think you understood. I meant variable capturing:

let x = 1
   function f()
      x
   end
end

is just a local variable showing up in a nested local scope, like this:

let x = 1
  let
    x
  end
end

Your solution still captures the variable x; it just errors at runtime if the instance has one of the forbidden types. I think the borrow-checking types, and hence the guaranteed error, should still be inferable when f() is compiled, even with poor type inference of x. I’d say the sweeping prohibition is justified, since the borrow-checking types are not intended to be used like normal types. Potential problem: if you don’t reassign the variable in g, it won’t be a Core.Box, so would that let the borrow-checking types through?
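The boxed versus unboxed distinction can be checked directly by looking at the field types of the closure objects. This is a sketch with a stand-in type (Forbidden is hypothetical, not a BorrowChecker.jl type), and it relies on closure internals:

```julia
struct Forbidden   # stand-in for a borrow-checking type
    id::Int
end

function reassigning()
    x = Forbidden(1)
    g() = (x = Forbidden(2))   # x is reassigned in the closure, so it is boxed
    return g
end

function readonly()
    x = Forbidden(1)
    g() = x.id                 # x is only read, so it is captured directly
    return g
end

boxed_fields = fieldtypes(typeof(reassigning()))   # (Core.Box,)
plain_fields = fieldtypes(typeof(readonly()))      # (Forbidden,)
```

In the read-only case no Core.Box is created, so a hook on Core.setfield!(::Core.Box, :contents, ...) has nothing to intercept.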

That’s determined at parse-time with Core.Box, the same way struct X x::Any end forces the field to be type-unstable before any method compilation ever sees the type.

Yeah this is all I wanted, sorry if it didn’t make sense initially.

Regarding this edit:

Good point. Hm…

Edit: I guess it avoids the worst type of error though, so probably good for a start?

Better than nothing, but the most direct way of stopping a variable capture would be to wrap the function f() ... end definition in a macro that inspects it at parse time, with no need to run or compile g. Reproducing Julia’s variable scoping rules to find forbidden captures of @bind-flagged variables sounds way harder, though.
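A deliberately naive sketch of that syntactic approach: scan a function body for symbols introduced by @bind and report any that are mentioned inside a nested function definition. All function names here are hypothetical, and the scan ignores Julia's real scoping rules (shadowing, local declarations, keyword arguments, and so on), which is exactly the hard part mentioned above:

```julia
# Symbols introduced by `@bind x = ...` or `@bind :mut x = ...` anywhere in `ex`.
function bound_symbols!(acc::Set{Symbol}, ex)
    ex isa Expr || return acc
    if ex.head === :macrocall && ex.args[1] === Symbol("@bind")
        for arg in ex.args
            if arg isa Expr && arg.head === :(=) && arg.args[1] isa Symbol
                push!(acc, arg.args[1])
            end
        end
    end
    foreach(arg -> bound_symbols!(acc, arg), ex.args)
    return acc
end

# Recognizes `function ... end`, `args -> body`, and `g(args...) = body`.
is_function_def(ex) = ex isa Expr && (
    ex.head === :function || ex.head === :(->) ||
    (ex.head === :(=) && ex.args[1] isa Expr && ex.args[1].head === :call)
)

# True if symbol `s` appears anywhere inside `ex`.
mentions(ex, s::Symbol) = ex === s || (ex isa Expr && any(arg -> mentions(arg, s), ex.args))

# Record every bound symbol that is mentioned inside a nested function definition.
function scan!(found::Set{Symbol}, ex, bound::Set{Symbol})
    ex isa Expr || return found
    if is_function_def(ex)
        union!(found, filter(s -> mentions(ex, s), bound))
    end
    foreach(arg -> scan!(found, arg, bound), ex.args)
    return found
end

# Pass the body of the outer function, not the whole definition.
forbidden_captures(body) = scan!(Set{Symbol}(), body, bound_symbols!(Set{Symbol}(), body))

body = quote
    @bind x = 1
    @bind :mut y = 2
    z = 3
    g() = x + z      # mentions the bound variable x
    h = () -> z      # mentions only an ordinary local
end

captures = forbidden_captures(body)
```

A macro wrapped around the outer function could call forbidden_captures on the body and throw at macro-expansion time if the result is non-empty. Because @bind is never expanded inside the quote block, this sketch runs without BorrowChecker.jl loaded.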

Yeah seems tricky for sure. One good note is that BorrowChecker.jl by itself already seems to flag captured variables correctly from the existing tracking mechanism.

I’m implementing the borrow checker in SymbolicRegression.jl at the moment (see the master...borrow-checker comparison in MilesCranmer/SymbolicRegression.jl on GitHub), which results in code that sorta looks like this:

@bind init_hall_of_fame = load_saved_hall_of_fame(@take(saved_state))
@lifetime a begin
    @ref a rdatasets = datasets
    @ref a roptions = options
    @bind for j in 1:nout
        @bind :mut hof = strip_metadata(
            @take(init_hall_of_fame[j]), roptions, rdatasets[j]
        )
        @lifetime b begin
            @ref b :mut for member in hof.members[hof.exists]
                @bind score, result_loss = score_func(
                    rdatasets[j], @take(member), roptions
                )
                member.score = @take!(score)
                member.loss = @take!(result_loss)
            end
        end
        state.halls_of_fame[j] = @take(hof)
    end
end

where functions that are “borrow compatible” are modified like so:

- is_weighted(dataset::Dataset) = !isnothing(dataset.weights)
+ is_weighted(dataset::Borrowed{Dataset}) = !isnothing(dataset.weights)

which is basically equivalent to dataset: &Dataset in Rust: it prevents dataset from being modified in that function but, at the same time, lets you reference the same data from multiple threads.
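To illustrate the idea of an immutable borrow, here is a toy read-only wrapper. It is not how BorrowChecker.jl implements Borrowed; ReadOnlyView and ToyDataset are hypothetical names used only for this sketch:

```julia
mutable struct ToyDataset
    X::Matrix{Float64}
    weights::Union{Nothing,Vector{Float64}}
end

# Toy stand-in for an immutable borrow: reads are forwarded, writes are rejected.
struct ReadOnlyView{T}
    value::T
end

# Forward field reads to the wrapped object.
Base.getproperty(r::ReadOnlyView, name::Symbol) =
    getproperty(getfield(r, :value), name)

# Reject any attempt to assign a field through the view.
Base.setproperty!(r::ReadOnlyView, name::Symbol, v) =
    error("cannot modify field `$name` through a read-only view")

is_weighted(dataset::ReadOnlyView{ToyDataset}) = !isnothing(dataset.weights)

data = ToyDataset(zeros(2, 3), nothing)
view_of_data = ReadOnlyView(data)
is_weighted(view_of_data)            # false, since weights is nothing
# view_of_data.weights = [1.0, 2.0]  # throws an ErrorException
```

This toy version only blocks field assignment. It does nothing to stop in-place mutation of arrays reached through the view, which a real borrow checker also has to handle.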

In doing this, I was pleasantly surprised by an error coming from an accidental capture in a closure! I got the error

Cannot use x: value's lifetime has expired

which turned out to be from something like this:

@bind :mut tasks = Task[]
@lifetime a begin
    for i in 1:10
        @ref a x = data
        push!(tasks, Threads.@spawn f(x))  # bad!
    end
end

where x was borrowed. This means that once the @lifetime scope has ended, access to x is no longer valid. It turns out I was doing naughty referencing stuff like this without realising it, and BorrowChecker.jl correctly flagged it! :smiley:
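The failure mode can be reproduced with a toy model of lifetimes. This is a sketch under my own assumptions, not BorrowChecker.jl's implementation; Lifetime, ScopedRef, deref, and with_lifetime are hypothetical names, and a plain closure stands in for the spawned task so the example stays deterministic:

```julia
# A lifetime is just a flag that flips when its scope ends.
mutable struct Lifetime
    expired::Bool
end

# A reference that is only valid while its lifetime is alive.
struct ScopedRef{T}
    value::T
    lifetime::Lifetime
end

function deref(r::ScopedRef)
    r.lifetime.expired && error("cannot use value: lifetime has expired")
    return r.value
end

# Run `f` with a fresh lifetime and expire it when the scope exits.
function with_lifetime(f)
    lt = Lifetime(false)
    try
        return f(lt)
    finally
        lt.expired = true
    end
end

data = [1, 2, 3]

# Using the reference inside the scope is fine.
inside = with_lifetime() do lt
    r = ScopedRef(data, lt)
    sum(deref(r))
end

# Letting a closure that captured the reference escape the scope is not.
escaped = with_lifetime() do lt
    r = ScopedRef(data, lt)
    () -> sum(deref(r))
end
# escaped()   # throws: the lifetime expired when the scope ended
```

A task created with Threads.@spawn inside the scope behaves like the escaped closure whenever it runs after the scope has ended.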
