How do you call this Julia feature?

I have been using Julia for 5+ years, and today I learned about a mind-blowing feature I wasn’t aware existed.

julia> function test_internal_state(st)
    function update_st()
        st = st + 1
        nothing
    end
    update_st()
    st
end
test_internal_state (generic function with 1 method)

julia> test_internal_state(1)
2

In Python, writing it this way fails (with UnboundLocalError: local variable 'st' referenced before assignment). Hot stuff apparently.

How do you call this? So that I can google it and learn about all its drawbacks :smiley:

update_st is a closure, capturing the variable st. I think the same thing should work in Python, too; it’s just that the scoping rules are different, so you’d have to do it in a few more statements.

With the current Julia implementation, such closures are not compiled as efficiently as you might expect (see the performance tips in the manual), so once you apply workarounds like type annotations, the more efficient version of your code ends up longer, too.

Recently I was comparing different possible ways of implementing something like your example function, so here you go:

f0(n) =
  function()
    n += 1
    n
  end

f1(n) =
  let n = n
    function()
      n += 1
      n
    end
  end

f2(n) =
  let n::Int = n
    function()
      n += 1
      n
    end
  end

incremented(n) = n + one(typeof(n))

f3(n) =
  let n::Int = n
    function()
      n = incremented(n)
      n
    end
  end

f4(n) =
  let n::Int = n
    function()
      let m = incremented(n)
        n = m
        m
      end
    end
  end

f5(n) =
  let n::Int = n
    function()
      let m::Int = incremented(n)
        n = m
        m
      end
    end
  end

# Same as f5, but with overridden effects analysis.
f6(n) =
  let n::Int = n
    Base.@assume_effects :nothrow :terminates_locally function()
      let m::Int = incremented(n)
        n = m
        m
      end
    end
  end

f7(n) =
  let n::Int = n
    Base.@assume_effects :nothrow :terminates_globally function()
      let m::Int = incremented(n)
        n = m
        m
      end
    end
  end

The conclusion was that f0 and f1 are slow, and the other forms, from f2 on, should mostly be equally fast.
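
If you want to reproduce that comparison, a rough sketch could look like the following (assuming BenchmarkTools.jl is installed; exact numbers will depend on hardware and Julia version):

using BenchmarkTools

# Build one counter per variant and time a single call; the slow forms
# (f0, f1) should report allocations from the hidden Core.Box, while the
# typed forms from f2 on should not.
for make_counter in (f0, f1, f2, f3, f4, f5, f6, f7)
    counter = make_counter(0)
    print(rpad(string(make_counter), 4))
    @btime $counter()
end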


two questions about your examples:

  • it seems your example functions don’t return the closure state, but the output of the inner function. Is this intended? Can you explain?
  • it seems none of the functions run anything when called; they just return the inner function. Is this intended?

I guess the answer to both questions is that you run the inner function multiple times and see that it doesn’t fail and indeed updates an inner state, which as such is the same as the one defined in the original functions f1, f2, …, only that you can no longer access it from there.


Yeah, each function returns a closure. Each closure “remembers” where it left off with its state, so it doesn’t return the same result each time:

julia> f = f2(0)
#7 (generic function with 1 method)

julia> f()
1

julia> f()
2

julia> f()
3

In Python this is written with the nonlocal declaration.

In [4]: def test_internal_state(st):
   ...:     def update_st():
   ...:         nonlocal st
   ...:         st = st + 1
   ...:     update_st()
   ...:     return st
   ...: 

In [5]: test_internal_state(1)
Out[5]: 2

I tested a bit further to see which version yields the best @code_warntype result and is the most performant.

It turns out that using a Ref is more efficient and even works without an explicit type annotation.

julia> test_closure = let n::Int=1
    function()
        n += 1
        n
    end
end
#1 (generic function with 1 method)

julia> @code_warntype test_closure()
MethodInstance for (::var"#1#2")()
from (::var"#1#2")() in Main at REPL[1]:2
Arguments
#self#::var"#1#2"
Locals
n@_2::Union{}
n@_3::Union{}
Body::Int64
1 ─ %1  = Core.getfield(#self#, :n)::Core.Box
│   %2  = Core.isdefined(%1, :contents)::Bool
└──       goto #3 if not %2
2 ─       goto #4
3 ─       Core.NewvarNode(:(n@_2))
└──       n@_2
4 ┄ %7  = Core.getfield(%1, :contents)::Any
│   %8  = Core.typeassert(%7, Main.Int)::Int64
│   %9  = (%8 + 1)::Int64
│   %10 = Core.getfield(#self#, :n)::Core.Box
│   %11 = Base.convert(Main.Int, %9)::Int64
│   %12 = Core.typeassert(%11, Main.Int)::Int64
│         Core.setfield!(%10, :contents, %12)
│   %14 = Core.getfield(#self#, :n)::Core.Box
│   %15 = Core.isdefined(%14, :contents)::Bool
└──       goto #6 if not %15
5 ─       goto #7
6 ─       Core.NewvarNode(:(n@_3))
└──       n@_3
7 ┄ %20 = Core.getfield(%14, :contents)::Any
│   %21 = Core.typeassert(%20, Main.Int)::Int64
└──       return %21


julia> test_closure2 = let n=Ref(1)
    function()
        n[] += 1
        n[]
    end
end
#3 (generic function with 1 method)

julia> @code_warntype test_closure2()
MethodInstance for (::var"#3#4"{Base.RefValue{Int64}})()
from (::var"#3#4")() in Main at REPL[3]:2
Arguments
#self#::var"#3#4"{Base.RefValue{Int64}}
Body::Int64
1 ─ %1 = Core.getfield(#self#, :n)::Base.RefValue{Int64}
│   %2 = Base.getindex(%1)::Int64
│   %3 = (%2 + 1)::Int64
│   %4 = Core.getfield(#self#, :n)::Base.RefValue{Int64}
│        Base.setindex!(%4, %3)
│   %6 = Core.getfield(#self#, :n)::Base.RefValue{Int64}
│   %7 = Base.getindex(%6)::Int64
└──      return %7

Here are the benchmarks, which show a slight improvement:

julia> using BenchmarkTools

julia> @benchmark test_closure()
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
Range (min … max):  18.787 ns … 969.627 ns  ┊ GC (min … max): 0.00% … 95.87%
Time  (median):     21.592 ns               ┊ GC (median):    0.00%
Time  (mean ± σ):   23.969 ns ±  16.653 ns  ┊ GC (mean ± σ):  1.64% ± 2.50%

[histogram omitted]
18.8 ns       Histogram: log(frequency) by time      48.3 ns <

Memory estimate: 32 bytes, allocs estimate: 2.

julia> BenchmarkTools.@benchmark test_closure2()
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max):  12.057 ns … 738.533 ns  ┊ GC (min … max): 0.00% … 97.76%
Time  (median):     13.562 ns               ┊ GC (median):    0.00%
Time  (mean ± σ):   15.110 ns ±  13.167 ns  ┊ GC (mean ± σ):  1.41% ± 1.69%

[histogram omitted]
12.1 ns       Histogram: log(frequency) by time      45.3 ns <

Memory estimate: 16 bytes, allocs estimate: 1.

It’s not so much that a Ref is special, but that the captured variable n is assigned only once prior to the closure’s instantiation and never reassigned afterward (the Ref instance is mutated instead). At least, that seems to be the rule implied by the “Performance of Captured Variables” section of the performance tips. The type inference is incredibly brittle; even throwing an n = n line into the closure will ruin it.
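
A quick way to check that rule is to look at the field type of the generated closure struct. This is just a sketch with made-up names (reads_only, also_writes); the point is that a single assignment to the captured variable inside the closure, even a no-op n = n, forces a Core.Box:

# Only reads the captured variable: the field stays a concrete Int64.
reads_only = let n = 1
    () -> n + 1
end

# Also assigns to the captured variable: the field becomes a Core.Box.
also_writes = let n = 1
    () -> (n = n; n + 1)
end

fieldtype(typeof(reads_only), :n)   # Int64
fieldtype(typeof(also_writes), :n)  # Core.Box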

The issue is that captured variables are implemented as fields of (immutable) structs, while the closure is implemented by a method of said struct; the explicit form of this is a functor. If a captured variable is assigned more than once prior to instantiation, the compiler bizarrely gives up on inferring the field even if the variable would be inferrable if not captured. If a captured variable is assigned after instantiation, including by the closure, then of course the field has to be something like a Ref to change the value. However, the compiler makes a Core.Box that knows nothing about the type of the variable, similar to a Ref{Any}. It makes some sense because the instantiation happens before the closure is called, so the type inference in the closure has not happened. Even worse, if a closure makes a captured variable uninferrable, it is uninferrable outside the closure too.
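
For intuition, a hand-written model of that lowering might look roughly like the following (all names are made up; the real closure struct is an anonymous hidden type, and the actual lowered code also inserts isdefined checks):

# The captured, reassigned variable becomes an untyped Core.Box field of an
# immutable struct, and the closure body is a method on that struct.
struct BoxedCounter
    n::Core.Box
end

function (c::BoxedCounter)()
    v = c.n.contents      # comes back as Any
    v += 1
    c.n.contents = v      # write back into the box
    v
end

boxed_counter(n) = BoxedCounter(Core.Box(n))

A counter built this way behaves like f0, with the boxing made explicit.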

This is a known issue, GitHub issue #15276 in fact. The thread suggests a compiler improvement would need a non-trivial rewrite of the lowering process, so for now we cope by refactoring to something that isn’t a closure (top-level methods, const global variables, callable objects), adding explicit type annotations to the captured variables, or inserting let blocks to satisfy the “assign just once beforehand” rule. Though I really don’t know whether such an improvement could fully deliver the expected type inference: we’d be expecting variables shared between two methods to be inferred when only one of them has been called to create the other, and it really does seem that we need explicit annotations like those in a callable object’s type.
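
As a concrete example of the callable-object workaround mentioned above (a minimal sketch; the name Counter is made up):

# The state lives in an explicitly typed field, so nothing depends on how
# closures are lowered.
mutable struct Counter
    n::Int
end

(c::Counter)() = (c.n += 1)

counter = Counter(0)
counter()  # 1
counter()  # 2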
