Function definition inside let-block impacting performance?

uniment · February 12, 2024, 4:05am

Sure, but in this context I was referring specifically to a boxed capture.

In order for a box to be type-stable, lowering must type-parameterize the box—therefore it needs to know all possible types that a could have during the closure’s lifetime, and therefore it needs the return types of all function calls which a is assigned to.

For example:

julia> f, g = let a = 0.5
           _f() = (a = a * 2; a)
           _g() = (a = a / 2; a)
           _f, _g
       end
(var"#_f#3"(Core.Box(0.5)), var"#_g#4"(Core.Box(0.5)))

Lowering can see all three syntactical assignments to a, although it has no idea how often the latter two may occur. If it knew that the return types of *(::Float64, ::Int) and /(::Float64, ::Int) are Float64 so the type of a is always Float64, it would be possible for lowering to type-parameterize a’s Box with Float64.

If lowering were rewritten in Julia, it would have access to this information through calls to Core.Compiler.return_type and typejoin, but since it’s still in Lisp my understanding is that it can’t. (I’ve previously shared ideas for it to emit Julia that will perform this function; maybe I should write a macro implementing those ideas instead of trying to convince people.)

Self-recursive closures, where their identifier is never reassigned, are as efficient as singletons since they don’t need to capture themselves; they can simply refer to themselves. It’s only for mutually recursive functions, which must otherwise capture each other, where it’s very attractive to check if they’re singleton.

True, but (sidenote) it can be improved by type-parameterizing Core.Box. Currently, declaring the type gets lowered to type-assertions whenever the boxed value is get or set, but I have shown that this is less efficient than type-parameterizing it.

Maybe we should take stock of the ideas we’ve discussed so far, as we’ve begun talking past each other a bit due to me floating several ideas simultaneously.

Idea 1: Locally Defined Singletons

If a locally defined function has no captures, then lowering can tell that it is singleton. (It doesn’t need to capture itself if its local identifier is never reassigned.) All references to its local identifier can be replaced with references to its global singleton.

Example:

# user code:
let;  fib(n) = n ≤ 1 ? n : fib(n-1)+fib(n-2)  end

# currently lowers to approximately:
struct var"#fib#3"  fib::Core.Box  end
(fib::var"#fib#3")(n) = n ≤ 1 ? n : fib.fib.contents(n-1)+fib.fib.contents(n-2)

# would instead lower to approximately:
struct var"#fib#3" end
(::var"#fib#3")(n) = n ≤ 1 ? n : var"#fib#3"()(n-1)+var"#fib#3"()(n-2)

Note that this idea helps for self-recursive singletons and mutually-recursive singletons, but not for self-recursive closures nor mutually-recursive closures.

Idea 2: Self-recursive closures

If a locally defined function references itself, and if its identifier is never reassigned, we can lower its function definition to refer to itself, and we can lower its struct definition not to capture itself.

Example:

# user code:
let;  fib(n) = n ≤ 1 ? n : fib(n-1)+fib(n-2)  end

# would instead lower to approximately:
struct var"#fib#3" end
(fib::var"#fib#3")(n) = n ≤ 1 ? n : fib(n-1)+fib(n-2)

Note that this idea helps for self-recursive singletons and and self-recursive closures, but not for mutually-recursive singletons nor mutually-recursive closures.

Idea 3: Mutually-recursive closures

If two (or more) locally defined closures are mutually recursive, then they must capture each other in a type-stable manner.

I showed how this could be type-stable and reasonably performant, for example:

# user code:
let
    iseven(n) = n == 0 ? true  : isodd(n-1)
    isodd(n)  = n == 0 ? false : iseven(n-1)
end

# would lower to approximately:
struct var"#iseven#4"{T}  isodd::Box{T}  end
struct var"#isodd#5"      iseven::var"#iseven#4"{var"#isodd#5"}  end
(iseven::var"#iseven#4")(n) = n == 0 ? true  : iseven.isodd.contents(n-1)
(isodd::var"#isodd#5")(n)   = n == 0 ? false : isodd.iseven(n-1)

However, now that I think about it again, it only works for functions that could have been singleton—functions that capture nothing but each other. For example, if isodd had type parameters because it captured additional variables, it wouldn’t be possible for iseven’s type parameter to describe it.

So it looks like we run into a hard limit: it is currently impossible to make performant mutually recursive closures. I have opened an issue to probe if it’s possible to make circularly-parameterized types to address this.

With more creative ordering, we can make closures capture each other alongside other values:

Example:

struct IsEven{X,IO}  
    x::X
    isodd::IO
end
mutable struct IsOdd{X,Y}
    const y::Y
    iseven::IsEven{X,IsOdd{X,Y}} 
    IsOdd{X,Y}(x) where {X,Y} = new{X,Y}(x)
end
(iseven::IsEven)(n::Int) = n == iseven.x ? true  : iseven.isodd(n-1)
(isodd ::IsOdd )(n::Int) = n == isodd.y  ? false : isodd.iseven(n-1)

x, y = 0, 0
X, Y = typeof.((x, y))
isodd = IsOdd{X,Y}(0)
iseven = IsEven(0, isodd)
isodd.iseven = iseven

performance:

julia> @btime $iseven(20) 
  59.063 ns (0 allocations: 0 bytes)
true

This is … different … because the closures aren’t actually capturing each other, instead they are continually reinstantiating each other from their own internal state. If it is decided that all the mutually-recursive closures must have the same fields, then I guess this is okay.

Compared to having the mutually-recursive closures capture each other, having them re-instantiate each other is faster:

julia> @btime $is_even(20)  # Benny's idea
  20.441 ns (0 allocations: 0 bytes)
true

For the sake of memory, perhaps it’s best for mutually-recursive closures to make a single struct which captures all mutually-recursive closures’ fields, and to have all the functions capture that one struct.

struct IsevenIsoddStorage{A,B}  a::A; b::B  end
struct Iseven{S}<:Function;  store::S  end
struct Isodd{S}<:Function;   store::S  end
(is_even::Iseven)(n::Int) = n == is_even.store.a ? true :  Isodd(is_even.store)(n - 1)
(is_odd::Isodd)(n::Int)   = n == is_odd.store.b  ? false : Iseven(is_odd.store)(n - 1)

store = IsevenIsoddStorage(0,0)
is_even = Iseven(store)
is_odd = Isodd(store)
is_even(8), is_odd(8)

Note that this idea would help for mutually-recursive closures and mutually-recursive singletons.

Topic		Replies	Views
Performance comparison, Julia is a slower than Mojo, am I doing something wrong? Performance question , mojo	44	8007	September 28, 2023
Significant difference in execution speed between similar functions with nested loops Performance question	4	247	March 8, 2023
Huge difference in performance between two programmes Performance	7	507	April 27, 2021
Are let blocks as fast as functions? Performance	11	1546	June 29, 2020
10x slowdown when passing function as argument Performance	15	2270	June 4, 2020

Function definition inside let-block impacting performance?

Related topics