# Compiler recursion limit

I’ve written some recursive code that compiles away due to (I believe) constant propagation. However, I’m afraid that if I run the code on a larger problem I will hit a compiler recursion limit and it will no longer compile away. Does anyone have a rule of thumb for how deep your code can recurse before compiler inference gives up and infers `Any`?

I know, I know, you probably want to see a MWE. I do have one, but it’s in a repo that is currently private, so I’d prefer not to share it at this time. If there is no precise number, like a `JULIA_INFERENCE_RECURSION_LIMIT`, a rule of thumb will do.

3 Likes

Maybe a very simple test case can help answer?

``````
julia> using Test

julia> f(::Val{0}) = 0
f (generic function with 1 method)

julia> f(::Val{N}) where {N} = f(Val{N-1}())
f (generic function with 2 methods)

julia> @inferred f(Val(10))
0

julia> @inferred f(Val(100))
0

julia> @inferred f(Val(1000))
Internal error: stack overflow in type inference of f(Base.Val{1000}).
This might be caused by recursion over very long tuples or argument lists.
0
``````

However, I assume the compiler behavior is more complex and typically depends on the actual function involved.

4 Likes

Ok, here’s my example. I wrote a function `topo_sort` that does a topological sort on a tuple of singleton types. There is a DAG relationship among the types, which is defined by overloads of a `children` function.

Here’s the implementation:

``````
function in_tuple(x::S, t::Tuple{T, Vararg}) where {S, T}
    S === T || in_tuple(x, Base.tail(t))
end

in_tuple(::T, t::Tuple{T, Vararg}) where {T} = true
in_tuple(_, t::Tuple{}) = false

function visit_node(order::Tuple, x)
    in_tuple(x, order) && return order
    new_order = visit_children(order, children(x))
    (x, new_order...)
end

function visit_children(order::Tuple, children::Tuple)
    new_order = visit_node(order, children[1])
    visit_children(new_order, Base.tail(children))
end

visit_children(order::Tuple, ::Tuple{}) = order

topo_sort(to_visit::Tuple) = _topo_sort((), to_visit)

function _topo_sort(order::Tuple, to_visit::Tuple)
    new_order = visit_node(order, to_visit[1])
    _topo_sort(new_order, Base.tail(to_visit))
end

_topo_sort(order::Tuple, unvisited::Tuple{}) = order
``````

I’ve come up with a fairly simple example DAG that sometimes compiles away and sometimes does not, depending on the order that I provide the singleton types to the `topo_sort` function. Here’s the DAG:

``````
struct A end
struct B end
struct C end
struct D end
struct E end
struct F end
struct G end

children(::A) = (B(), C())
children(::B) = (D(),)
children(::C) = (D(),)
children(::D) = (E(),)
children(::E) = (F(),)
children(::F) = (G(),)
children(::G) = ()
``````

And here we can see that sometimes `topo_sort` compiles away, and sometimes it does not:

``````
julia> using Test

julia> @code_typed topo_sort((A(), B(), C(), D(), E(), F(), G()))
CodeInfo(
1 ─     return (A(), C(), B(), D(), E(), F(), G())
) => Tuple{A, C, B, D, E, F, G}

julia> @inferred topo_sort((A(), B(), C(), D(), E(), F(), G()))
(A(), C(), B(), D(), E(), F(), G())

julia> @code_typed topo_sort((C(), F(), B(), A(), D(), E(), G()))
CodeInfo(
1 ─ %1 = invoke Main._topo_sort((C(), D(), E(), F(), G())::Tuple{C, D, E, F, G}, (B(), A(), D(), E(), G())::Tuple{B, A, D, E, G})::Tuple{Any, Vararg{Any}}
└──      return %1
) => Tuple{Any, Vararg{Any}}

julia> @inferred topo_sort((C(), F(), B(), A(), D(), E(), G()))
ERROR: return type Tuple{A, B, C, D, E, F, G} does not match inferred return type Tuple{Any, Vararg{Any}}
``````

Is there any way my implementation can be improved so that it is more likely to constant propagate in a wider variety of situations?

I don’t know if this is acceptable in your situation, but if you use `@assume_effects`,

``````
Base.@assume_effects :total topo_sort(to_visit::Tuple) = _topo_sort((), to_visit)
``````

then inference works:

``````
julia> @inferred topo_sort((C(), F(), B(), A(), D(), E(), G()))
(A(), B(), C(), D(), E(), F(), G())
``````
1 Like

For recursive tuple code like this, my rule of thumb is that the arguments should be decreasing in complexity on each recursive step. In other words, the red flag here is that (I think) your `new_order` is getting longer, whereas you want things that get shorter (like `children`).

I’m not sure how you’d refactor this to achieve that, unfortunately.
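As a minimal illustration of that rule of thumb (a toy example with a hypothetical name, not code from the thread): a recursion whose tuple argument only shrinks on each step tends to infer cleanly, because inference can see the recursion bottoming out.

``````julia
# Hypothetical example of the "arguments should shrink" rule of thumb:
# the tuple loses one element per call, so inference can prove the
# recursion terminates at the Tuple{} method.
shrink_sum(t::Tuple) = t[1] + shrink_sum(Base.tail(t))
shrink_sum(::Tuple{}) = 0
``````

By contrast, an accumulator like `new_order` that grows on each step hands inference ever-larger tuple types to track.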

6 Likes

The compiler won’t stop inferring recursive types, and the inference process currently appears to be linear in the depth of the type. That would make complete inference quadratic in the depth of the actual recursive structure, which you don’t want: you would end up with a complete copy of the tree structure inhabiting the type system, very slowly, and to no benefit.

The available solutions depend on the details of the type, but revolve around partially-parameterizing at some point in the process, or insisting on `Any`.
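As a hedged sketch of the "insisting on `Any`" option (my own toy example, not from the thread): `Base.inferencebarrier` hides a value’s type from inference, so a recursive structure is walked at runtime without being mirrored in the type domain.

``````julia
# Hypothetical example: count the nesting depth of tuples like (((),),).
# Base.inferencebarrier makes inference see the argument as Any, so the
# recursive tuple structure is never tracked in the type system.
count_depth(x) = x === () ? 0 : 1 + count_depth(Base.inferencebarrier(x[1]))
``````

The trade-off is that the result type is no longer inferred precisely, which is exactly the "insisting on `Any`" cost mentioned above.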

Here’s an interesting related example. Suppose we already have a tuple `order` that contains a topological sort of singleton structs, and we have a smaller tuple `xs` with a subset of the full topological sort (with no duplicates). However, the elements in `xs` are not topologically sorted. What we would like to do is return a new tuple where the elements of `xs` are topologically sorted. We can do this with the following recursive code.

Implementation:

``````
function in_tuple(x::S, t::Tuple{T, Vararg}) where {S, T}
    S === T || in_tuple(x, Base.tail(t))
end

in_tuple(::T, t::Tuple{T, Vararg}) where {T} = true
in_tuple(_, t::Tuple{}) = false

Base.@assume_effects :total function tuple_topo_sort(order, match, xs)
    if length(match) == length(xs)
        return match
    end

    first = order[1]

    if in_tuple(first, xs)
        tuple_topo_sort(Base.tail(order), (match..., first), xs)
    else
        tuple_topo_sort(Base.tail(order), match, xs)
    end
end
``````

Test case:

``````
using Test
using BenchmarkTools

for i in 1:100
    sym = Symbol("A$i")
    @eval struct $sym end
end

order_vec = map(1:100) do i
    sym = Symbol("A$i")
    eval(:($sym()))
end

order = tuple(order_vec...)
``````

The `tuple_topo_sort` function is inferred for this example:

``````
julia> @inferred tuple_topo_sort(order, (), (A75(), A50(), A25()))
(A25(), A50(), A75())
``````

However, it does not completely compile away:

``````
julia> @code_typed tuple_topo_sort(order, (), (A75(), A50(), A25()))
CodeInfo(
1 ─     nothing::Nothing
│       nothing::Nothing
│       Core._apply_iterate(Base.iterate, Base.argtail, order)::Tuple{A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, A23, A24, A25, A26, A27, A28, A29, A30, A31, A32, A33, A34, A35, A36, A37, A38, A39, A40, A41, A42, A43, A44, A45, A46, A47, A48, A49, A50, A51, A52, A53, A54, A55, A56, A57, A58, A59, A60, A61, A62, A63, A64, A65, A66, A67, A68, A69, A70, A71, A72, A73, A74, A75, A76, A77, A78, A79, A80, A81, A82, A83, A84, A85, A86, A87, A88, A89, A90, A91, A92, A93, A94, A95, A96, A97, A98, A99, A100}
└──     return (A25(), A50(), A75())
) => Tuple{A25, A50, A75}
``````

We can see a `Core._apply_iterate` in there, and we can see its effect on the runtime of `tuple_topo_sort`:

``````
julia> @btime tuple_topo_sort(order, (), (A75(), A50(), A25()));
3.030 μs (0 allocations: 0 bytes)
``````

Any ideas on how to get this case to completely compile away?

I may look more deeply into this later, but I suggest you try what I did in TupleSorting.jl. In `src/TupleUtils.jl` there’s a short `tup_tail` method which is a replacement for `Base.tail` based on `ntuple` instead of on `argtail`.

You might also be interested to see the implementation of `ntuple(::Any, ::Val)`, which is basically this (look in `base/ntuple.jl` in the Julia source); I slightly simplified it by removing the `@generated` branch:

``````
function ntuple(f::F, ::Val{N}) where {F, N}
    N::Int
    (N >= 0) || throw(ArgumentError(LazyString("tuple length should be ≥ 0, got ", N)))
    Tuple(f(i) for i = 1:N)
end
``````

Side note: it’s a bit annoying to see all the trade-offs that Julia Base and the Julia ecosystem have to make between compilation cost, sysimage size, etc. on one side and runtime performance on the other. There should be a way to let users opt into maximum runtime performance while still using largely the same code paths.

3 Likes

Awesome, that worked! Thanks, @nsajko. Here’s the complete script:

Code
``````
using Test
using BenchmarkTools

function tail(x::Tuple{Any, Vararg{Any, N}}) where {N}
    f = let x = x
        i -> x[i + 1]
    end
    ntuple(f, Val(N))::NTuple{N, Any}
end

function in_tuple(x::S, t::Tuple{T, Vararg}) where {S, T}
    S === T || in_tuple(x, tail(t))
end

in_tuple(::T, t::Tuple{T, Vararg}) where {T} = true
in_tuple(_, t::Tuple{}) = false

Base.@assume_effects :total function tuple_topo_sort(order, match, xs)
    if length(match) == length(xs)
        return match
    end

    first = order[1]

    if in_tuple(first, xs)
        tuple_topo_sort(tail(order), (match..., first), xs)
    else
        tuple_topo_sort(tail(order), match, xs)
    end
end

for i in 1:100
    sym = Symbol("A$i")
    @eval struct $sym end
end

order_vec = map(1:100) do i
    sym = Symbol("A$i")
    eval(:($sym()))
end

order = tuple(order_vec...)
xs = (A75(), A50(), A25())
``````

Testing it in the REPL:

``````
julia> @inferred tuple_topo_sort(order, (), xs)
(A25(), A50(), A75())

julia> @code_typed tuple_topo_sort(order, (), xs)
CodeInfo(
1 ─     nothing::Nothing
│       nothing::Nothing
└──     return (A25(), A50(), A75())
) => Tuple{A25, A50, A75}

julia> @btime tuple_topo_sort($order, (), $xs);
1.432 ns (0 allocations: 0 bytes)

julia> bar(::Tuple{A25, A50, A75}) = 42;

julia> foo(order, xs) = bar(tuple_topo_sort(order, (), xs));

julia> @code_typed foo(order, xs)
CodeInfo(
1 ─     return 42
) => Int64

julia> @btime foo($order, $xs);
1.432 ns (0 allocations: 0 bytes)
``````

(Regarding the original topological sort, I realized that for my problem I can just keep an `order` tuple around and when new structs are added to the DAG I can insert them into the correct location in `order`, and that operation is not performance sensitive.)

2 Likes

Any recursion will blow the stack in the end. Is that how you noticed, with a segfault or something? You can change the stack size with `ulimit -s` (not sure how on Windows; from within a process with `setrlimit`, which may be cross-platform).

I didn’t know Python had this limitation and this solution:

> Increasing the recursion limit: Python has a default maximum recursion depth of 1000. If a function exceeds this limit, it can be increased using the `sys.setrecursionlimit(n)` function.

I doubt Julia has any specific limitations beyond what the hardware, or rather the OS, gives you. At least not in general, though maybe for compilation specifically (but why would it?).
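For what it’s worth, in Julia runtime recursion that exhausts the stack raises a catchable `StackOverflowError` rather than segfaulting. A quick check (hypothetical function name):

``````julia
# Unbounded runtime recursion throws StackOverflowError, which can be
# caught like any other exception.
blowup(n) = blowup(n + 1)

caught = try
    blowup(1)
    false
catch e
    e isa StackOverflowError
end
``````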

Interesting. Julia scales badly (because of constant propagation, if I recall correctly) on this benchmark from my friend Per: GitHub - nordlow/compiler-benchmark: Benchmarks compilation speeds of different combinations of languages and compilers.

The code and benchmark there are highly unusual, written just to exercise the compiler rather than as real-world code, so I (and others) don’t care enough about it. I believe the issue would go away at `-O0` if that really meant no optimizations, as it does for other languages. I understand constant propagation is the only optimization (or one of the few) performed at `-O0`.

The Julia compiler does eventually give up on inference for recursive and mutually recursive functions because otherwise the compile times could blow up. However, as @matthias314 pointed out, careful use of `Base.@assume_effects` can convince the compiler to keep inferring. I haven’t played around with the different settings yet, but my hypothesis is that the `:terminates_globally` setting by itself might be enough to get the compiler to completely infer a recursive function.
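Untested beyond a small case, but here is the shape of that hypothesis, applied to the `Val` recursion from earlier in the thread (with a fresh name `g` to avoid clashing with the earlier `f`; whether `:terminates_globally` alone lifts the recursion limit at large `N` is exactly the open question, not something verified here):

``````julia
# Sketch: assert only termination (not :total) and see whether that is
# enough to keep inference going on a deeply recursive call chain.
g(::Val{0}) = 0
Base.@assume_effects :terminates_globally g(::Val{N}) where {N} = g(Val(N - 1))
``````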

2 Likes