Calling f(a) = f(a) correctly gives StackOverflowError but not f(a...) = f(a)

I start with this:

VERSION == v"1.9.0"
f(a::Int64) = nothing

Now, If I define this method and call f(1.), I get a StackOverflowError. Makes sense.:

f(a) = f(a)
@code_lowered f(1.)
# CodeInfo(
# 1 ─ %1 = Main.f(a)
# └──      return %1
# )
f(1.) # gives StackOverflowError. Good

But instead, if I define this one and call f(1.), I don’t get the error. It seems to go into an infinite loop. Shouldn’t I get an error? The lowered code is exactly the same.

f(a...) = f(a)
@code_lowered f(1.)
# CodeInfo(
# 1 ─ %1 = Main.f(a)
# └──      return %1
# )
f(1.) # does not give StackOverflowError. Probably should?
1 Like

Note there is a difference. The latter definition packages the argument in a tuple.

2 Likes

Definitely doing something different, though I don’t understand why a StackOverflowError is never reached just from this. @julia_f_130 calling @julia_f_130 seems recursive enough, but @julia_g_150 is calling @j_g_152, I don’t know what’s up with that.

julia> f(a) = f(a);

julia> g(a...) = g(a);

julia> @code_llvm f(1.)
;  @ REPL[1]:1 within `f`
; Function Attrs: noreturn
define nonnull {}* @julia_f_130(double %0) #0 {
top:
  %1 = call nonnull {}* @julia_f_130(double %0) #0
  unreachable
}

julia> @code_llvm g(1.)
;  @ REPL[2]:1 within `g`
; Function Attrs: noreturn
define nonnull {}* @julia_g_150(double %0) #0 {
top:
  %a = alloca [1 x double], align 8
  %1 = getelementptr inbounds [1 x double], [1 x double]* %a, i64 0, i64 0
  store double %0, double* %1, align 8
  %2 = call nonnull {}* @j_g_152([1 x double]* nocapture readonly %a) #3
  call void @llvm.trap()
  unreachable
}

julia> methods(f)[1].specializations
svec(MethodInstance for f(::Float64), nothing, nothing, nothing, nothing, nothing, nothing, nothing)

julia> methods(g)[1].specializations # observed tuple-izing to a point
svec(MethodInstance for g(::Float64), MethodInstance for g(::Tuple{Float64}), MethodInstance for g(::Tuple{Tuple{Float64}}), MethodInstance for g(::Tuple{Any}), nothing, nothing, nothing, nothing)

It probably would stack overflow “eventually”, but it’d take a very very very long time and your computer might crash before it happens. The reason is that when you do the second one, the signature getting called is

f(::Float64)
f(::Tuple{Float64})
f(::Tuple{Tuple{Float64}})
f(::Tuple{Tuple{Tuple{Float64}}})
f(::Tuple{Tuple{Tuple{Tuple{Float64}}}})
f(::Tuple{Tuple{Tuple{Tuple{Tuple{Float64}}}}})
f(::Tuple{Tuple{Tuple{Tuple{Tuple{Tuple{Float64}}}}}})
f(::Tuple{Tuple{Tuple{Tuple{Tuple{Tuple{Tuple{Float64}}}}}}})
...

This means you’re compiling new methods, and creating new types at each step of the recursive loop. This just totally tortures the compiler and takesforever

4 Likes

I understand that it is falling into infinite recursion. I also understand that in f(a...) = f(a), the a in the right hand side is a Tuple. My concern is that it is not throwing an error.

As i said. It will eventually, but it’ll take a long time to get there, because the compiler gets called at every step.

1 Like

@Mason, your explanation of what’s happening in the case of f(a...) = f(a) makes sense. And I also get that it will error/give up eventually.

I guess with f(a) = f(a), the same f(::Float64) method gets called multiple times, and it’s easy to catch when that happens because it’s the same method. In the case of f(a) = f(a...), it is not the same method getting called multiple times, it’s a new method each time.

I suppose, unlike the first case, there is no good way to figure out early on that the second case is going into an infinite recursion?

1 Like

Oh I see, I think the confusion here is about how the error is reported. When you do f(x) = f(x), the recursion is so fast that you hit the stack overflow error immediately. This was not an error that was predicted by the compiler and thrown at compile time, it’s an error that was encountered at runtime when the call stack physically ran out of space.

Julia has something called a “call stack”. When you enter a function, the address of that function gets placed on the stack. When you exit the function, it gets removed.

julia> f(1)
ERROR: StackOverflowError:
Stacktrace:
 [1] f(x::Int64) (repeats 79984 times)
   @ Main ./REPL[27]:1

Notice the (repeats 79984 times) there. The callstack has enough space e.g. on my machine for 79984 recursive function calls before a stack overflow. calling f(x) = f(x) only takes a few nanoseconds, so the error is hit before I can even notice a delay.

But doing 79984 recursive calls where at each step, the compiler has to do even more work to create the type signature will take forever.

3 Likes

I’m on Julia 1.9.0, and I hit the error after 2 repeats for f(a) = f(a) (let’s call the other function g as per what @Benny wrote.) Which is why I thought there was a mechanism in place to catch infinite recursions early on.

ERROR: StackOverflowError:
Stacktrace:
 [1] f(a::Float64) (repeats 2 times)

Here’s a fun one.

julia> f(@nospecialize(a...)) = f(a)
f (generic function with 1 method)

julia> f(1.0)
Segmentation fault

There. It failed early. Happy?

2 Likes

that’s bizarre. I’d bet it’s a bug in the stacktrace printing or handling

1 Like

I would be happier if I understood what’s going on. I also get a StackOverflowError with what you suggest, not a Segmentation fault error.

I wish there was a way to specify my level of technical knowledge, maybe a badge on my profile icon or something, that communicates my level of technical expertise. So when people try to help me, they would know how much they need to dumb things down. I’m not just “New to Julia”, but “New To Computer Programming” as well.

In any case, I think enough smart people have looked at this issue and are not worried. So I’m not worried. And I learned a bit about what the two definitions are doing differently. Thanks.

To be clear. This is worrying. It should never happen if I’m not doing something unsafe.

I think it’s not particularly surprising. When you add the @nospecialize, it means julia won’t spend time compiling new method specializations, which speeds up the loop. This means you can get to larger nested tuple types, which are torture for the compiler and runtime. Eventually you create a type so big it crashes the runtime or compiler.

2 Likes

I’m not surprised, but I also don’t think a segmentation fault is the correct failure mode here. If the compiler cannot allocate more memory it should be able fail more gracefully.

Are you on MacOS? there’s various (unfortunately closed) issues in the Julia issue tracker (like Segmentation fault where stack overflow expected · Issue #21072 · JuliaLang/julia · GitHub) about getting segfaults instead of stack overflows in some situations on MacOS.

2 Likes

I’m surprised, I thought Varargs weren’t specialized automatically

Yes, I’m on a mac.