# Regression in v0.7 due to weird inference failure for function with Lisp-y recursion

Consider this example, where `foo` works out a type given some inputs of other types:

``````mutable struct A{E, L}
end

mutable struct B{E, L}
end

mutable struct C{E, L}
end

foo(::Type{T}, t, ts...) where {T} = foo(foo(T, t), ts...)
foo(::Type{A{E, L}}, ::B{E2,L2}) where {E,E2,L,L2} = A{E2,L}
foo(::Type{A{E, L}}, ::C{E2,L2}) where {E,E2,L,L2} = A{E2,L2}
``````

Let’s define some input objects of types `B` and `C`

``````b, c = B{Float64,2}(), C{Float64,2}()
``````

Under current master the return type of the following call fails to be inferred and is very slow

``````julia> @btime foo(A{Float64,0}, \$b, \$c, \$b, \$b)
258.510 ns (2 allocations: 32 bytes)
A{Float64,2}
``````

However, a minuscule change to `foo` makes is completely inferrable

``````foo(::Type{A{E, L}}, ::C{E2,L2}) where {E,E2,L,L2} = A{E2,L}
``````

which produces the following timings

``````julia> @btime foo(A{Float64,0}, \$b, \$c, \$b, \$b)
0.023 ns (0 allocations: 0 bytes)
A{Float64,0}
``````

I’ve been trying to figure out the reason all day today, and I cannot fathom the logic here. Something like this is killing the performance of my code. This problem arises in current master, but not in v0.6.1, where everything is nicely inferred and elided

``````julia> versioninfo()
Julia Version 0.7.0-DEV.4390
Commit 79c7bdd9ec (2018-02-26 07:59 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin17.4.0)
CPU: Intel(R) Core(TM) i7-5775R CPU @ 3.30GHz
WORD_SIZE: 64
LIBM: libopenlibm
Environment:
``````

Is this known? Should I open an issue for this?
[EDIT: corrected typos]

I fail to see the difference between the two `foo` versions. Also you don’t define `a`.

Oops, apologies @mauro3. Two typos corrected!

1 Like

By the way, the `mutable` part is completely irrelevant. The regression also happens with immutable `struct`s.

Interesting. I can reproduce.
Also worth pointing out that just like changing `foo` to treat `B` and `C` the same causes type stability, replacing the `c` in the function call with `b` is also type stable with the original foo.

``````julia> @code_warntype foo(A{Float64,0}, b, c, b, b)
Variables:
t<optimized out>
ts::Tuple{C{Float64,2},B{Float64,2},B{Float64,2}}

Body:
begin
(Core.getfield)(ts::Tuple{C{Float64,2},B{Float64,2},B{Float64,2}}, 1)::C{Float64,2}
Core.SSAValue(4) = (Core.getfield)(ts::Tuple{C{Float64,2},B{Float64,2},B{Float64,2}}, 2)::B{Float64,2}
Core.SSAValue(5) = (Core.getfield)(ts::Tuple{C{Float64,2},B{Float64,2},B{Float64,2}}, 3)::B{Float64,2}
# meta: location REPL[4] foo 1
Core.SSAValue(9) = \$(Expr(:invoke, MethodInstance for foo(::Type{A{Float64,2}}, ::B{Float64,2}, ::B{Float64,2}, ::Vararg{B{Float64,2},N} where N), :(Main.foo), A{Float64,2}, Core.SSAValue(4), Core.SSAValue(5)))::Type{A{Float64,_1}} where _1
# meta: pop location
return Core.SSAValue(9)
end::Type{A{Float64,_1}} where _1

julia> @code_warntype foo(A{Float64,0}, b, b, b, b)
Variables:
t<optimized out>
ts::Tuple{B{Float64,2},B{Float64,2},B{Float64,2}}

Body:
begin
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},B{Float64,2}}, 1)::B{Float64,2}
Core.SSAValue(4) = (Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},B{Float64,2}}, 2)::B{Float64,2}
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},B{Float64,2}}, 3)::B{Float64,2}
return A{Float64,0}
end::Type{A{Float64,0}}
``````

Curiously, switching the position of c to anywhere but 2nd was type stable:

``````julia> @code_warntype foo(A{Float64,0}, c, b, b, b)
Variables:
t<optimized out>
ts::Tuple{B{Float64,2},B{Float64,2},B{Float64,2}}

Body:
begin
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},B{Float64,2}}, 1)::B{Float64,2}
Core.SSAValue(4) = (Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},B{Float64,2}}, 2)::B{Float64,2}
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},B{Float64,2}}, 3)::B{Float64,2}
return A{Float64,2}
end::Type{A{Float64,2}}

julia> @code_warntype foo(A{Float64,0}, b, c, b, b)
Variables:
t<optimized out>
ts::Tuple{C{Float64,2},B{Float64,2},B{Float64,2}}

Body:
begin
(Core.getfield)(ts::Tuple{C{Float64,2},B{Float64,2},B{Float64,2}}, 1)::C{Float64,2}
Core.SSAValue(4) = (Core.getfield)(ts::Tuple{C{Float64,2},B{Float64,2},B{Float64,2}}, 2)::B{Float64,2}
Core.SSAValue(5) = (Core.getfield)(ts::Tuple{C{Float64,2},B{Float64,2},B{Float64,2}}, 3)::B{Float64,2}
# meta: location REPL[4] foo 1
Core.SSAValue(9) = \$(Expr(:invoke, MethodInstance for foo(::Type{A{Float64,2}}, ::B{Float64,2}, ::B{Float64,2}, ::Vararg{B{Float64,2},N} where N), :(Main.foo), A{Float64,2}, Core.SSAValue(4), Core.SSAValue(5)))::Type{A{Float64,_1}} where _1
# meta: pop location
return Core.SSAValue(9)
end::Type{A{Float64,_1}} where _1

julia> @code_warntype foo(A{Float64,0}, b, b, c, b)
Variables:
t<optimized out>
ts::Tuple{B{Float64,2},C{Float64,2},B{Float64,2}}

Body:
begin
(Core.getfield)(ts::Tuple{B{Float64,2},C{Float64,2},B{Float64,2}}, 1)::B{Float64,2}
Core.SSAValue(4) = (Core.getfield)(ts::Tuple{B{Float64,2},C{Float64,2},B{Float64,2}}, 2)::C{Float64,2}
(Core.getfield)(ts::Tuple{B{Float64,2},C{Float64,2},B{Float64,2}}, 3)::B{Float64,2}
return A{Float64,2}
end::Type{A{Float64,2}}

julia> @code_warntype foo(A{Float64,0}, b, b, b, c)
Variables:
t<optimized out>
ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2}}

Body:
begin
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2}}, 1)::B{Float64,2}
Core.SSAValue(4) = (Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2}}, 2)::B{Float64,2}
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2}}, 3)::C{Float64,2}
return A{Float64,2}
end::Type{A{Float64,2}}
``````

I’ve encountered a problem before where incrementally calling parametric functions made them all type stable, but when jumping ahead they are not. However, I started over in a new session and called them in a different order, and that doesn’t seem to be going on here.
I once again got that specifically the order `b, c, b, b` is not inferable, but `c` in the other three locations is.

Throwing extra `b`s on the end makes it switch back and fourth:

``````julia> @code_warntype foo(A{Float64,0}, b, b, b, c, b)
Variables:
t<optimized out>
ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2}}

Body:
begin
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2}}, 1)::B{Float64,2}
Core.SSAValue(4) = (Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2}}, 2)::B{Float64,2}
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2}}, 3)::C{Float64,2}
Core.SSAValue(6) = (Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2}}, 4)::B{Float64,2}
return A{Float64,2}
end::Type{A{Float64,2}}

julia> @code_warntype foo(A{Float64,0}, b, b, b, c, b, b)
Variables:
t<optimized out>
ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2}}

Body:
begin
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2}}, 1)::B{Float64,2}
Core.SSAValue(4) = (Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2}}, 2)::B{Float64,2}
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2}}, 3)::C{Float64,2}
Core.SSAValue(6) = (Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2}}, 4)::B{Float64,2}
Core.SSAValue(7) = (Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2}}, 5)::B{Float64,2}
return A{Float64,2}
end::Type{A{Float64,_1}} where _1

julia> @code_warntype foo(A{Float64,0}, b, b, b, c, b, b, b)
Variables:
t<optimized out>
ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2},B{Float64,2}}

Body:
begin
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2},B{Float64,2}}, 1)::B{Float64,2}
Core.SSAValue(4) = (Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2},B{Float64,2}}, 2)::B{Float64,2}
Core.SSAValue(5) = (Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2},B{Float64,2}}, 3)::C{Float64,2}
Core.SSAValue(6) = (Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2},B{Float64,2}}, 4)::B{Float64,2}
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2},B{Float64,2}}, 5)::B{Float64,2}
(Core.getfield)(ts::Tuple{B{Float64,2},B{Float64,2},C{Float64,2},B{Float64,2},B{Float64,2},B{Float64,2}}, 6)::B{Float64,2}
return A{Float64,2}
end::Type{A{Float64,2}}
``````

I have no idea what is going on.

Thanks @Elrod for reproducing. This matches my observations, inference goes a bit crazy here. I’ll wait for a bit and if nothing comes up I’ll open an issue.

Interesting! Jameson just commented on the Github issue above that this behaviour is by design. The compiler cannot demonstrate that the first recursive version of `foo` actually terminates and doesn’t enter into a recursion loop. Hence it bails out trying to infer its output type. If it didn’t do so we couldn’t have interprocedural constant propagation optimizations. Consequently, in v0.7+ one needs to be careful with this coding pattern of recursive methods. It is unfortunate, I find it rather elegant and powerful, and had gotten used to it from v0.6 and before. It does work very well if it is simple enough, for example

``````@inline tuplejoin(x) = x
@inline tuplejoin(x, y) = (x..., y...)
@inline tuplejoin(x, y, z...) = (x..., tuplejoin(y, z...)...)
``````

but it becomes quickly unworkable as one makes it more complex. You have been warned!

2 Likes

For completeness, I’ve been trying to find a tweak that makes the original recursive pattern workable. For some arcane reason, the following tweak, while still type-unstable and not-eliding, no longer allocates and recovers excellent performance

``````struct A{E, L}
end

struct B{E, L}
end

struct C{E, L}
end

foo(::Type{T}, t1, ts::Vararg{<:Any,N}) where {T, N} = foo(foo(T, t1), ts...)
foo(::Type{A{E, L}}, ::B{E2,L2}) where {E,E2,L,L2} = A{E2,L}
foo(::Type{A{E, L}}, ::C{E2,L2}) where {E,E2,L,L2} = A{E2,L2}
``````

Note the `Vararg{<:Any, N}`. That’s the key for some reason. Now:

``````julia> b, c = B{Float64,2}(), C{Float64,2}()
(B{Float64,2}(), C{Float64,2}())

julia> @btime foo(A{Float64,0}, \$b, \$c, \$b, \$b)
5.830 ns (0 allocations: 0 bytes)
A{Float64,2}
``````

As you increase the number of `b` and `c` parameters, the execution time grows, but only mildly, in contrast to the original example which exploded. This is so whatever the input pattern, as far as I can tell. In other words, this version performs similarly to the `tuplejoin` example above, despite being type-unstable! It seems to be a good solution for even more complex recursions of this kind, also with more than three types `A`, `B`, `C` and pairwise rules.

Any light on why `ts::Vararg{<:Any,N}` is so different than `ts...` would be very welcome!

FWIW, you might “solve” these problems by putting `@generated` in front of `foo` (at least, that’s my go-to strategy on 0.6). This makes the compiler run the recursion at compilation-time.

Good one! Indeed, that works.

``````@generated foo(::Type{T}, t, ts...) where {T} = :(foo(foo(T, t), ts...))
@generated foo(::Type{A{E, L}}, ::B{E2,L2}) where {E,E2,L,L2} = :(A{E2,L})
@generated foo(::Type{A{E, L}}, ::C{E2,L2}) where {E,E2,L,L2} = :(A{E2,L2})
``````
``````julia> @btime foo(A{Float64,0}, \$b, \$c, \$b, \$b)
1.873 ns (0 allocations: 0 bytes)
A{Float64,2}
``````

At some point in the past I loosely recall there was a recommendation to do things without generated functions if possible, but I don’t remember the reasons (compilation times, precompilation perhaps?). Do I remember correctly? If so, does the reason still apply in the new 1.0 world?

By the way, interestingly, the `@code_native` output using your `@generated` approach and the `ts::Vararg{<:Any,N}` approach is identical, and looks very compact to my untrained eye

``````julia> @code_native foo(A{Float64,0}, b, c, b, b)
.section	__TEXT,__text,regular,pure_instructions
; Function foo {
; Location: REPL[4]:1
subq	\$40, %rsp
movq	%rsi, 32(%rsp)
; Function @generated body; {
; Location: REPL[4]:1
; Function foo; {
; Location: REPL[4]:1
; Function @generated body; {
; Location: REPL[4]:1
movabsq	\$4546854976, %rax       ## imm = 0x10F038040
movq	%rax, (%rsp)
movabsq	\$4563281216, %rax       ## imm = 0x10FFE2540
movq	%rax, 8(%rsp)
movabsq	\$4546855192, %rax       ## imm = 0x10F038118
movq	%rax, 16(%rsp)
movq	%rax, 24(%rsp)
movabsq	\$jl_invoke, %rax
movabsq	\$4558629776, %rdi       ## imm = 0x10FB72B90
leaq	(%rsp), %rsi
movl	\$4, %edx
callq	*%rax
;}}
retq
;}}
``````

It’s interesting that it worked that way, but it’s not what I meant. I meant literally just putting `@generated`, without quoting the expression.

``````@generated foo(::Type{T}, t, ts...) where {T} = foo(foo(T, t), ts...)
...
``````

Then the generated body of `foo` becomes a constant (i.e. the computed type).

1 Like

Mmm, nope, that does not work for me. `foo(A{Float64,0}, b, c, b, b)` just hangs if I don’t quote the result (master and 0.6.2)… I thought generated functions where supposed to return a quoted expression, right?

Yes, but you can returned a quoted constant. Typically this just indicates you are doing something inappropriate for generated function usage (unsound). For similar reasons, it’s often a bad idea to have a generated function call itself (it’s bad enough that inference already might get run recursively inside it and think that it might call itself, resulting in a StackOverflow and crash, but so far we’ve usually managed to avoid that).

2 Likes