Type unstable function because same variable name used twice

Hey,

I cannot come up with a minimal example yet, but I found that my function's return value was type-unstable because of a construct like this:

function foo(x)
    # some code

    y = (1, 2)   # first 
    y = [x[1] for i in x]   # second
   
    # more code
    return g(y)
end

The first y has a different type than the second y. Deleting the first y resulted in perfectly type-stable code; leaving it there resulted in type-unstable code (at least @code_warntype warned me).

So I was wondering why the first y causes difficulties. It was only there by mistake and is never used, so the function output is of course the same with or without it.

I can come up with similar code. It does not lose the output type completely, but the type of y is inferred as a union:

julia> function f(x::AbstractArray{T, N}) where {T, N}
           y = (1, 2)   
           y = [x[1] for i in x]
           return y
       end
f (generic function with 1 method)

julia> @code_warntype f([1,2,3])
Variables
  #self#::Core.Const(f)
  x::Vector{Int64}
  #12::var"#12#13"{Vector{Int64}}
  y::Union{Tuple{Int64, Int64}, Vector{Int64}}

Body::Vector{Int64}
1 ─      (y = Core.tuple(1, 2))
│   %2 = Main.:(var"#12#13")::Core.Const(var"#12#13")
│   %3 = Core.typeof(x)::Core.Const(Vector{Int64})
│   %4 = Core.apply_type(%2, %3)::Core.Const(var"#12#13"{Vector{Int64}})
│        (#12 = %new(%4, x))
│   %6 = #12::var"#12#13"{Vector{Int64}}
│   %7 = Base.Generator(%6, x)::Base.Generator{Vector{Int64}, var"#12#13"{Vector{Int64}}}
│        (y = Base.collect(%7))
└──      return y::Vector{Int64}

julia> function f(x::AbstractArray{T, N}) where {T, N}
           #y = (1, 2)   
           y = [x[1] for i in x]
           return y
       end
f (generic function with 1 method)

julia> @code_warntype f([1,2,3])
Variables
  #self#::Core.Const(f)
  x::Vector{Int64}
  #14::var"#14#15"{Vector{Int64}}
  y::Vector{Int64}

Body::Vector{Int64}
1 ─ %1 = Main.:(var"#14#15")::Core.Const(var"#14#15")
│   %2 = Core.typeof(x)::Core.Const(Vector{Int64})
│   %3 = Core.apply_type(%1, %2)::Core.Const(var"#14#15"{Vector{Int64}})
│        (#14 = %new(%3, x))
│   %5 = #14::var"#14#15"{Vector{Int64}}
│   %6 = Base.Generator(%5, x)::Base.Generator{Vector{Int64}, var"#14#15"{Vector{Int64}}}
│        (y = Base.collect(%6))
└──      return y

I would be very happy if someone could point me in the right direction why that is problematic.

Thanks,

Felix

In Julia, it is bad practice to reassign a variable to a value of a different type within a function.
The simplest fix is to avoid reusing the same variable name for the second value (just pick another name).

Okay, thanks. I expected something like that.

I think that, in principle, the compiler could understand this and avoid the instability, but it just isn’t clever enough (yet).

This type of “type instability” has no impact on performance and the compiler is smart enough to not have a problem with it.

The printer for code_warntype could be a bit smarter perhaps.

Feel free to reassign variables in functions as you wish. It is only a problem if the variable doesn’t get properly inferred at some place where it is used in the function.
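
To check this for the example from the first post, here is a small sketch (the name `reassigned` is made up for illustration; `@inferred` comes from the Test standard library). It throws an error if the return type cannot be inferred, regardless of what @code_warntype prints for the local variable:

```julia
using Test

# Hypothetical copy of the function from the first post: the first
# assignment is a dead store with a different type than the second one.
function reassigned(x)
    y = (1, 2)
    y = [x[1] for i in x]
    return y
end

# Passes silently and returns the result if the return type is inferred.
result = @inferred reassigned([1, 2, 3])
```

If this call went through without an error, the reassignment did not hurt inference of the return value, which matches the Body::Vector{Int64} line shown above.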

In my first (non-working) example, I actually had a huge performance impact (factor 100 or something) and it was resolved by deleting the unnecessary line.
I never used that initial variable, it was just there.

Is that an issue or expected?
Otherwise I try further digging to come up with a real MWE.

Well, it would be interesting to see an actual example. Maybe you hit the issue "performance of captured variables in closures" (Issue #15276 in JuliaLang/julia on GitHub).

While trying some things out, I found that foo and foo2 compile to the same thing. So LLVM is probably just using dead-code elimination (DCE) to drop the first assignment, which makes the slowdown you saw quite odd.

function foo(x)
    # some code

    y = (1, 2)   # first
    y = [x[1] for i in x]   # second

    # more code
    return y
end

function foo2(x)
    # some code

    # first
    y = [x[1] for i in x]   # second

    # more code
    return y
end

Nice to know (and ideally that’s what one would expect), but the next example hopefully will demonstrate what I meant:

julia> function f(x)
           y = (1, 2)
           y = [x[1] for i in x]
           g(_) = y
           return g(x)
       end
f (generic function with 1 method)

julia> @code_warntype f([1.0, 2.0])
Variables
  #self#::Core.Const(f)
  x::Vector{Float64}
  #67::var"#67#68"{Vector{Float64}}
  g::var"#g#69"
  y::Core.Box

Body::Any
1 ─       (y = Core.Box())
│   %2  = Core.tuple(1, 2)::Core.Const((1, 2))
│         Core.setfield!(y, :contents, %2)
│   %4  = Main.:(var"#67#68")::Core.Const(var"#67#68")
│   %5  = Core.typeof(x)::Core.Const(Vector{Float64})
│   %6  = Core.apply_type(%4, %5)::Core.Const(var"#67#68"{Vector{Float64}})
│         (#67 = %new(%6, x))
│   %8  = #67::var"#67#68"{Vector{Float64}}
│   %9  = Base.Generator(%8, x)::Base.Generator{Vector{Float64}, var"#67#68"{Vector{Float64}}}
│   %10 = Base.collect(%9)::Vector{Float64}
│         Core.setfield!(y, :contents, %10)
│         (g = %new(Main.:(var"#g#69"), y))
│   %13 = (g)(x)::Any
└──       return %13

julia> @time f([1.0, 2.0])
  0.000001 seconds (3 allocations: 208 bytes)
2-element Vector{Float64}:
 1.0
 1.0

julia> @time f([1.0, 2.0])
  0.000000 seconds (3 allocations: 208 bytes)
2-element Vector{Float64}:
 1.0
 1.0

Deleting the problematic line results in an inferred return type:

julia> function f(x)
           #y = (1, 2)
           y = [x[1] for i in x]
           g(_) = y 
           return g(x)
       end
f (generic function with 1 method)

julia> @code_warntype f([1.0, 2.0])
Variables
  #self#::Core.Const(f)
  x::Vector{Float64}
  #70::var"#70#71"{Vector{Float64}}
  g::var"#g#72"{Vector{Float64}}
  y::Vector{Float64}

Body::Vector{Float64}
1 ─ %1  = Main.:(var"#70#71")::Core.Const(var"#70#71")
│   %2  = Core.typeof(x)::Core.Const(Vector{Float64})
│   %3  = Core.apply_type(%1, %2)::Core.Const(var"#70#71"{Vector{Float64}})
│         (#70 = %new(%3, x))
│   %5  = #70::var"#70#71"{Vector{Float64}}
│   %6  = Base.Generator(%5, x)::Base.Generator{Vector{Float64}, var"#70#71"{Vector{Float64}}}
│         (y = Base.collect(%6))
│   %8  = Main.:(var"#g#72")::Core.Const(var"#g#72")
│   %9  = Core.typeof(y)::Core.Const(Vector{Float64})
│   %10 = Core.apply_type(%8, %9)::Core.Const(var"#g#72"{Vector{Float64}})
│         (g = %new(%10, y))
│   %12 = (g)(x)::Vector{Float64}
└──       return %12

julia> @time f([1.0, 2.0])
  0.000001 seconds (2 allocations: 192 bytes)
2-element Vector{Float64}:
 1.0
 1.0

julia> @time f([1.0, 2.0])
  0.000000 seconds (2 allocations: 192 bytes)
2-element Vector{Float64}:
 1.0
 1.0

So @code_warntype is clean now.

So am I breaking some of the performance tips?

Interesting example. Learning with you. Note that the instability is not dependent on the order of the assignments:

julia> function g(x)
           g(_) = y 
           y = (1,2)
           y = [1,2]
           return g(x)
       end
g (generic function with 1 method)

julia> @code_warntype g(1)
Variables
  #self#::Core.Const(g)
  x::Int64
  y::Core.Box
  g::var"#g#15"

Body::Any
1 ─      (y = Core.Box())
│        (g = %new(Main.:(var"#g#15"), y))
│   %3 = Core.tuple(1, 2)::Core.Const((1, 2))
│        Core.setfield!(y, :contents, %3)
│   %5 = Base.vect(1, 2)::Vector{Int64}
│        Core.setfield!(y, :contents, %5)
│   %7 = (g)(x)::Any
└──      return %7


Thus, for this to be stable, the compiler would have to notice that y is constant wherever g is called. It is understandable that this kind of inference is tricky. What if g were called twice: should two specialized versions of g be compiled (specialized not on the arguments, but on the captured variables)?

Of course, that is solved if y is passed to g as a parameter (i.e. g(y) = y).
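
A minimal sketch of that variant, assuming the same f as above (the names `f_param` and `v` are made up for illustration). Since y is passed in as an argument, the inner function no longer captures it:

```julia
using Test

# Hypothetical variant of f: the inner function takes the value as an
# argument instead of capturing the reassigned local variable y.
function f_param(x)
    y = (1, 2)
    y = [x[1] for i in x]
    g(v) = v          # nothing is captured here
    return g(y)
end

# Throws if the return type cannot be inferred.
out_param = @inferred f_param([1.0, 2.0])
```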

What is happening here appears to be related to the discussion in this thread. From what I understand, being able to change the value of a captured variable in a closure is a feature in some sense, but it can lead to these problems. Probably the good advice is to never reassign a captured variable unless modifying that variable is the very purpose of the closure. In most cases one should treat this like the “avoid non-constant globals” performance tip, except that here the non-constant variable is local to the scope of the enclosing function.
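
If the reassignment has to stay, one workaround is the let-block pattern described in the Julia manual's performance tips on captured variables. A rough sketch, with `f_let` as a made-up name:

```julia
using Test

# Hypothetical variant of f: the let block introduces a fresh binding
# named y that is assigned exactly once, and the closure captures that
# binding instead of the reassigned outer variable.
function f_let(x)
    y = (1, 2)
    y = [x[1] for i in x]
    g = let y = y
        _ -> y
    end
    return g(x)
end

# Throws if the return type cannot be inferred.
out_let = @inferred f_let([1.0, 2.0])
```

The outer y is still reassigned, but no closure captures it any more, so it should not need to be boxed.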

Yes, that boxing is done by the frontend and is exactly the issue "performance of captured variables in closures" (Issue #15276 in JuliaLang/julia on GitHub). Note that the type stability of y is irrelevant here, for example:

function f(x)
    y = "a"
    y = "b"
    g(_) = y 
    return g(x)
end

will also cause a Box.
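
One way to see this without reading the @code_warntype output is to look at the field types of the closure itself. This is only a sketch (the names `make_boxed` and `make_unboxed` are made up, and it relies on the closure storing its captured variable as a field, as in the lowered code shown earlier):

```julia
# Hypothetical helper: y is assigned twice (same type both times) and
# then captured, so the closure should hold it in a Core.Box.
function make_boxed()
    y = "a"
    y = "b"
    g(_) = y
    return g
end

# Hypothetical helper: y is assigned once before being captured.
function make_unboxed()
    y = "b"
    g(_) = y
    return g
end

boxed_fields   = fieldtypes(typeof(make_boxed()))
unboxed_fields = fieldtypes(typeof(make_unboxed()))
```

On the Julia versions discussed in this thread, I would expect boxed_fields to contain Core.Box and unboxed_fields to contain String.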