Type unstable function because same variable name used twice

Hey,

I don't have a minimal example yet, but I found that my function's return type was unstable because of a construct like this:

function foo(x)
    # some code

    y = (1, 2)   # first 
    y = [x[1] for i in x]   # second
   
    # more code
    return g(y)
end

The first y has a different type than the second y. Deleting the first y resulted in perfectly type-stable code; leaving it there resulted in type-unstable code (at least @code_warntype warned me).

So I was wondering why the first y causes trouble. It was only there by mistake and is never used, so the function output is of course the same with or without it.

I can come up with similar code. There the return type is still inferred correctly, but the type of y is reported as a Union:

julia> function f(x::AbstractArray{T, N}) where {T, N}
           y = (1, 2)   
           y = [x[1] for i in x]
           return y
       end
f (generic function with 1 method)

julia> @code_warntype f([1,2,3])
Variables
  #self#::Core.Const(f)
  x::Vector{Int64}
  #12::var"#12#13"{Vector{Int64}}
  y::Union{Tuple{Int64, Int64}, Vector{Int64}}

Body::Vector{Int64}
1 ─      (y = Core.tuple(1, 2))
│   %2 = Main.:(var"#12#13")::Core.Const(var"#12#13")
│   %3 = Core.typeof(x)::Core.Const(Vector{Int64})
│   %4 = Core.apply_type(%2, %3)::Core.Const(var"#12#13"{Vector{Int64}})
│        (#12 = %new(%4, x))
│   %6 = #12::var"#12#13"{Vector{Int64}}
│   %7 = Base.Generator(%6, x)::Base.Generator{Vector{Int64}, var"#12#13"{Vector{Int64}}}
│        (y = Base.collect(%7))
└──      return y::Vector{Int64}

julia> function f(x::AbstractArray{T, N}) where {T, N}
           #y = (1, 2)   
           y = [x[1] for i in x]
           return y
       end
f (generic function with 1 method)

julia> @code_warntype f([1,2,3])
Variables
  #self#::Core.Const(f)
  x::Vector{Int64}
  #14::var"#14#15"{Vector{Int64}}
  y::Vector{Int64}

Body::Vector{Int64}
1 ─ %1 = Main.:(var"#14#15")::Core.Const(var"#14#15")
│   %2 = Core.typeof(x)::Core.Const(Vector{Int64})
│   %3 = Core.apply_type(%1, %2)::Core.Const(var"#14#15"{Vector{Int64}})
│        (#14 = %new(%3, x))
│   %5 = #14::var"#14#15"{Vector{Int64}}
│   %6 = Base.Generator(%5, x)::Base.Generator{Vector{Int64}, var"#14#15"{Vector{Int64}}}
│        (y = Base.collect(%6))
└──      return y

I would be very happy if someone could explain why this is problematic.

Thanks,

Felix

In Julia, it is bad practice to reassign a variable to a value of a different type within a function.
The simplest fix is to avoid reusing the same variable name within a function: just pick another name.

Okay, thanks. I expected something like that.

I think that, in principle, the compiler could understand this and avoid the instability, but it just isn’t clever enough (yet).

This type of “type instability” has no impact on performance and the compiler is smart enough to not have a problem with it.

The printer for code_warntype could be a bit smarter perhaps.

Feel free to reassign variables in functions as you wish. It is only a problem if the variable doesn’t get properly inferred at some place where it is used in the function.
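
A minimal sketch of how one could check this claim, assuming the `Test` standard library is available. The function name `reassign` is made up for illustration. `@inferred` throws an error if the actual return type differs from the inferred one, so it tests exactly what matters here: whether the value is properly inferred where it is used.

```julia
using Test

# Hypothetical function: `y` is first bound to a tuple that is never read,
# then rebound to a vector of a different type.
function reassign(x)
    y = (1, 2)
    y = [x[1] for i in x]
    return y
end

# Throws if the return type cannot be inferred concretely; otherwise
# returns the result of the call.
result = @inferred reassign([1, 2, 3])
```

If `@inferred` passes, the Union shown for `y` in the `Variables` section of `@code_warntype` is only a summary over the whole function body, not the type seen at the `return`.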

In my first (non-working) example, I actually saw a huge performance impact (a factor of 100 or so), and it was resolved by deleting the unnecessary line.
I never used that initial variable; it was just there.

Is that a known issue, or is it expected?
Otherwise I'll keep digging to come up with a real MWE.

Well, it would be interesting to see an actual example. Maybe you hit the “performance of captured variables in closures” issue (JuliaLang/julia #15276).

While trying some things out, I found that foo and foo2 compile to the same thing, so LLVM is probably just using dead code elimination (DCE) to drop the first assignment. That makes your observation quite odd.

function foo(x)
    # some code

    y = (1, 2)   # first
    y = [x[1] for i in x]   # second

    # more code
    return y
end

function foo2(x)
    # some code

    # first
    y = [x[1] for i in x]   # second

    # more code
    return y
end

Nice to know (and ideally that’s what one would expect), but the next example hopefully will demonstrate what I meant:

julia> function f(x)
           y = (1, 2)
           y = [x[1] for i in x]
           g(_) = y
           return g(x)
       end
f (generic function with 1 method)

julia> @code_warntype f([1.0, 2.0])
Variables
  #self#::Core.Const(f)
  x::Vector{Float64}
  #67::var"#67#68"{Vector{Float64}}
  g::var"#g#69"
  y::Core.Box

Body::Any
1 ─       (y = Core.Box())
│   %2  = Core.tuple(1, 2)::Core.Const((1, 2))
│         Core.setfield!(y, :contents, %2)
│   %4  = Main.:(var"#67#68")::Core.Const(var"#67#68")
│   %5  = Core.typeof(x)::Core.Const(Vector{Float64})
│   %6  = Core.apply_type(%4, %5)::Core.Const(var"#67#68"{Vector{Float64}})
│         (#67 = %new(%6, x))
│   %8  = #67::var"#67#68"{Vector{Float64}}
│   %9  = Base.Generator(%8, x)::Base.Generator{Vector{Float64}, var"#67#68"{Vector{Float64}}}
│   %10 = Base.collect(%9)::Vector{Float64}
│         Core.setfield!(y, :contents, %10)
│         (g = %new(Main.:(var"#g#69"), y))
│   %13 = (g)(x)::Any
└──       return %13

julia> @time f([1.0, 2.0])
  0.000001 seconds (3 allocations: 208 bytes)
2-element Vector{Float64}:
 1.0
 1.0

julia> @time f([1.0, 2.0])
  0.000000 seconds (3 allocations: 208 bytes)
2-element Vector{Float64}:
 1.0
 1.0

Deleting the problematic line results in a properly inferred type:

julia> function f(x)
           #y = (1, 2)
           y = [x[1] for i in x]
           g(_) = y 
           return g(x)
       end
f (generic function with 1 method)

julia> @code_warntype f([1.0, 2.0])
Variables
  #self#::Core.Const(f)
  x::Vector{Float64}
  #70::var"#70#71"{Vector{Float64}}
  g::var"#g#72"{Vector{Float64}}
  y::Vector{Float64}

Body::Vector{Float64}
1 ─ %1  = Main.:(var"#70#71")::Core.Const(var"#70#71")
│   %2  = Core.typeof(x)::Core.Const(Vector{Float64})
│   %3  = Core.apply_type(%1, %2)::Core.Const(var"#70#71"{Vector{Float64}})
│         (#70 = %new(%3, x))
│   %5  = #70::var"#70#71"{Vector{Float64}}
│   %6  = Base.Generator(%5, x)::Base.Generator{Vector{Float64}, var"#70#71"{Vector{Float64}}}
│         (y = Base.collect(%6))
│   %8  = Main.:(var"#g#72")::Core.Const(var"#g#72")
│   %9  = Core.typeof(y)::Core.Const(Vector{Float64})
│   %10 = Core.apply_type(%8, %9)::Core.Const(var"#g#72"{Vector{Float64}})
│         (g = %new(%10, y))
│   %12 = (g)(x)::Vector{Float64}
└──       return %12

julia> @time f([1.0, 2.0])
  0.000001 seconds (2 allocations: 192 bytes)
2-element Vector{Float64}:
 1.0
 1.0

julia> @time f([1.0, 2.0])
  0.000000 seconds (2 allocations: 192 bytes)
2-element Vector{Float64}:
 1.0
 1.0

So @code_warntype is clean now.

So am I breaking some of the performance tips?

Interesting example. Learning with you. Note that the instability is not dependent on the order of the assignments:

julia> function g(x)
           g(_) = y 
           y = (1,2)
           y = [1,2]
           return g(x)
       end
g (generic function with 1 method)

julia> @code_warntype g(1)
Variables
  #self#::Core.Const(g)
  x::Int64
  y::Core.Box
  g::var"#g#15"

Body::Any
1 ─      (y = Core.Box())
│        (g = %new(Main.:(var"#g#15"), y))
│   %3 = Core.tuple(1, 2)::Core.Const((1, 2))
│        Core.setfield!(y, :contents, %3)
│   %5 = Base.vect(1, 2)::Vector{Int64}
│        Core.setfield!(y, :contents, %5)
│   %7 = (g)(x)::Any
└──      return %7


Thus, to be stable, the compiler would have to notice that y no longer changes at any point where g is called. It is understandable that this kind of inference is tricky. What if g were called twice, with y reassigned in between? Should two specialized versions of g be compiled, specialized not on the arguments but on the captured variables?

Of course, that is solved if y is made a parameter of g (i.e. g(y) = y).
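
As a rough sketch of that variant (the names `h` and `inner` are made up here, and `Test` is only used to check inference): the inner function takes the value as an argument instead of capturing it, so nothing needs to be boxed.

```julia
using Test

# Hypothetical rewrite of the example above: `inner` receives `y` as an
# argument instead of capturing it from the enclosing scope.
function h(x)
    inner(v) = v     # no captured variables
    y = (1, 2)
    y = [1, 2]
    return inner(y)
end

# Throws if the return type is not inferred concretely.
out = @inferred h(1)
```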

What is happening here appears to be related to the discussion in this thread. From what I understand, being able to change the value of a captured variable in a closure is a feature in some sense, but it can lead to these problems. The good advice here is probably that one should never reassign a captured variable unless the purpose of the closure is to modify that variable. In most cases, one should probably treat this like the “avoid non-constant globals” performance tip, except that here the non-constant variable is local to the enclosing function.

Yes, that boxing is done by the frontend and is exactly the “performance of captured variables in closures” issue (JuliaLang/julia #15276). Note that the type stability of y is irrelevant here, for example:

function f(x)
    y = "a"
    y = "b"
    g(_) = y 
    return g(x)
end

will also cause a Box.
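
One workaround described in the Performance Tips section of the Julia manual is to rebind the captured variable in a `let` block, so that the closure captures a fresh variable that is assigned exactly once. Here is a sketch under that assumption; the names `f_boxed` and `f_let` are made up for comparison, and `Base.return_types` is used only to inspect what inference reports.

```julia
# Boxed variant from the post above: `y` is assigned twice and captured.
function f_boxed(x)
    y = "a"
    y = "b"
    g(_) = y
    return g(x)
end

# Hypothetical variant: the `let` block introduces a new `y` that is
# assigned once, and the anonymous function captures that one instead.
function f_let(x)
    y = "a"
    y = "b"
    g = let y = y
        _ -> y
    end
    return g(x)
end

# Inferred return types for an Int argument.
boxed_types = Base.return_types(f_boxed, (Int,))
let_types = Base.return_types(f_let, (Int,))
```

Going by the `Body::Any` output earlier in the thread, `boxed_types` should contain `Any`, while the `let` version should be inferred as `String`.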
