Base.Generator being slow because type inference fails with isnothing

Hey,

Consider the following two functions, each of which returns a generator:

function f(size; scaling=1.0)
    if isnothing(scaling)
        scaling = 1.0
    end
    return (sum(scaling .* Tuple(index)) for index in CartesianIndices(size))
end

function g(size; scaling=1.0)
    return (sum(scaling .* Tuple(index)) for index in CartesianIndices(size))
end

Despite being called with the same arguments, the two functions perform very differently, because the type of scaling is not inferred correctly in f:

julia> @time collect(g((100, 100)));
  0.000019 seconds (2 allocations: 78.203 KiB)

julia> @time collect(f((100, 100)));
  0.003087 seconds (50.01 k allocations: 1.450 MiB)

The collect is needed because both f and g return generators.

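From @code_warntype we can see the issue: g returns a generator whose closure type carries a Float64 parameter (var"#8#9"{Float64}), whereas the closure type in the generator returned by f (var"#5#6") has no type parameter: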

julia> @code_warntype f((100, 100))
Variables
  #self#::Core.Const(f)
  size::Tuple{Int64, Int64}

Body::Base.Generator{CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, var"#5#6"}
1 ─ %1 = Main.:(var"#f#4")(1.0, #self#, size)::Base.Generator{CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, var"#5#6"}
└──      return %1

julia> @code_warntype (g((100, 100)))
Variables
  #self#::Core.Const(g)
  size::Tuple{Int64, Int64}

Body::Base.Generator{CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, var"#8#9"{Float64}}
1 ─ %1 = Main.:(var"#g#7")(1.0, #self#, size)::Core.PartialStruct(Base.Generator{CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, var"#8#9"{Float64}}, Any[Core.Const(var"#8#9"{Float64}(1.0)), CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}])
└──      return %1

I don’t really understand why it doesn’t infer the type correctly? Isn’t the generator generated after we checked that it isn’t nothing?

Even after enforcing the type, it fails:

julia> function f2(size; scaling::Float64=1.0)
           if isnothing(scaling)
               scaling = 1.0
           end
           return (sum(scaling .* Tuple(index)) for index in CartesianIndices(size))
       end
f2 (generic function with 1 method)

julia> @time collect(f2((100, 100)));
  0.013903 seconds (64.73 k allocations: 2.305 MiB, 81.88% compilation time)

julia> @time collect(f2((100, 100)));
  0.003256 seconds (50.01 k allocations: 1.450 MiB)

julia> @code_warntype f2((100, 100))
Variables
  #self#::Core.Const(f2)
  size::Tuple{Int64, Int64}

Body::Base.Generator{CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, var"#32#33"}
1 ─ %1 = Main.:(var"#f2#31")(1.0, #self#, size)::Base.Generator{CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, var"#32#33"}
└──      return %1

I would very much appreciate it if somebody could explain that :confused:

Thanks a lot!

Felix

I have seen it suggested to use scaling === nothing, as that infers better. Why, I don’t know.

Hm, that would be strange. However, it doesn’t help:

julia> function f3(size; scaling=1.0)
           if scaling === nothing
               scaling = 1.0
           end
           return (sum(scaling .* Tuple(index)) for index in CartesianIndices(size))
       end
f3 (generic function with 1 method)

julia> @time collect(f((100, 100)));
  0.002413 seconds (50.01 k allocations: 1.450 MiB)

julia> @time collect(f3((100, 100)));
  0.002505 seconds (50.01 k allocations: 1.450 MiB)

This is because assigning scaling multiple times causes scaling to be boxed if used inside closures. The usual workaround here is to write this as:

function f(size; scaling=1.0)
    if isnothing(scaling)
        scaling = 1.0
    end
    let scaling = scaling
        return (sum(scaling .* Tuple(index)) for index in CartesianIndices(size))
    end
end
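To see the box directly, one can inspect the closure stored inside the generator. The following is only a minimal sketch: f_boxed, f_single and closure_fieldtypes are hypothetical names introduced here for illustration, and f_single shows an alternative to the let block (capturing a fresh name that is assigned exactly once), which I would expect to avoid the box in the same way.

```julia
# Boxed version from the thread: `scaling` is reassigned and then captured.
function f_boxed(size; scaling=1.0)
    if isnothing(scaling)
        scaling = 1.0
    end
    return (sum(scaling .* Tuple(index)) for index in CartesianIndices(size))
end

# Hypothetical alternative to the `let` block: bind the result to a new name
# that is assigned exactly once, and capture that name instead.
function f_single(size; scaling=1.0)
    s = isnothing(scaling) ? 1.0 : scaling
    return (sum(s .* Tuple(index)) for index in CartesianIndices(size))
end

# A Base.Generator stores its closure in the field `f`; the closure's field
# types show how each captured variable is stored.
closure_fieldtypes(gen) = fieldtypes(typeof(gen.f))

closure_fieldtypes(f_boxed((100, 100)))   # contains Core.Box
closure_fieldtypes(f_single((100, 100)))  # concrete type of `s`, no Core.Box
```

Both variants produce the same values; only the way the captured variable is stored differs, and that is what decides whether the closure body can be inferred.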

Hm, that confuses me even more :laughing:

Does that hold more generally? Is checking with isnothing and then assigning a new value bad practice that might cause performance issues?

Ok, this seems to be fast as well:

julia> function f4(size; scaling=1.0)
           scaling = isnothing(scaling) ? 1 : scaling
           return (sum(scaling .* Tuple(index)) for index in CartesianIndices(size))
       end

julia> @time collect(f((100, 100)));
  0.002622 seconds (50.01 k allocations: 1.450 MiB)

So is this somehow due to the scope where scaling is assigned?

This is a well-known issue; see "performance of captured variables in closures" (JuliaLang/julia issue #15276 on GitHub).

Note that this is only an issue if that same variable is used inside a closure. It would be good to fix this eventually, but it’s quite a non-trivial problem because currently closure capturing is figured out in lowering and there would have to be a way to somehow make that aware of the results of type inference. Julia 1.7 introduces opaque closures, which work around this by having different semantics than regular closures.
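A small sketch of that distinction, with hypothetical function names no_closure and with_closure: the same reassignment pattern is harmless when the variable is not captured, and only gets boxed once a closure uses it. Base.return_types is an unexported reflection helper, used here just to peek at inference.

```julia
# Reassigned but never captured: inference handles this without trouble.
function no_closure(x)
    if x === nothing
        x = 1.0
    end
    return 2 * x
end

# Same reassignment, but `x` is now captured by the returned closure,
# so lowering stores it in a Core.Box.
function with_closure(x)
    if x === nothing
        x = 1.0
    end
    return () -> 2 * x
end

only(Base.return_types(no_closure, (Nothing,)))  # inferred return type of the plain function
fieldtypes(typeof(with_closure(nothing)))        # how the closure stores `x`
```

So checking for nothing and reassigning is not a problem in itself; it becomes one only in combination with a closure (or generator) that captures the reassigned variable.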
