Spurious allocations with @threads, when setting variables using if statements

Annotating a for loop with @threads leads to many additional allocations inside the loop when the loop body uses a variable that was set in an if statement. Replacing the if statement with the ternary operator removes these additional allocations.
Minimal example:

function g!(f::AbstractVector, flag::Bool, dx::Float64, dy::Float64)
    Δ = 0.0 # Not really needed, since if-statement does not introduce new scope
    if flag Δ = dy else Δ = dx end

    @inbounds Threads.@threads for i in eachindex(f)
        f[i] = Δ * f[i]
    end
    return nothing
end

function g_ternary!(f::AbstractVector, flag::Bool, dx::Float64, dy::Float64)
    Δ = flag ? dy : dx

    @inbounds Threads.@threads for i in eachindex(f)
        f[i] = Δ * f[i]
    end
    return nothing
end

using BenchmarkTools
N, flag = 2^16, true
y = rand(ComplexF64, N)
dx, dy = rand(2)

@btime g!($y, $flag, $dx, $dy) # 259.165 μs (196140 allocations: 5.00 MiB)
@btime g_ternary!($y, $flag, $dx, $dy) # 7.934 μs (41 allocations: 3.86 KiB)
# Note: The 41 allocations are normal and appear due to @threads

With @threads removed, both variants are non-allocating, so the problem must lie somewhere in the multithreading.
Did I misunderstand something, or is this a bug?
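For reference, a minimal sketch of the serial comparison mentioned above. The name g_serial! and the input values are assumptions made for illustration; the body is g! with the @threads annotation dropped. The measured byte count is printed rather than stated here, since it depends on the Julia version and on how the call is made.

```julia
# Hypothetical serial variant of g! (the name is not from the original post).
function g_serial!(f::AbstractVector, flag::Bool, dx::Float64, dy::Float64)
    Δ = 0.0
    if flag Δ = dy else Δ = dx end

    @inbounds for i in eachindex(f)
        f[i] = Δ * f[i]
    end
    return nothing
end

y_serial = rand(ComplexF64, 2^16)
g_serial!(y_serial, true, 0.5, 2.0)   # warm-up call so compilation is not measured
serial_bytes = @allocated g_serial!(y_serial, true, 0.5, 2.0)
println("bytes allocated without @threads: ", serial_bytes)
```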

julia> versioninfo()
Julia Version 1.7.3
Commit 742b9abb4d (2022-05-06 12:58 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, tigerlake)
Environment:
  JULIA_NUM_THREADS = 8
julia> @code_warntype g!(y, flag, dx, dy)
MethodInstance for g!(::Vector{ComplexF64}, ::Bool, ::Float64, ::Float64)
  from g!(f::AbstractVector, flag::Bool, dx::Float64, dy::Float64) in Main at /tmp/foo.jl:51
Arguments
  #self#::Core.Const(g!)
  f::Vector{ComplexF64}
  flag::Bool
  dx::Float64
  dy::Float64
Locals
  threadsfor_fun::var"#150#threadsfor_fun#19"{var"#150#threadsfor_fun#18#20"{Vector{ComplexF64}, Base.OneTo{Int64}}}
  Δ::Core.Box

Δ is boxed. Writing Δ = if flag dy else dx end instead matches your ternary operator more closely, and with this change g! and g_ternary! perform identically for me.
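Spelled out in full, the suggested change might look like the sketch below. The name g_ifexpr! is an assumption for illustration; everything else mirrors g! from the original post, except that Δ is assigned exactly once, from the value of the if expression.

```julia
# Hypothetical variant of g! (the name is not from the thread).
function g_ifexpr!(f::AbstractVector, flag::Bool, dx::Float64, dy::Float64)
    # `if` is an expression in Julia, so its value can be assigned directly.
    # Δ now has a single assignment, just like in g_ternary!.
    Δ = if flag
        dy
    else
        dx
    end

    @inbounds Threads.@threads for i in eachindex(f)
        f[i] = Δ * f[i]
    end
    return nothing
end
```

Running @code_warntype on this variant is a quick way to check whether Δ still shows up as Core.Box on a given Julia version.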

Why is it boxed, though? Is it because the variable is being reassigned conditionally?

It’s weird that it’s boxed only when @threads is used:

julia> function g(flag::Bool, dx::Float64, dy::Float64)
           Δ = 0.0
           if flag Δ = dy else Δ = dx end
           return Δ
       end

julia> @code_warntype g(true, 3.0, 1.0)
MethodInstance for g(::Bool, ::Float64, ::Float64)
  from g(flag::Bool, dx::Float64, dy::Float64) in Main at REPL[1]:1
Arguments
  #self#::Core.Const(g)
  flag::Bool
  dx::Float64
  dy::Float64
Locals
  Δ::Float64
Body::Float64
1 ─     (Δ = 0.0)
└──     goto #3 if not flag
2 ─     (Δ = dy)
└──     goto #4
3 ─     (Δ = dx)
4 ┄     return Δ


julia> function g(flag::Bool, dx::Float64, dy::Float64)
           Δ = 0.0
           if flag Δ = dy else Δ = dx end
           Threads.@threads for _ = 1:10
               f = Δ
           end
           return Δ
       end
g (generic function with 1 method)

julia> @code_warntype g(true, 3.0, 1.0)
MethodInstance for g(::Bool, ::Float64, ::Float64)
  from g(flag::Bool, dx::Float64, dy::Float64) in Main at REPL[3]:1
Arguments
  #self#::Core.Const(g)
  flag::Bool
  dx::Float64
  dy::Float64
Locals
  threadsfor_fun::var"#15#threadsfor_fun#2"{var"#15#threadsfor_fun#1#3"{UnitRange{Int64}}}
  Δ@_6::Core.Box
  threadsfor_fun#1::var"#15#threadsfor_fun#1#3"{UnitRange{Int64}}
  range::UnitRange{Int64}
  Δ@_9::Union{}
Body::Any

I don’t quite understand why the compiler thinks Δ may get modified in the loop.

So the type instability explains where the allocations come from.
It’s somewhat strange that it’s only unstable in combination with @threads, but since we have a workaround, @giordano’s answer resolves the issue for me.
Thanks a lot everyone!

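This is probably the infamous performance penalty of captured variables in closures: @threads wraps the loop body in a closure (the threadsfor_fun visible in the @code_warntype output), and a captured variable that is assigned more than once gets boxed.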

See: Type-instability because of @threads boxing variables
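The same effect can be sketched without any threading, using an ordinary closure. The function names below are assumptions made for illustration. The second function uses the let-block rebinding that the Julia manual's performance tips suggest for captured variables; inspecting the closure's field types shows whether the captured Δ is stored as a Core.Box or as a concrete Float64.

```julia
# Hypothetical helpers (names are not from the thread).
function make_scaler_boxed(flag::Bool, dx::Float64, dy::Float64)
    Δ = 0.0
    if flag Δ = dy else Δ = dx end   # Δ is assigned more than once...
    return x -> Δ * x                # ...and captured by the closure
end

function make_scaler_let(flag::Bool, dx::Float64, dy::Float64)
    Δ = 0.0
    if flag Δ = dy else Δ = dx end
    return let Δ = Δ                 # new binding that is assigned exactly once
        x -> Δ * x
    end
end

boxed_fields = fieldtypes(typeof(make_scaler_boxed(true, 1.0, 2.0)))
let_fields = fieldtypes(typeof(make_scaler_let(true, 1.0, 2.0)))
println("captured field types, reassigned variable: ", boxed_fields)
println("captured field types, let rebinding:       ", let_fields)
```

Wrapping the @threads loop of g! in the same kind of let Δ = Δ ... end block would be another possible workaround, alongside the ternary and the if expression.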
