Actually the interpolation works with @spawn, not with @threads:
julia> function f1_2(vcs) # very slow
           T = Int
           smap = Dict("a" => [1:500;], "b" => [1:500;])
           @sync for chunk in makechunks(vcs, nthreads())
               @spawn begin
                   c = zero($T)
                   for v in $chunk
                       for s in v
                           for idx in $smap[s]
                               c += one($T)
                           end
                       end
                   end
               end
           end
           return
       end
f1_2 (generic function with 1 method)
julia> @time f1_2(fill(["a","b"], 500))
0.045161 seconds (538.60 k allocations: 10.318 MiB, 4.56% gc time, 417.59% compilation time)
julia> @time f1_2(fill(["a","b"], 500))
0.021519 seconds (495.98 k allocations: 7.588 MiB, 8.42% gc time)
I used this as a solution before (see Type-instability because of @threads boxing variables - #21 by Elrod). It should be equivalent to wrapping the loop body in a let block, but that doesn’t seem to do the job here:
julia> function f1_2(vcs) # very slow
           T = Int
           smap = Dict("a" => [1:500;], "b" => [1:500;])
           @threads for chunk in makechunks(vcs, nthreads())
               let T = T, chunk = chunk, smap = smap
                   c = zero(T)
                   for v in chunk
                       for s in v
                           for idx in smap[s]
                               c += one(T)
                           end
                       end
                   end
               end
           end
           return
       end
f1_2 (generic function with 1 method)
julia> @time f1_2(fill(["a","b"], 500))
0.048589 seconds (508.51 k allocations: 8.427 MiB, 486.98% compilation time)
julia> @time f1_2(fill(["a","b"], 500))
0.021447 seconds (495.97 k allocations: 7.588 MiB, 12.07% gc time)
Probably there is something I’m missing.
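To make the equivalence I am assuming explicit, here is a minimal sketch, separate from the benchmark above. The function names `spawn_interpolated` and `spawn_let` are made up for illustration. In both cases the task should see the value `x` had when it was spawned, even though `x` is rebound afterwards:

```julia
using Base.Threads: @spawn

# `$x` copies the current value of x into the closure that @spawn builds.
function spawn_interpolated()
    x = 1
    t = @spawn begin
        $x + 1
    end
    x = 100            # rebinding x afterwards does not affect the task
    return fetch(t)
end

# The `let` block introduces a fresh local x that is never reassigned,
# so the task's closure captures that binding instead of the outer one.
function spawn_let()
    x = 1
    t = let x = x
        @spawn begin
            x + 1
        end
    end
    x = 100
    return fetch(t)
end
```

This only shows that the two forms capture the same value. It says nothing about whether either form avoids the allocations seen in the timings above.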