Slow Dict comprehension

The compilation time doesn’t scale with N though:

julia> @time Dict{Symbol,Any}((Symbol(:a,i) => fill(float(i), 10) for i = 1:5_000_000));
  7.885567 seconds (35.03 M allocations: 2.072 GiB, 7.24% gc time, 0.22% compilation time)

julia> @time begin
           p = Dict{Symbol,Any}()
           for i = 1:5_000_000
               p[Symbol(:a,i)] = fill(float(i), 10)
           end
       end;
  7.877991 seconds (35.00 M allocations: 2.200 GiB)

So I would say you’re fine using a comprehension in practice: the cost is a small, fixed compilation overhead per comprehension, and it’s unlikely you’ll have hundreds of these in your top-level script.

My hunch is that this particular for loop doesn’t need compilation because its operations (looping over a range, assigning into a Dict{Symbol,Any}, fill(::Float64, ::Int), etc.) are all already compiled into the Base sysimage. If you start Julia without the sysimage’s native code, the same loop does spend almost all of its time compiling:

$ julia --sysimage-native-code=no
julia> @time begin
           p = Dict{Symbol,Any}()
           for i = 1:500
               p[Symbol(:a,i)] = fill(float(i), 10)
           end
       end;
  0.030967 seconds (4.89 k allocations: 307.391 KiB, 98.49% compilation time)

The comprehension, on the other hand, can’t reuse cached code: each time it is evaluated at top level it constructs a new generator, and each generator has its own unique anonymous function type that needs to be compiled:

julia> itr = (Symbol(:a,i) => fill(float(i), 10) for i = 1:500)
Base.Generator{UnitRange{Int64}, var"#3#4"}(var"#3#4"(), 1:500)

julia> itr = (Symbol(:a,i) => fill(float(i), 10) for i = 1:500)
Base.Generator{UnitRange{Int64}, var"#5#6"}(var"#5#6"(), 1:500)

Note the two different closure types, var"#3#4" and var"#5#6".
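
If the per-evaluation compilation ever does become a problem, one option is to put the comprehension inside a function. This is only a sketch, and the names gen and make_params are made up for illustration; the idea is that the anonymous function is then created once, when the function is defined, so repeated calls should get the same generator type and reuse the compiled code:

```julia
# Hypothetical helper: returns the same generator as above, but built inside a function.
gen(n) = (Symbol(:a, i) => fill(float(i), 10) for i = 1:n)

# The closure type is fixed when `gen` is defined, so both calls share one type.
same_type = typeof(gen(500)) === typeof(gen(500))

# Hypothetical helper: builds the Dict from a generator defined inside the function body.
function make_params(n)
    return Dict{Symbol,Any}(Symbol(:a, i) => fill(float(i), 10) for i = 1:n)
end

make_params(1)           # the first call pays the compilation cost
@time make_params(500);  # later calls should reuse the already compiled method
```

I haven't included timings here, since they will depend on your machine and Julia version.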
