What is the fastest way of making a vector of dictionaries?

I know the type (Dict{Int,Int}) and I know the size of the vector.
How do I make the vector quickly? (Just to be explicit, all those dictionaries
are independent; I do not want to fill the vector with just a single dictionary.)

Just write a comprehension and call it a day? It’s inconceivable to me that constructing a vector of empty dictionaries is a performance-critical step in a real application — do you have profiling evidence to the contrary?
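For reference, the comprehension version is one line; unlike `fill`, a comprehension calls the constructor once per element, so every dictionary is independent:

```julia
# One independent empty Dict per element. (n is an example size.)
n = 1_000
rows = [Dict{Int,Int}() for _ in 1:n]
```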

The problem is that there may be several million of them, and the construction should ideally scale with the number of computing threads.

This is kind of okay,

    empty = Dict{Int,Int}()
    rows = fill(empty, size(pattern, 2))
    Threads.@threads for c in axes(pattern, 2)
        cr = pattern.colptr[c]:(pattern.colptr[c+1]-1)
        rows[c] = Dict{Int,Int}(zip(view(pattern.rowval, cr), cr))
    end

but there might be a better (faster) way?

Presumably you are going to do something with these millions of empty dictionaries. Won’t that dominate your runtime?

True, the loop fills them, and then I want to read them.
But if the whole thing is sequential, eventually it will spoil parallel scaling.

Are you actually at that point for a realistic run, or is this just theoretical?

That being said, there are multiple implementations of a multi-threaded map available, e.g. in ThreadedIterables.jl, which would probably be simpler than the code you posted.
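Packages aside, a minimal threaded map can be hand-rolled with only `Base.Threads`; the `tmap` below is my own sketch, not the API of ThreadedIterables.jl or any other package. Passing the result type explicitly avoids an abstractly typed output vector:

```julia
using Base.Threads

# Minimal threaded-map sketch: preallocate the output, then let each
# thread fill its share of the indices.
function tmap(f, ::Type{T}, xs::AbstractVector) where {T}
    out = Vector{T}(undef, length(xs))
    Threads.@threads for i in eachindex(xs)
        out[i] = f(xs[i])
    end
    return out
end

# Each call to the closure constructs a fresh, independent Dict.
rows = tmap(_ -> Dict{Int,Int}(), Dict{Int,Int}, 1:10)
```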

The construction takes enough time to reduce parallel efficiency to 0.5 for 16 computing threads. So, yes.

Alas, it would appear the whole idea was a washout.
Preliminary investigation (What is faster: sparse vector or a dictionary? - #12 by PetrKryslUCSD) seemed to indicate that I could get a speedup by using dictionaries, yet in actual application they are much slower than the original approach.

I’m not surprised: if allocating empty dictionaries takes a significant fraction of your compute time, then you must be doing only a small number of subsequent operations per dictionary. For very small dictionaries accessed a limited number of times, it will probably be faster to use another data structure (even linear search).
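The linear-search alternative can be as simple as a flat vector of pairs; for a handful of entries, scanning contiguous memory often beats hashing. A sketch (the `SmallMap` type and `getvalue` helper are my own names, purely illustrative):

```julia
# Tiny map backed by a flat Vector{Pair}: no hashing, cache-friendly.
struct SmallMap
    pairs::Vector{Pair{Int,Int}}
end

function getvalue(m::SmallMap, key::Int)
    for p in m.pairs
        first(p) == key && return last(p)
    end
    return nothing  # key absent
end

m = SmallMap([3 => 30, 7 => 70, 9 => 90])
```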

(When optimizing dictionary performance, one may also want to consider a custom hash function: Dictionary with custom hash function)

Not so. Creating the vector is relatively cheap, and it scales. What turned out to be expensive was accessing the non-empty dictionaries. The minimal example actually needed to be made more realistic by using N = 27, so you are right that the dictionaries are rather small. But even then the minimal working example indicated that they should be faster than binary search; in practice, they turned out to be much, much slower.
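For comparison, the binary-search approach mentioned above can be sketched with sorted parallel vectors and `Base.searchsortedfirst`; this is my own minimal version, not the code from the linked thread:

```julia
# Sorted keys plus a parallel values vector; lookup is O(log n)
# via binary search instead of hashing.
ks = [2, 5, 11, 27]
vs = [20, 50, 110, 270]

function bsearch_get(keys, vals, k)
    i = searchsortedfirst(keys, k)
    (i <= length(keys) && keys[i] == k) ? vals[i] : nothing
end
```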

Note that with this pattern you are filling each row with the same dictionary:

julia> empty = Dict{Int,Int}()
Dict{Int64, Int64}()

julia> rows = fill(empty, 2)
2-element Vector{Dict{Int64, Int64}}:
 Dict()
 Dict()

julia> rows[1][1] = 1
1

julia> rows[1]
Dict{Int64, Int64} with 1 entry:
  1 => 1

julia> rows[2]
Dict{Int64, Int64} with 1 entry:
  1 => 1

and that here:

        rows[c] = Dict{Int,Int}(zip(view(pattern.rowval, cr), cr))

you are replacing that empty dictionary in row c by a new one. So probably you just need

    rows = Vector{Dict{Int,Int}}(undef, size(pattern, 2))

to initialize your structure (independently of whether that is the best data structure here or not).
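Putting that suggestion together with the threaded loop from earlier in the thread gives the following sketch; the toy `pattern` matrix here is my own stand-in for the real sparsity pattern, and note that `undef` slots must all be assigned before any element is read:

```julia
using SparseArrays, Base.Threads

# Toy sparsity pattern (assumption): entries at (1,1), (2,1), (1,2).
pattern = sparse([1, 2, 1], [1, 1, 2], trues(3))

# No shared placeholder Dict: the vector starts uninitialized and the
# loop assigns a fresh Dict to every column.
rows = Vector{Dict{Int,Int}}(undef, size(pattern, 2))
Threads.@threads for c in axes(pattern, 2)
    cr = pattern.colptr[c]:(pattern.colptr[c+1]-1)  # nzrange(pattern, c) is the cleaner API
    rows[c] = Dict{Int,Int}(zip(view(pattern.rowval, cr), cr))
end
```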


Your way of creating the vector is optimal, thanks. Unfortunately, that was never the bottleneck.
