Dictionary use and assignment

I have a question related to dictionary initialization. The code snippet illustrates the problem:

function DictGen(KeyList, InitVal)
    return Dict(i => InitVal for i in collect(Iterators.product(KeyList...)))
end

# make some key indices
N = ["N"*string(n) for n = 1:3]
T = ["T"*string(t) for t = 1:5]

# manual dictionary generation
D1 = Dict(i => Array{String}(undef,0) for i in collect(Iterators.product([N,T]...)))
push!(D1["N1","T1"], "one")

# equivalent dictionary generation with function
D2 = DictGen([N,T], Array{String}(undef,0))
push!(D2["N1","T1"], "one")

The goal in this code is to produce a dictionary with keys generated from an Array of strings, and set an empty array as a default value. Subsequently, I want to push!() values to the dictionary to build up lists.

It works as intended if executed manually, but not if wrapped in a function. Here are D1 and D2 on execution of the script:

julia> D1
Dict{Tuple{String, String}, Vector{String}} with 15 entries:
  ("N1", "T5") => []
  ("N1", "T3") => []
  ("N3", "T5") => []
  ("N2", "T2") => []
  ("N2", "T3") => []
  ("N1", "T1") => ["one"]
  ("N1", "T2") => []
  ("N2", "T5") => []
  ("N3", "T3") => []
  ("N3", "T4") => []
  ⋮            => ⋮

julia> D2
Dict{Tuple{String, String}, Vector{String}} with 15 entries:
  ("N1", "T5") => ["one"]
  ("N1", "T3") => ["one"]
  ("N3", "T5") => ["one"]
  ("N2", "T2") => ["one"]
  ("N2", "T3") => ["one"]
  ("N1", "T1") => ["one"]
  ("N1", "T2") => ["one"]
  ("N2", "T5") => ["one"]
  ("N3", "T3") => ["one"]
  ("N3", "T4") => ["one"]
  ⋮            => ⋮

D1 shows the anticipated behavior. D2 seems like it ought to be equivalent, but apparently it is not. Can someone explain how these work, and provide the working function equivalent?

This will be easier to understand if we strip away some unnecessary details. We don’t need to involve dictionaries, functions vs. global scope, or generators here to see the difference. It all boils down to:

julia> x = [Vector{Int}() for i in 1:3]
3-element Vector{Vector{Int64}}:
 []
 []
 []

julia> x[1] === x[2]
false

In this case, we’re constructing a vector of vectors, where each element is a new vector, hence x[1] and x[2] are not the same object.

Now consider, instead:

julia> v = Vector{Int}()
Int64[]

julia> y = [v for i in 1:3]
3-element Vector{Vector{Int64}}:
 []
 []
 []

julia> y[1] === y[2]
true

In this case, every element of y is the same vector v. That means that modifying one will modify them all:

julia> push!(y[1], 2)
1-element Vector{Int64}:
 2

julia> y
3-element Vector{Vector{Int64}}:
 [2]
 [2]
 [2]
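
The same thing happens with a dictionary, which is what the original DictGen runs into: the InitVal argument is evaluated once at the call site, so the generator stores that one array under every key. A minimal sketch (the names shared and d are made up for illustration):

```julia
# One vector is created once, then reused as the value for every key.
shared = String[]
d = Dict(k => shared for k in ["a", "b", "c"])

push!(d["a"], "one")

# Every key refers to the same object, so all values changed together.
all_same = all(v -> v === shared, values(d))
lengths = [length(d[k]) for k in ["a", "b", "c"]]
```

Here all_same is true and each entry has length 1, even though only d["a"] was pushed to.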

On the other hand, what if we make a function which generates our initial values:

julia> f = () -> Vector{Int}()
#15 (generic function with 1 method)

julia> z = [f() for i in 1:3]
3-element Vector{Vector{Int64}}:
 []
 []
 []

julia> z[1] === z[2]
false

This does what you want, and results in a different vector for each element of z. You can do the same thing in your DictGen: rather than taking an initial value, take a function that generates that value. Then you can do:

function DictGen(KeyList, InitFunction)
  Dict(i => InitFunction() for i in .....)
end

Edit: And, by the way, you can simplify the function f in this case. Rather than writing:

() -> Vector{Int}()

you can just write Vector{Int}, which behaves the same way when called (it is a type rather than a Function, but calling Vector{Int}() constructs a new, empty vector of ints).
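
As a rough sketch of how that looks at the call site, assuming a DictGen that simply calls its second argument once per key (the parameter name make_value is made up here, and the key product is written out as in the question):

```julia
# Calls `make_value()` once per key, so each key gets its own fresh value.
function DictGen(KeyList, make_value)
    return Dict(k => make_value() for k in Iterators.product(KeyList...))
end

N = ["N" * string(n) for n in 1:3]
T = ["T" * string(t) for t in 1:5]

# A type works as the generator because calling it runs its constructor.
D = DictGen([N, T], Vector{String})
push!(D["N1", "T1"], "one")

distinct = D["N1", "T1"] !== D["N2", "T1"]   # separate arrays per key
```

Only D["N1", "T1"] receives the pushed string; the other 14 entries stay empty.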

Thank you! I see the conceptual difference. Thank you for laying it all out clearly.

For posterity, here is my worked solution, with all the syntax worked out. I hope someone else finds this helpful, as well.

# Function with multiple dispatch
function DictGen(KeyList, InitVal::Function)
    return Dict(i => InitVal() for i in collect(Iterators.product(KeyList...)))
end
function DictGen(KeyList, InitVal)
    return Dict(i => InitVal for i in collect(Iterators.product(KeyList...)))
end

# String array generator
function InitStringArray()
    return Array{String}(undef,0)
end

# make some key indices
N = ["N"*string(n) for n = 1:3]
T = ["T"*string(t) for t = 1:5]

# Multiple dispatch example
D0 = DictGen([N,T], 0)

# manual dictionary generation
D1 = Dict(i => Array{String}(undef,0) for i in collect(Iterators.product([N,T]...)))
push!(D1["N1","T1"], "one")

# equivalent dictionary generation with function
D2 = DictGen([N,T], InitStringArray)

# checks: the D2 comparison evaluates to false, as required. The D0 vs. D2 comparison is also false (an Int is never identical to a Vector), just for confirmation.
push!(D2["N1","T1"], "one")
D2["N2","T1"] === D2["N2","T2"]
D0["N2","T1"] === D2["N2","T2"]
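
One caveat with the ::Function method above: a type such as Vector{String} is callable but is not a subtype of Function, so passing it would dispatch to the plain-value method and store the type itself under every key. A possible variation, assuming it is acceptable to dispatch on Base.Callable (defined in Base as Union{Function, Type}, not exported), is sketched below under the hypothetical name DictGen2:

```julia
# Generator form: accepts functions and types (both are callable).
function DictGen2(KeyList, init::Base.Callable)
    return Dict(k => init() for k in Iterators.product(KeyList...))
end

# Plain-value form: intended for immutable defaults such as numbers.
function DictGen2(KeyList, init)
    return Dict(k => init for k in Iterators.product(KeyList...))
end

N = ["N" * string(n) for n in 1:3]
T = ["T" * string(t) for t in 1:5]

A = DictGen2([N, T], Vector{String})   # type used as the generator
B = DictGen2([N, T], 0)                # shared immutable default

push!(A["N1", "T1"], "one")
independent = isempty(A["N2", "T1"])   # other keys are unaffected
```

The trade-off is that a function or type can no longer be stored as a plain default value with this version, since it will always be called.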