Unexpected copy when pushing

Hello, I’m a very early beginner. I wanted to have an array of dictionaries. Simple example below. It seems that when I push a dictionary which does not exactly match the type of array, instead of, say, getting a complaint from the compiler, the dictionary is copied. Fine, but naturally one would like to know when this will happen, better approach, etc. Thanks for any help!
(As part of learning I’m trying to declare most types explicitly for now.)

a = Dict{String, Union{Int64, Nothing}}[]
x = Dict("wet" => 1)
push!(a, x)
x["dry"] = 2
println("x == a[1] is now " * string(x == a[1]))  # it's false, and the keys are different
y = Dict{String, Int64}("fish" => 3)
push!(a, y)
y["fowl"] = 4
println("y == a[2] is now " * string(y == a[2]))  # it's false, and the keys are different
z = Dict{String, Union{Int64, Nothing}}("hot" => 4)
push!(a, z)
z["cold"] = 5
println("z == a[3] is now still " * string(z == a[3]))  # it's true

Please quote your code (you can edit your post), and perhaps don’t make all text bold, it is a distraction.

I’ve gone ahead and quoted @jbaxter’s code since otherwise it was impossible to cut and paste into a REPL to try to help answer the question.

The key issue is that Dict{String, Union{Int, Nothing}} and Dict{String, Int} are different, disjoint types. In order to put a Dict{String, Int} value into an array with element type Dict{String, Union{Int, Nothing}} it must be converted to a value of the correct type, which means copying it. If you typed your array like this then no conversion would be required:

a = Dict{String, <:Union{Int64, Nothing}}[]

However, I should mention that once you get to such loosely typed abstract collections, it’s harder on the compiler to generate specialized code and unless you really want to make sure nobody puts the wrong type of value in there, you’re better off just using an Any-typed array, i.e. just:

a = []

This has the same array-of-pointers representation anyway and already has all the code one might need generated for it, as opposed to Vector{Dict{String, <:Union{Int64, Nothing}}} which is a new type that’s likely to need all new code generated for it.
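To make the difference concrete, here is a minimal sketch that uses `===` (object identity) to check whether each array holds the very same object that was pushed. The variable names `strict`, `loose`, and `anything` are made up for illustration.

```julia
x = Dict("wet" => 1)    # inferred as Dict{String, Int64}

# Type parameters are invariant, so the first relation does not hold.
@show Dict{String, Int64} <: Dict{String, Union{Int64, Nothing}}    # false
@show Dict{String, Int64} <: Dict{String, <:Union{Int64, Nothing}}  # true

strict = Dict{String, Union{Int64, Nothing}}[]
push!(strict, x)         # x is converted, so a new Dict is stored
@show strict[1] === x    # false

loose = Dict{String, <:Union{Int64, Nothing}}[]
push!(loose, x)          # x already matches the element type, stored as-is
@show loose[1] === x     # true

anything = []            # Vector{Any}
push!(anything, x)
@show anything[1] === x  # true
```

Only the first array ends up with a separate object, which is why later changes to `x` are not visible through it.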


A tangential point: debug prints like the println calls in the original post can be easier with the @show macro, which saves you the trouble of writing the expression again inside a string literal.
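For instance, a small sketch with made-up values comparing the two styles:

```julia
a = [Dict("wet" => 1)]
x = Dict("wet" => 1, "dry" => 2)

# Manual version: the expression appears once in the string and once as code.
println("x == a[1] is now " * string(x == a[1]))

# @show prints the expression text together with its value.
@show x == a[1]    # prints: x == a[1] = false
```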

Thanks for all these points! Fewer explicit types seem to be best for me.

I’m trying now to take a general lesson from this example, to help write correct programs in the future. I know Julia uses pass by sharing, but something else is evidently involved here. Is there a general rule about when Julia makes a copy of a mutable value instead of passing a reference, or issuing an error message? (Or likely a section of the documentation I should read to understand this particular issue!)
I understand that, as was stated, the copying takes place to match the type. But I suppose the passed type has to be close in some sense to the correct one or Julia would warn me of my mistake.
Thanks again.

No. For immutables, what happens under the hood is actually unspecified as long as the user can’t tell the difference. This gives the compiler room for optimizations; see e.g. a recent discussion.

I think that the general rule in Base and the standard library is that things are never copied implicitly, but converted when necessary. See

https://docs.julialang.org/en/v1/manual/conversion-and-promotion/#When-is-convert-called?-1
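As a rough illustration of that rule (a sketch with made-up variable names, not an exhaustive list of where convert is called): when the value already has the requested type, convert hands back the same object, and otherwise it builds a new one. Assigning into a typed array slot goes through the same mechanism.

```julia
d = Dict("wet" => 1)    # Dict{String, Int64}

# Already the requested type: convert returns the same object.
same = convert(Dict{String, Int64}, d)
@show same === d        # true

# A different type: convert has to build a new Dict.
widened = convert(Dict{String, Union{Int64, Nothing}}, d)
@show widened === d     # false
@show widened == d      # true, the contents are equal at this point

# The same conversion happens implicitly when assigning into a typed location.
v = Vector{Dict{String, Union{Int64, Nothing}}}(undef, 1)
v[1] = d                # setindex! converts d to the element type
@show v[1] === d        # false
```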

That is very helpful. I read some of the linked manual section, especially “conversion versus construction”. For me the key point was not whether the operation is called “copying”, but whether a new object is created, as seems to be the case in my example. In the manual it says that since convert can be called implicitly, its methods are restricted to cases that are considered “safe” or “unsurprising”. I have to admit that I was surprised to find that updating the mutable object I pushed made no difference to the object I had appended to the array. However, I am sure there are good reasons for this behavior. Anyway, I will keep the link you provided handy and try to be careful (being careful is not easy for me). Thanks again.

Implicit conversion of mutable types is a bit dangerous since it amounts to implicit copying. This can’t be changed now, but it is worth keeping in mind for Julia 2.0. Then again, people love the convenience of implicit conversion based on typed locations, especially since Julia is very careful about conversions that don’t preserve value.
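If an error is preferable to a silent copy, one option is a small wrapper that refuses to convert. This is only a sketch: `push_shared!` is a hypothetical helper, not something that exists in Base.

```julia
# Hypothetical helper (not part of Base): push without implicit conversion.
function push_shared!(a::Vector{T}, item) where {T}
    item isa T || throw(ArgumentError(
        "item of type $(typeof(item)) would be converted to $T, i.e. copied"))
    return push!(a, item)
end

a = Dict{String, Union{Int64, Nothing}}[]

z = Dict{String, Union{Int64, Nothing}}("hot" => 4)
push_shared!(a, z)       # fine: z already has the element type, so a[1] === z

x = Dict("wet" => 1)     # Dict{String, Int64}
# push_shared!(a, x)     # would throw an ArgumentError instead of copying
```

The other simple approach is to construct the dictionaries with the array's element type from the start, as `z` is here, so that no conversion is ever needed.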