What is the difference between copy() and deepcopy()?

julia> a = [1,2,3,4];
julia> b = a;
julia> a[1] = 4;
julia> b
4-element Array{Int64,1}:
 4
 2
 3
 4

As far as I know, this is a shallow copy.

But at a glance, I couldn’t find clear differences between copy() and deepcopy(), although the docs say copy() creates a shallow copy of a collection.

None of the operations above are copies; they are assignments. To understand the difference between copy and deepcopy, consider

A1 = [[1]]
A2 = copy(A1)
A1[1] === A2[1]                 # true
A3 = deepcopy(A1)
A1[1] === A3[1]                 # false
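
To make the consequence of that sharing visible, here is a small sketch that mutates the inner array after copying. The variable names follow the example above; nothing else is assumed.

```julia
A1 = [[1]]
A2 = copy(A1)       # new outer array, same inner array
A3 = deepcopy(A1)   # new outer array and new inner array

push!(A1[1], 2)     # mutate the shared inner array in place

A2[1] == [1, 2]     # true: A2 sees the change through the shared inner array
A3[1] == [1]        # true: A3 has its own inner array

A1[1] = [99]        # replace a slot of the outer array itself
A2[1] == [1, 2]     # true: the outer arrays are independent
```

So copy protects you from changes to the outer container, but not from in-place changes to whatever the container holds.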

These are related to copying mutable objects. I assume you know the difference between copying by value and copying by reference. Let ‘a’ be a mutable object of some type.

b = a copies ‘a’ by reference, so ‘b’ and ‘a’ refer to the same object. Therefore b.field1 = 2 makes a.field1 == 2 true.

b = copy(a) makes a new object of the same type as the object that ‘a’ refers to, and makes ‘b’ refer to it. So there are now 2 different objects, one for ‘a’ and one for ‘b’. Each field of the new object is then filled in as if b.field = a.field were called. Therefore, a field of immutable type is copied by value, and a field of mutable type is copied by reference. Notice that this is a shallow copy: b.field1.field11 = 2 still makes a.field1.field11 == 2 true, where field1 is a mutable object of some type.

b = deepcopy(a) keeps unwrapping any mutables inside of ‘a’ until it reaches all the immutables at all the levels, and copies all the data and structure of the old object to a new object. So ‘b’ and ‘a’ become completely independent, and changing one at any level does not change the other. For example b.field1.field11.field111 = 2 does not change a.field1.field11.field111. This however comes at the cost of using more memory when dealing with deep structures, e.g. a root node in a deep tree.
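
The three cases above can be sketched with a small nested struct. Inner and Outer are hypothetical types made up for this illustration. Note one assumption: Base does not define copy for arbitrary user structs, so the Base.copy method below is written by hand to do the field-by-field shallow copy described above. deepcopy needs no such definition.

```julia
mutable struct Inner
    value::Int
end

mutable struct Outer
    count::Int
    inner::Inner
end

# Assumption: hand-written shallow copy, one field at a time.
Base.copy(o::Outer) = Outer(o.count, o.inner)

a = Outer(1, Inner(10))

b = a                  # assignment: b and a are the same object
b.count = 2
a.count == 2           # true

c = copy(a)            # new Outer, but c.inner is the same Inner as a.inner
c.count = 3
a.count == 2           # true: top-level fields are independent
c.inner.value = 20
a.inner.value == 20    # true: the nested mutable is shared

d = deepcopy(a)        # new Outer and new Inner
d.inner.value = 30
a.inner.value == 20    # true: nothing is shared
```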

Everything clear! Thank you.

Then, when would you use b = a instead of b = copy(a) if the object has mutable fields?
And when would you use b = copy(a) instead of b = deepcopy(a) if it has mutable fields?
Is the copy command even needed?

I think I will just be using arrays, tuples, and dictionaries, not structs or other complex things. I guess copy is enough then.

So I cannot understand why, in the case of String, it copies by reference:

a = ["hi"]
b = copy(a)
pointer(a) == pointer(b)        # false, of course
pointer(a[1]) == pointer(b[1])  # true!

So it copies the string by reference even though Strings are immutable in Julia.
Can you reconcile this with your explanation?

I guess the point may not be whether it is mutable. It might be whether it is a plain data type, meaning it is immutable and contains no references to other values. Right?

This is not related to whether strings are mutable or immutable, but to their size in bits: since strings have variable size, they cannot be stored inline in arrays.

Either way, comparing strings by their pointers may or may not give you the result you’re looking for; we’re not in C. Egality checking is best done with ===.
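
A short sketch of that point, using isbitstype to check whether a type is plain bits (and so can be stored inline) and === instead of pointer comparison. The arrays are the ones from the question.

```julia
isbitstype(Int)      # true: Int values are stored inline in a Vector{Int}
isbitstype(String)   # false: a Vector{String} stores references

a = ["hi"]
b = copy(a)

a === b              # false: two different arrays
a[1] === b[1]        # true: both slots hold the same string

b[1] = "bye"         # rebinding a slot in b does not touch a
a[1] == "hi"         # true
```

Because a String cannot be modified in place, sharing it between the two arrays has no observable effect on either of them.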

Why do you use

A1 = [[1]]

Why not

A1 = [1]
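
For comparison, here is a sketch of what the flat version shows. With Int elements there is nothing nested to share, so copy and deepcopy cannot be told apart here, which is presumably why the nested [[1]] was used.

```julia
A1 = [1]
A2 = copy(A1)
A3 = deepcopy(A1)

A1[1] = 5      # change the original

A2[1] == 1     # true: copy already gave an independent array
A3[1] == 1     # true: deepcopy behaves the same for flat Int data
```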