What is the difference between copy and deepcopy?

Dear all,

I would like to know the difference between copy and deepcopy.
Many thanks.

We should not encourage anybody to call deepcopy. It seems like the right thing if you’re just experimenting around with nested arrays and such, but it is almost never the right choice in real code, and using it is a sign that the design and data model needs to be thought through more. So really there is only copy, and the way to think about it is that it operates only on the object you pass it.

From Should `copy` be renamed `shallowcopy`? · Issue #42796 · JuliaLang/julia · GitHub

julia> x = rand(4);

julia> xa = [x];

julia> copy(xa)[1] === x
true

julia> deepcopy(xa)[1] === x
false

Thanks for your suggestions.

Do you mean in your example, we should use deepcopy instead of copy?

No, I only showed the difference.

I agree that copy should be preferred.


I think this example is more important to understand what copy does:

julia> struct example
       a::Vector{Int}
       end

julia> e=example([1,2,3])
example([1, 2, 3])

julia> e_copy=e  # this is like copy, but copy isn't defined for struct example
example([1, 2, 3])

julia> e_copy.a[1]=100
100

julia> e
example([100, 2, 3])

You see, the vector inside the struct is the same object in e and e_copy. Changing a through e_copy also changes a in e.


If we need to copy nested structures like a Plots.jl plot object in order to modify inner properties, I think we have to use deepcopy.

Example:

using Plots
p1 = plot(sin, title = "old title")
p2 = deepcopy(p1)      # copy would throw a MethodError
p2[1][:title] = "new title"
plot(p1, p2)

Maybe one can say this is true for structs, but there is no behavior or code that gets executed upon assignment, unlike some other languages (e.g. C++) where this would call the copy constructor.

To add to my example:

julia> xa === xa
true

julia> copy(xa) === xa
false

meaning

julia> xb = xa;

julia> xc = copy(xa);

julia> xa[1] = rand(10);

julia> xa[1] === xb[1]
true

julia> xa[1] === xc[1]
false

A practical example, showing the difference:

x = [[0,1], [2,3]]
y = copy(x)
z = deepcopy(x)
x[2] .= [-2,-1]
y[2]  # [-2,-1]
z[2]  # [2,3]

I wanted to show what copy does for nested types.
For the OP’s understanding, it’s not important that deepcopy is not recommended or that using deepcopy hints at bad design. The OP wants to understand the difference between copy and deepcopy. @rafael.guerra’s example shows the same thing without a struct. It’s about nested objects and recursive copy (deepcopy) versus shallow copy (copy).


I’ve never gotten why this functionality is preferred or useful.

copy is more like an alias.

deepcopy is actually a copy.

Why create a new variable that is just an alias? Just use the original.

When I make a copy (err, deepcopy) I want to leave the original alone and change the copy (err, deepcopy).


copy differs from alias:

julia> x = [1]
1-element Vector{Int64}:
 1

julia> y = copy(x)
1-element Vector{Int64}:
 1

julia> push!(x, 2)
2-element Vector{Int64}:
 1
 2

julia> x
2-element Vector{Int64}:
 1
 2

julia> y
1-element Vector{Int64}:
 1

But, if copy and deepcopy are equivalent, meaning there is no deep structure, do the two functions cost roughly the same? If so, I’d go with deepcopy.


To me copy is more like "unalias (aka, make a copy that specifically does not alias to the same memory), but only one level deep"

I also fully agree with the sentiment that if you find yourself frequently needing deepcopy, that probably suggests a suboptimal design / is a mild code smell (though I’m sure there are exceptions)

Obviously a Vector{Int} isn’t going to be representative of all the flat data structures out there, but deepcopy seems to require more allocations and more time.

julia> using BenchmarkTools  # provides @btime
julia> x = collect(1:20);
julia> @btime copy($x);
  32.502 ns (1 allocation: 224 bytes)
julia> @btime deepcopy($x);
  123.883 ns (3 allocations: 560 bytes)
julia> x = collect(1:20000);
julia> @btime copy($x);
  9.490 μs (2 allocations: 156.30 KiB)
julia> @btime deepcopy($x);
  10.579 μs (4 allocations: 156.62 KiB)

I think the choice of copy vs deepcopy really only matters for objects that reference other objects; that’s where there can be two notions of a copy. copy is the more straightforward one: it makes a duplicate of an instance with exactly the same content. The catch is that the content isn’t just data; it also includes references to other objects, so the duplicate shares the referenced data with the original. deepcopy makes a duplicate whose references point to recursively duplicated objects, so while all the references are different, the duplicate represents the same data as the original yet is fully independent of it.
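
A small sketch of those two notions, in case it helps. The variable names are made up for illustration. It also shows one detail worth knowing: when the same object is referenced twice inside the original, deepcopy keeps that sharing inside the copy.

```julia
inner = [1, 2, 3]
x = [inner, inner]     # both slots reference the same vector

s = copy(x)            # new outer vector, same inner objects
d = deepcopy(x)        # new outer vector, new inner objects

s[1] === inner         # true: the shallow copy shares inner data with x
d[1] === inner         # false: the deep copy is independent of x
d[1] === d[2]          # true: sharing inside the copy mirrors the original

inner[1] = 100
s[1][1]                # 100, the change is visible through the shallow copy
d[1][1]                # 1, the deep copy is unaffected
```

So the deep copy is not just "copy everything separately": it reproduces the same object graph, only detached from the original.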


Notice something interesting about your test: deepcopy apparently copies more than copy.
The issue is not what gets copied, but that it is surprising to find that there is a difference. Hence,
it is a good defensive move to use deepcopy, unless I know for sure that copy is good enough for my purpose.


In particular, why are you copying? Presumably because you are going to modify something and don’t want that modification to affect the original. If you are going to make a modification, you know where you’re going to make that modification and only need to copy the thing that you’re going to modify. This is often at the outermost level, so copy is all you need. Sometimes it’s a level or two deeper, in which case deepcopy is overkill but might be more convenient than arranging copies down to the right level.
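
As a rough illustration of "arranging copies down to the right level", assuming a nested vector where only the second inner vector will be modified (the names here are hypothetical):

```julia
data = [[0, 1], [2, 3], [4, 5]]

edited = copy(data)          # new outer vector; inner vectors are still shared
edited[2] = copy(data[2])    # unshare only the inner vector we will mutate
edited[2][1] = -2

data[2]                      # [2, 3], the original is untouched
edited[2]                    # [-2, 3]
edited[1] === data[1]        # true, the untouched parts are still shared
```

Only the outer vector and one inner vector get duplicated here, whereas deepcopy(data) would duplicate all three inner vectors.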


I agree. In machine learning I have always found that people run into bugs when copying models, for example. You almost never want copy there but deepcopy instead. As such, I would have preferred copy=deepcopy and shallowcopy=copy. I understand why it is not so, but in my line of work it would have simplified things. 🙂

This is not a Julia thing though; the same issues pop up in Python too. It also seems pretty uncontentious to use copy the way it’s currently implemented in Julia.


This is precisely why it’s nice (but also a little fraught) to allow the model or object itself to define what copy should do and what constitutes its “outer structure.” Perhaps copy(::DeepLearningModel) should indeed make a copy of all the arrays it holds that define its weights because those are the things you modify directly when calling train!(::DeepLearningModel)? That is, for example, exactly what AlphaZero does.
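
A minimal sketch of that idea, assuming a made-up TinyModel type and a stand-in train! function (neither comes from a real package, and this is not taken from AlphaZero's code):

```julia
# Hypothetical model type, for illustration only.
struct TinyModel
    weights::Vector{Float64}
    name::String
end

# Opt in to `copy`: duplicate the weights, since those are what training mutates.
Base.copy(m::TinyModel) = TinyModel(copy(m.weights), m.name)

# Hypothetical in-place update standing in for a real training step.
function train!(m::TinyModel)
    m.weights .+= 1.0
    return m
end

m1 = TinyModel([0.0, 0.0], "demo")
m2 = train!(copy(m1))

m1.weights    # [0.0, 0.0], the original model is unchanged
m2.weights    # [1.0, 1.0]
```

Here the type's author decides what the "outer structure" is, so callers can use plain copy without knowing how deep the mutable state sits.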
