Deepcopying struct with array fields

I have a struct whose fields include multidimensional Arrays, and I want to be able to copy (both deep and shallow) between instances of this type. Comments on an answer from SO indicate that this can be done using

Base.copy(x::T) where T = T([getfield(x, k) for k ∈ fieldnames(T)]...)
Base.deepcopy(x::T) where T = T([deepcopy(getfield(x, k)) for k ∈ fieldnames(T)]...)

While this copy works as expected, the deepcopy does not appear to work for arrays, e.g.

mutable struct MyStruct1
    field1::Array{Float64,1}
end

mutable struct MyStruct2
    field1::Array{Float64,1}
    field2::Array{Float64,2}
end

Base.copy(x::T) where T = T([getfield(x, k) for k ∈ fieldnames(T)]...)
Base.deepcopy(x::T) where T = T([deepcopy(getfield(x, k)) for k ∈ fieldnames(T)]...)

ms1a = MyStruct1(zeros(1))
# MyStruct1([0.0])

ms1b = copy(ms1a)
# MyStruct1([0.0])

ms1c = deepcopy(ms1a)
# MyStruct1(Float64[])

ms2a = MyStruct2(zeros(1), zeros(1,1))
# MyStruct2([0.0], [0.0])

ms2b = copy(ms2a)
# MyStruct2([0.0], [0.0])

ms2c = deepcopy(ms2a)
ERROR: MethodError: no method matching Array{Float64,2}()
Closest candidates are:
  Array{Float64,2}(::UndefInitializer, ::Int64, ::Int64) where T at boot.jl:404
  Array{Float64,2}(::UndefInitializer, ::Int64...) where {T, N} at boot.jl:408
  Array{Float64,2}(::UndefInitializer, ::Tuple{Int64,Int64}) where T at boot.jl:412
  ...
Stacktrace:
 [1] deepcopy(::Array{Float64,2}) at ./REPL[4]:1
 [2] (::getfield(Main, Symbol("##5#6")){MyStruct2})(::Symbol) at ./none:0
 [3] collect_to!(::Array{Array{Float64,1},1}, ::Base.Generator{Tuple{Symbol,Symbol},getfield(Main, Symbol("##5#6")){MyStruct2}}, ::Int64, ::Int64) at ./generator.jl:47
 [4] collect_to_with_first!(::Array{Array{Float64,1},1}, ::Array{Float64,1}, ::Base.Generator{Tuple{Symbol,Symbol},getfield(Main, Symbol("##5#6")){MyStruct2}}, ::Int64) at ./array.jl:630
 [5] collect(::Base.Generator{Tuple{Symbol,Symbol},getfield(Main, Symbol("##5#6")){MyStruct2}}) at ./array.jl:611
 [6] deepcopy(::MyStruct2) at ./REPL[4]:1
 [7] top-level scope at none:0

I assume that the first deepcopy shown here returning MyStruct1(Float64[]) is indicative of an underlying issue, but why does the 1D case not throw an error, and how do I rectify this?

Edit: Might it also be possible to have both as “member functions” of the struct, as discussed in discourse here?

The code for deepcopy works for me.

I think you are running into trouble overloading Base.deepcopy. If you try the same code with a different name for the deepcopy function, it works.

I can see a problem because

(a) you perhaps did not import Base.deepcopy
(b) you probably don’t want to change the definition of Base.deepcopy for all objects (which is what your signature implies).

Regarding (a), the problem persists even if explicitly importing Base.deepcopy:

julia> mutable struct MyStruct2
           field1::Array{Float64,1}
           field2::Array{Float64,2}
       end

julia> import Base.copy, Base.deepcopy

julia> Base.copy(x::T) where T = T([getfield(x, k) for k ∈ fieldnames(T)]...)

julia> Base.deepcopy(x::T) where T = T([deepcopy(getfield(x, k)) for k ∈ fieldnames(T)]...)

julia> a = MyStruct2(zeros(1), zeros(1,1))
MyStruct2([0.0], [0.0])

julia> b = copy(a)
MyStruct2([0.0], [0.0])

julia> c = deepcopy(a)
ERROR: MethodError: no method matching Array{Float64,2}()
Closest candidates are:
  Array{Float64,2}(::UndefInitializer, ::Int64, ::Int64) where T at boot.jl:404
  Array{Float64,2}(::UndefInitializer, ::Int64...) where {T, N} at boot.jl:408
  Array{Float64,2}(::UndefInitializer, ::Tuple{Int64,Int64}) where T at boot.jl:412
  ...
Stacktrace:
 [1] deepcopy(::Array{Float64,2}) at ./REPL[4]:1
 [2] (::getfield(Main, Symbol("##5#6")){MyStruct2})(::Symbol) at ./none:0
 [3] collect_to!(::Array{Array{Float64,1},1}, ::Base.Generator{Tuple{Symbol,Symbol},getfield(Main, Symbol("##5#6")){MyStruct2}}, ::Int64, ::Int64) at ./generator.jl:47
 [4] collect_to_with_first!(::Array{Array{Float64,1},1}, ::Array{Float64,1}, ::Base.Generator{Tuple{Symbol,Symbol},getfield(Main, Symbol("##5#6")){MyStruct2}}, ::Int64) at ./array.jl:630
 [5] collect(::Base.Generator{Tuple{Symbol,Symbol},getfield(Main, Symbol("##5#6")){MyStruct2}}) at ./array.jl:611
 [6] deepcopy(::MyStruct2) at ./REPL[4]:1
 [7] top-level scope at none:0

However, (b) appears to be along the right lines:

julia> mutable struct MyStruct2
           field1::Array{Float64,1}
           field2::Array{Float64,2}
       end

julia> deep(x::T) where T = T([deepcopy(getfield(x, k)) for k ∈ fieldnames(T)]...)
deep (generic function with 1 method)

julia> a = MyStruct2(zeros(1), zeros(1,1))
MyStruct2([0.0], [0.0])

julia> c = deep(a)
MyStruct2([0.0], [0.0])

How would I go about embedding these copy functions into the struct or Base such that it can be called elsewhere, akin to the copy function in Pkg (linked above)?

julia> Base.copy(x::T) where T = T([getfield(x, k) for k ∈ fieldnames(T)]...)

Your code still has the issue that @hendri54 mentioned, which is that you are changing the default copy and deepcopy behavior for every type, which is not a good plan.

There is no need for member functions here: you just need to implement copy and deepcopy for your type, rather than for every other type.
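A likely explanation for the two different failures in the original post, sketched under the assumption that fieldnames returns an empty tuple for array types (as it did in the Julia 1.x release the stack trace comes from; newer releases may differ): the catch-all method also intercepts deepcopy on each array field, so it reduces to calling the array type with no arguments. The snippet below calls those constructors directly rather than redefining anything in Base.

```julia
# For a type with no visible fields, T([... for k in fieldnames(T)]...) is just T().

# A zero-argument method exists for 1-D arrays and returns an empty vector,
# which would explain the silent MyStruct1(Float64[]) result.
v = Array{Float64,1}()
println("1-D call gives an empty vector: ", isempty(v))

# No zero-argument method exists for 2-D arrays, which would explain the MethodError.
matrix_call_failed = try
    Array{Float64,2}()
    false
catch err
    err isa MethodError
end
println("2-D call throws MethodError: ", matrix_call_failed)
```

So the 1D case does not raise an error only because an empty Vector can be constructed from no arguments; the data is still lost.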

Try this in a new Julia session:

julia> mutable struct MyStruct2
           field1::Array{Float64,1}
           field2::Array{Float64,2}
       end

julia> Base.copy(a::MyStruct2) = MyStruct2(copy(a.field1), copy(a.field2))

julia> a = MyStruct2(zeros(1), zeros(1, 1))
MyStruct2([0.0], [0.0])

julia> copy(a)
MyStruct2([0.0], [0.0])

julia> deepcopy(a)
MyStruct2([0.0], [0.0])

Yep, and for deepcopy, you should be overloading deepcopy_internal (perhaps unfortunately named) as mentioned in the documentation, though it probably isn’t necessary: the default implementation should work just fine even for these user-defined types.
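For reference, a minimal sketch of what such an overload could look like, following the deepcopy_internal(x::T, dict::IdDict) signature described in the deepcopy docstring. The type Samples and its fields are hypothetical, and for a plain struct of arrays like this the default already does the same job.

```julia
# Hypothetical type, used only for illustration.
mutable struct Samples
    values::Vector{Float64}
    weights::Matrix{Float64}
end

function Base.deepcopy_internal(x::Samples, dict::IdDict)
    # If this object was already copied during the current deepcopy call, reuse that copy.
    haskey(dict, x) && return dict[x]::Samples
    # Recurse with deepcopy_internal (not deepcopy) so the same dict is shared.
    y = Samples(Base.deepcopy_internal(x.values, dict),
                Base.deepcopy_internal(x.weights, dict))
    # Record the new object before returning.
    dict[x] = y
    return y
end

s = Samples([1.0, 2.0], [1.0 2.0; 3.0 4.0])
t = deepcopy(s)      # dispatches to the method above
t.values[1] = 10.0   # does not affect s.values
```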


@rdeits, your code works for this case, however I have ~10 array fields of varying dimensionality, so it would need generalising. Combining your code with the original Base.copy that I was using gives the following:

Base.copy(x::MyStruct2) = MyStruct2([copy(getfield(x, k)) for k ∈ fieldnames(MyStruct2)]...)

This works for the example currently being examined, can anyone comment on whether this looks enough to be equivalent to a “normal” deepcopy?
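One way to check this is to compare the two on a field that nests arrays. The sketch below uses a hypothetical type Fields3 and builds the copy with ntuple instead of an array comprehension; it suggests that copying each field matches deepcopy for arrays of plain numbers, but not once a field contains arrays of arrays.

```julia
# Hypothetical type with one nested field, used only for illustration.
mutable struct Fields3
    a::Vector{Float64}
    b::Matrix{Float64}
    c::Vector{Vector{Float64}}
end

# Copy every field one level deep, restricted to this type only.
Base.copy(x::Fields3) =
    Fields3(ntuple(i -> copy(getfield(x, i)), fieldcount(Fields3))...)

orig = Fields3([1.0], [1.0 2.0], [[1.0]])
shallow = copy(orig)
deep = deepcopy(orig)

shallow.a[1] = 9.0      # orig.a is untouched: the outer array was copied
shallow.c[1][1] = 9.0   # orig.c[1] changes too: the inner vector is shared

println(orig.a[1])      # 1.0
println(orig.c[1][1])   # 9.0
println(deep.c[1][1])   # 1.0
```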

@tkoolen, I will bear deepcopy_internal in mind for future but I suspect you are right that it is a little overkill for this problem :smile:

Why are you extending deepcopy at all? I.e. what is wrong with the behavior of the fallback / Base deepcopy?

Extending deepcopy is only really necessary when your structures contain unmanaged Ptr fields (they wrap a C library and you want to call its copy/init/free code), or if you must respect aliasing (two fields are different arrays that share a buffer, via e.g. reshape), or if you want to rearrange memory layout (e.g. compactify an array of strings that are splattered all over the heap).
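The simpler aliasing case, where two fields refer to the very same array object, is already handled by the default deepcopy, whereas a hand-written field-by-field copy loses it. A small sketch with a hypothetical type TwoViews:

```julia
# Hypothetical type, used only for illustration.
mutable struct TwoViews
    x::Vector{Float64}
    y::Vector{Float64}
end

shared = [1.0, 2.0]
p = TwoViews(shared, shared)        # both fields are the same object

d = deepcopy(p)
println(d.x === d.y)                # true: identity between fields is preserved
println(d.x === p.x)                # false: the copy is independent of the original

m = TwoViews(copy(p.x), copy(p.y))  # manual per-field copy
println(m.x === m.y)                # false: the fields no longer alias each other
```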


It started out as having issues with copy on a struct with an inner constructor, i.e.

julia> mutable struct MyStruct
           field1::Array{Float64,1}
           field2::Array{Float64,2}
           MyStruct() = new()
       end

julia> a = MyStruct()
MyStruct(#undef, #undef)

julia> b = copy(a)
ERROR: MethodError: no method matching copy(::MyStruct)
Closest candidates are:
  copy(::Expr) at expr.jl:36
  copy(::Core.CodeInfo) at expr.jl:64
  copy(::BitSet) at bitset.jl:46
  ...
Stacktrace:
 [1] top-level scope at none:0

I then began searching for the solution to this, came across the SO post linked in OP and went down the rabbit hole from there.

I think the basic point is that Base.copy() is no more special than any other function: if there is no constructor for MyStruct that takes field1 and field2, there is no way for a Base.copy() method to construct such a copy for you. If you want it to be possible to create a MyStruct from the fields provided, you need to make sure such a constructor exists.

That’s kind of what I thought, but then…

julia> mutable struct MyStruct
          field1::Array{Float64,1}
          field2::Array{Float64,2}
          function MyStruct()
              return new()
          end
       end

julia> function MyStruct(x,y)
           m = MyStruct()
           m.field1 = x
           m.field2 = y
           return m
       end
MyStruct

julia> a = MyStruct(rand(3), rand(4,3))
MyStruct([0.451598, 0.454591, 0.0402859], [0.580092 0.30029 0.118246; 0.260241 0.170924 0.386672; 0.78815 0.267166 0.259082; 0.826186 0.407375 0.614876])

julia> b = deepcopy(a)
MyStruct([0.451598, 0.454591, 0.0402859], [0.580092 0.30029 0.118246; 0.260241 0.170924 0.386672; 0.78815 0.267166 0.259082; 0.826186 0.407375 0.614876])

julia> b = copy(a)
ERROR: MethodError: no method matching copy(::MyStruct)
Closest candidates are:
  copy(::Expr) at expr.jl:36
  copy(::Core.CodeInfo) at expr.jl:64
  copy(::BitSet) at bitset.jl:46
  ...
Stacktrace:
 [1] top-level scope at none:0

The funny thing is deepcopy works, but copy does not.

I looked at the source code for copy and it just says copy, which I don’t understand at all.

If I understand you correctly, then the following should resolve the issue:

mutable struct MyStruct
    field1::Array{Float64,1}
    field2::Array{Float64,2}
    MyStruct() = new()
    MyStruct(f1, f2) = new(f1, f2)
end

However, this does not appear to solve anything:

julia> a = MyStruct(zeros(1), zeros(1,1))
MyStruct([0.0], [0.0])

julia> b = copy(a)
ERROR: MethodError: no method matching copy(::MyStruct)
Closest candidates are:
  copy(::Expr) at expr.jl:36
  copy(::Core.CodeInfo) at expr.jl:64
  copy(::BitSet) at bitset.jl:46
  ...
Stacktrace:
 [1] top-level scope at none:0

Have I just misunderstood you?

You still need to implement the Base.copy method so that it will call your new constructor.
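Putting the two pieces together, a minimal sketch might look like the following. It assumes both fields are defined on the instance being copied; calling copy on the result of MyStruct() would throw an UndefRefError when the unset fields are read.

```julia
mutable struct MyStruct
    field1::Array{Float64,1}
    field2::Array{Float64,2}
    MyStruct() = new()                # leaves both fields undefined
    MyStruct(f1, f2) = new(f1, f2)    # constructor that takes the fields
end

# copy is defined for this type only and calls the two-argument constructor.
Base.copy(m::MyStruct) = MyStruct(copy(m.field1), copy(m.field2))

a = MyStruct(zeros(1), zeros(1, 1))
b = copy(a)
b.field1[1] = 1.0    # a.field1 is unaffected
println(a.field1)    # [0.0]
```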

That was essentially my point.

copy does not seem to have a fallback method for unknown types the way deepcopy does. So you cannot copy tuples, strings, user defined types, …
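The difference can be seen without defining any methods. In this sketch the type OnlyEmpty is hypothetical and has no constructor that accepts field values; deepcopy still works because its fallback does not go through a constructor.

```julia
# Hypothetical type whose only constructor leaves the field undefined.
mutable struct OnlyEmpty
    data::Vector{Float64}
    OnlyEmpty() = new()
end

x = OnlyEmpty()
println(applicable(copy, x))   # false: no copy method matches this type

y = deepcopy(x)                # works even though the field is undefined
println(isdefined(y, :data))   # false: the field stays undefined in the copy

x.data = [1.0, 2.0]
z = deepcopy(x)
println(z.data == x.data)      # true: same contents
println(z.data === x.data)     # false: a separate array
```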

This seems to contradict the documentation which lists copy(x) as the signature. A bug?


Yes, it’s intended, and no, it’s not really a bug.

The absence of the fallback method is intentional, since blindly copying something is bad and probably won’t do what you think it’ll do. This is demonstrated in your statement that “you cannot copy tuples, strings, user defined types, …”, since the fallback copying for “tuples” and “strings” is a no-op. In general, the definition of a shallow copy is a much more application-defined concept, unlike deepcopy, which is defined as equivalent to serializing and then deserializing (with some complication on pointers).

The documentation says copy(x) only to show the generic intent of the function, which I believe is common practice in the docs. It’s definitely not practical to mention all the signatures that are provided in Base, since there are way too many to be helpful. You are certainly welcome to improve the docs though, for example by mentioning that some types cannot be copied, maybe with examples, and saying that user-defined types won’t be copyable by default.

Thanks for that clarification.

I think it would be helpful to augment the docs (perhaps I can figure out how to do this).