Avoiding deepcopy() with functions using union of types as input

Hi,

I have two custom types that I pass to my functions as m::Union{T1,T2}. T1 and T2 overlap in some of their field names and in the number of fields. I want my functions to return a new instance of either T1 or T2; I don’t want mutating functions here. How can I avoid deepcopy() in these cases? If possible, I would avoid multiple dispatch here, i.e. just one function for both T1 and T2.

It’s a bit unclear to me why you’re using deepcopy here – just to get a new instance of T* or because you want to keep some of its fields around? If it’s just about figuring out the type, you can use f(x::T) where T<:Union{T1,T2} = T(...). If the latter, then maybe add a constructor like T1(t::T1; a=nothing, b=nothing) = T1(something(a, t.a), something(b, t.b)) or just use Accessors.jl.
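A minimal sketch of both suggestions, assuming made-up definitions of T1 and T2 (the field names a, b, c and the function name bump are hypothetical and only there for illustration):

```julia
# Hypothetical stand-ins for the two types in the question.
struct T1
    a::Int
    b::Float64
end

struct T2
    a::Int
    b::Float64
    c::String
end

# Copy-constructors: any keyword left as `nothing` falls back to the old value.
T1(t::T1; a=nothing, b=nothing) = T1(something(a, t.a), something(b, t.b))
T2(t::T2; a=nothing, b=nothing, c=nothing) =
    T2(something(a, t.a), something(b, t.b), something(c, t.c))

# One function body for both types; T is bound to the concrete type of x,
# so T(...) builds a value of the same type that was passed in.
bump(x::T) where {T<:Union{T1,T2}} = T(x; a = x.a + 1)

t1 = T1(1, 2.0)
t2 = T2(1, 2.0, "x")
u1 = bump(t1)   # a new T1; t1 itself is untouched
u2 = bump(t2)   # a new T2 that keeps b and c from t2
```

One caveat of the something(...) trick: a field can never be reset to nothing this way, because nothing is used as the "keep the old value" marker.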

If possible, I would avoid multiple dispatch here, i.e. just one function for both T1 and T2.

Worth noting that dispatch is going to happen regardless; you just save typing a similar function body twice.

It’s a little tough to understand exactly what you’re asking. Unless the above reply solved your question, you might try giving a short self-contained example of what you’re trying to do and what you want to happen.

Okay, it didn’t occur to me that Julia allows multiple dispatch even on constructors… wow. Thanks!

Okay, maybe I still need some more help here. So my functions tend to look like this:

function dosomething(m::Union{T1,T2}, param::Int)
    p1 = dostuff(m.param1)
    p2 = domorestuff(m.param2)
    res = deepcopy(m)
    res.param1 = p1
    res.param2 = p2
    return res
end

I would make a function with methods like

updateparam1(m::T1, newvalue) = T1(newvalue, m.param2)
updateparam1(m::T2, newvalue) = T2(newvalue, m.param2, m.param3) # maybe T2 has more params

If you pass a T1, it will produce a new T1 with the new value while using the existing values for the other parameters (you can add copy to those if you want). Likewise for T2. If you wanted, you could alternatively define them to mutate the input instead (if you copied it earlier or wanted to mutate the object you passed).

You can define functions that update multiple parameters at the same time in the same way. So you might replace the deepcopy and param update lines above with something like res = updatep1p2(m, p1, p2) where dispatch determines which version of updatep1p2 is called.
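Put together, the rewritten function might look roughly like this. The struct layouts and the bodies of dostuff and domorestuff are placeholders invented for the sketch; only the names come from the posts above:

```julia
# Hypothetical stand-ins; T2 carries one extra field.
struct T1
    param1::Vector{Float64}
    param2::Float64
end

struct T2
    param1::Vector{Float64}
    param2::Float64
    param3::String
end

# One small method per type: each knows how to rebuild its own type.
updatep1p2(m::T1, p1, p2) = T1(p1, p2)
updatep1p2(m::T2, p1, p2) = T2(p1, p2, m.param3)

# Placeholder computations (assumptions, not from the thread).
dostuff(x) = 2 .* x          # allocates a new vector, so nothing is shared with m
domorestuff(x) = x + 1

# A single body for both types; `param` is unused here and only kept so the
# signature matches the original function.
function dosomething(m::Union{T1,T2}, param::Int)
    p1 = dostuff(m.param1)
    p2 = domorestuff(m.param2)
    return updatep1p2(m, p1, p2)   # dispatch picks the T1 or T2 method
end

m = T2([1.0, 2.0], 3.0, "keep")
r = dosomething(m, 0)
```

Fields that are passed through unchanged (param3 here) are shared between the old and the new object rather than copied, which is usually fine for immutable values; wrap them in copy if the two objects must not alias each other.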

I see, thanks.

Now what if my type had a lot of parameters?

Or if at some point in the development I decide that one of these two types should have other parameters? Would I have to change all the functions I use?

You can use kwargs and look for the keys you want.
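One way to read this suggestion is a single generic updater that rebuilds the object field by field and takes a replacement from the keyword arguments whenever a key matches. The function name update and the struct layouts below are assumptions for illustration, and the sketch relies on the default positional constructor, so a type with a custom inner constructor would need adjusting:

```julia
# Hypothetical stand-ins for the two types.
struct T1
    param1::Float64
    param2::Float64
end

struct T2
    param1::Float64
    param2::Float64
    param3::String
end

# Walk over the fields of T in declaration order. For each field, use the
# keyword value if one was given, otherwise keep the value stored in `m`.
function update(m::T; kwargs...) where {T<:Union{T1,T2}}
    vals = map(fieldnames(T)) do name
        get(kwargs, name, getfield(m, name))
    end
    return T(vals...)
end

a = update(T1(1.0, 2.0); param2 = 5.0)
b = update(T2(1.0, 2.0, "s"); param1 = 9.0, param3 = "t")
```

Because update never names the fields itself, adding or removing a field in T1 or T2 does not require touching it. Keys that match no field are silently ignored in this sketch; a stricter version could throw an error instead.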

Your overall problem seems very similar to a bunch of problems I had to solve.

I did it by defining what I called factory functions. Here is an example using the computation of some summary statistics.

The factory gets called here:

Then each concrete subtype of AbstractPriorEstimator defines its own factory function that returns the estimator with the new observation weights.

Eventually the chain goes all the way down the hierarchy, at which point each estimator calls the factories of its constituents.

These in turn call the factories of their own constituent pieces until the chain reaches the most basic component, like this

I used this pattern everywhere, and it can have generic fallbacks for cases where the field to be modified doesn’t exist.

Even for vectors of structures, where the factory is called for every entry in the vector.

For example these definitions

Those types could be individually provided or in a vector, and the factory function would still be invoked with the same function signature
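The code these posts refer to is not reproduced in the thread, so the following is only a rough sketch of the pattern as described. Apart from AbstractPriorEstimator, every name here (MeanEstimator, PriorEstimator, horizon, the factory signature) is hypothetical:

```julia
abstract type AbstractPriorEstimator end

# Hypothetical leaf component that stores observation weights.
struct MeanEstimator
    w::Union{Nothing,Vector{Float64}}
end

# Hypothetical composite: one weighted component plus a setting that has
# nothing to do with weights.
struct PriorEstimator <: AbstractPriorEstimator
    me::MeanEstimator
    horizon::Int
end

# Generic fallback: anything without weights is returned unchanged.
factory(x, w) = x

# Leaf: build a new estimator carrying the new weights.
factory(me::MeanEstimator, w) = MeanEstimator(w)

# Composite: rebuild itself from the factories of its parts.
factory(pe::PriorEstimator, w) =
    PriorEstimator(factory(pe.me, w), factory(pe.horizon, w))

# Vectors: same signature, applied to every entry.
factory(v::AbstractVector, w) = [factory(x, w) for x in v]

w = [0.25, 0.75]
pe = PriorEstimator(MeanEstimator(nothing), 10)
pe2 = factory(pe, w)                           # new estimator, pe is untouched
vs = factory([pe, MeanEstimator(nothing)], w)  # works element-wise too
```

The caller only ever writes factory(x, w); dispatch decides how deep the rebuild has to go, and the fallback keeps plain values such as horizon as they are.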

My code is riddled with these factories, but I’ve got into the habit of defining them for all the types whose functionality requires it. It’s the price to pay for immutability. But it greatly simplifies code validation logic, as I only have to validate at structure construction time, which guarantees my values remain valid unless I do something deliberate like expanding an array or broadcasting into one.

These are examples where I set the values manually, but it’s also possible to do so programmatically. Here I take the property whose bounds I want to remove, and make a named tuple with the other properties’ names and values to create an unbounded struct.

Thanks for this. Unfortunately I’m not very seasoned, so I don’t really understand the approach you used. But I noticed that using Setfield.jl (and res = @set...) gives a very modest performance increase and fewer allocations than deepcopy().

Accessors.jl is probably the one you want to use. It’s more accessible than Setfield.jl

I tried but I have a DataFrame in one of the fields and it didn’t like that.