I have a function for which one of the arguments may be either an array or a tuple, though its usage internally is tuple-like. If x is an array, then I will have the line y = copy(x), but if x is a tuple, copy will not work, so I need to simply say y = x.
The problem is: how can my code distinguish the two? I tried using typeof(), but that does not return a string, so I do not know how to interrogate the result of typeof(). (I looked at Types · The Julia Language but I cannot find anything there on how to process the output of typeof()).
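For reference, typeof() returns a type object rather than a string, and a value can be tested against a type directly. Below is a minimal sketch of two common ways to branch on the argument type; the names snapshot and snapshot_dispatch are hypothetical and used only for illustration.

```julia
# `snapshot` and `snapshot_dispatch` are hypothetical names for illustration.

# Option 1: test the value against a type with `isa`.
function snapshot(x)
    if x isa Tuple
        return x          # tuples are immutable, so no copy is needed
    else
        return copy(x)    # arrays are mutable, so take an independent copy
    end
end

# Option 2: let multiple dispatch pick the method from the argument type.
snapshot_dispatch(x::Tuple) = x
snapshot_dispatch(x::AbstractArray) = copy(x)

a = [1, 2, 3]
t = (1, 2, 3)

b = snapshot(a)
b[1] = 99                 # modifies the copy only, not `a`
```

The dispatch version is usually considered the more idiomatic of the two, since it avoids an explicit branch.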
EDIT: never mind, a “splat” seems to work on either: y = tuple(x...)
EDIT 2: So I will delete this in a while unless someone asks to keep it.
Why do you need to copy the array? Can’t you just write y = x in either case? I mean, if you plan to mutate y that’s maybe not a good idea, but you cannot mutate tuples, so I presume that you’re not going to do that anyway.
Good point, I had a blind spot. I knew the following code gives nasty side effects in the calling function:
function f(x, y, some_more_params)
    if some_condition
        x[1] = y[1]
    end
    do_something
end
But this (which is what I wanted to do) is actually safe. My “blind spot” is that I forgot it was safe, as I was telling myself “never modify an array argument inside a function”:
function f(x, y, some_more_params)
    if some_condition
        x = y
    end
    do_something
end
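The difference between the two versions can be checked with a small runnable sketch. The function names overwrite_first! and rebind are hypothetical stand-ins for the two variants of f above.

```julia
# Hypothetical function names, for illustration only.
function overwrite_first!(x, y)
    x[1] = y[1]   # mutates the array that the caller passed in
    return x
end

function rebind(x, y)
    x = y         # only changes what the local name `x` refers to
    return x
end

a = [1, 2, 3]
b = [10, 20, 30]

r = rebind(a, b)                  # `r` is the same object as `b`
unchanged = (a == [1, 2, 3])      # the caller's `a` was not touched

overwrite_first!(a, b)            # now the caller's `a` has changed
```

Assignment to an argument name rebinds the local variable, whereas indexed assignment writes into the object shared with the caller.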
Still, now that I have forced x to be a tuple, I feel I might as well leave it in my code, which now looks like this:
function f(x_initial, y, some_more_params)
    x = tuple(x_initial...) # a "splat" to force a tuple
    do_something
    if some_condition
        x = y
    end
    do_something
end
(EDIT: And in terms of readability, I think it is helpful to distinguish between the default/initial value of x, and the “current” value of x).
There’s nothing inherently wrong with modifying an array inside a function (almost everything should be happening inside functions anyway), as long as you are deliberate about it. It’s one of the most common performance techniques. Just remember to put a ! at the end of any function that modifies inputs, to make that clear to the caller.
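As a sketch of that convention, here is an in-place function paired with a non-mutating wrapper. The names normalize_sum! and normalize_sum are hypothetical, chosen only to illustrate the pattern.

```julia
# Hypothetical names, for illustration only.

# In-place version: the trailing `!` signals that `v` is modified.
function normalize_sum!(v::AbstractVector)
    s = sum(v)
    v ./= s        # broadcasted in-place division, reuses the memory of `v`
    return v
end

# Non-mutating wrapper: copy first, then reuse the in-place version.
normalize_sum(v::AbstractVector) = normalize_sum!(copy(v))

v = [1.0, 3.0]
w = normalize_sum(v)   # returns a new array and leaves `v` alone
normalize_sum!(v)      # modifies `v` itself
```

Callers who care about allocations can use the `!` version, while everyone else gets the safe wrapper.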
There are actually quite a number of issues/PRs related to copying immutables:
implement copy(::Void) = nothing (closed PR, contains an interesting discussion related to “alias-stability”)
If you have copy(f(x)), and f returning nothing is a problem, then failing in copy catches the problem sooner than having copy propagate the value. I think if some code is bothering to copy something, it’s expecting a mutable collection. (Jeff)
I’ve tried to avoid defining copy for immutable objects. How did this arise? My view is that code calling copy without knowing whether the argument is mutable can’t really be generic. (Jeff)
If you’re copying something, then it must be because you’re going to mutate it—otherwise why copy it? But if some of the values are immutable, then that’s going to fail anyway, so what kind of code needs this? (Stefan)
I guess it gets problematic if you e.g. put this into a library which is used by someone who finds and uses the type-pirated method and doesn’t know it’s actually not in Base. That might cause some headaches when, in another ensemble of packages, this particular method is magically missing although it’s a Base method dispatching on Base types.
I have to say that I disagree with the founders (i.e., Jeff and Stefan) on this one (which means I am probably wrong, XD). My reasoning is the following:
1. deepcopy is defined by default for both mutable and immutable types. What is the reasoning for having a distinct policy for copy and deepcopy? deepcopy can be just as misleading: my first PR to a Julia library made deepcopy raise an exception on JuMP.Model, because it did not copy a C pointer resource and so was not a true deepcopy.
2. It is just a more elegant way of doing isimmutable(x) || (x = copy(x)) or y = isimmutable(x) ? x : copy(x).
3. As pointed out in (2), it is just a matter of convenience. Not defining copy does little but make programmers scratch their heads. Sometimes they will be happy that some bug did not propagate; other times they will just lose time deciding whether it is better to add an if, define copy for some immutable type, or change the statement order. And if they wrongly assumed that copy would not throw on immutable types, they now need to check their code for it.
4. As copy is not defined by default for immutable types, but is also not guaranteed to be undefined for them, programmers cannot assume that arbitrary immutable types in the wild will or will not provide it.
5. Programmers who use type piracy to ‘inoffensively’ create copy methods for convenience can change the error that is raised in some situations, making it harder to recognize that it is the same problem others have had.
6. The current state is not completely coherent either: Int, Float64, and others have copy defined despite being immutable types (and will raise different errors when passed to a method that usually throws on copy for arbitrary immutable types).
Again, not the strongest case, but I feel it would be a bit more convenient if copy and deepcopy had the same policy. Probably not sufficient to merit the change now that they already work this way.
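For illustration, the conditional-copy idiom from the list can be wrapped in a small helper without defining any new copy methods on Base types. This is only a sketch: maybe_copy is a hypothetical name, and it assumes Julia 1.5 or later, where ismutable is available (older versions provide isimmutable instead).

```julia
# `maybe_copy` is a hypothetical helper, not part of Base.
# Assumes Julia 1.5+, where `ismutable` exists.
maybe_copy(x) = ismutable(x) ? copy(x) : x

a = [1, 2, 3]
t = (1, 2, 3)

a2 = maybe_copy(a)   # independent copy of the array
t2 = maybe_copy(t)   # the tuple itself; `copy` is never called on it
```

Because the helper has its own name, it avoids the type-piracy concern raised earlier in the thread.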
I wouldn’t recommend this for code where you care about performance, as it will be slow both for arrays and for longer tuples.
This code will perform badly for arrays because the output type depends on the number of elements of the array, so the compiler won’t generally be able to figure out the type of x. And it will perform badly for long tuples because splatting large numbers of arguments (more than 16, I think) will generate less efficient code. That’s due to an intentional trade-off in the compiler to avoid spending a long time generating code for functions with huge numbers of arguments.
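The first point can be seen directly by comparing types. The sketch below uses made-up example values and only shows why the result type cannot be known from the argument type alone.

```julia
# The element count of a Vector is not part of its type...
v2 = [1, 2]
v3 = [1, 2, 3]
same_vector_type = typeof(v2) == typeof(v3)   # both are Vector{Int}

# ...but the element count of a Tuple is part of its type.
t2 = tuple(v2...)
t3 = tuple(v3...)
same_tuple_type = typeof(t2) == typeof(t3)    # NTuple{2,Int} vs NTuple{3,Int}
```

Since both vectors have the same type but splat into tuples of different types, the compiler cannot infer a concrete type for tuple(x...) when x is a Vector.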
As you’ve already discovered, doing x = y seems to be totally fine for your use case, and it will perform better too. So I’d recommend just doing that.