How to avoid Any in list comprehensions with small unions

Christopher_Fisher · June 8, 2019, 8:55am

Hi all-

I have a situation where I want to generate one of two types of arrays from a list comprehension. In the simplest case, all of the elements are Float64. This is straightforward and always works. In the other case, the elements are Union{Float64, Array{Float64,1}}. In practice, however, Julia defaults to Any in the latter case rather than the small Union described above. Here is an MWE:

Case 1:

    julia> x = [.3,.4]
    2-element Array{Float64,1}:
     0.3
     0.4

Case 2:

    julia> x = [rand(2),.4]
    2-element Array{Any,1}:
     [0.929281, 0.277345]
     0.4

Of course, if I use the typed comprehension Union{Float64,Array{Float64,1}}[x for x in X], it forces the Union element type in both cases, which I want to avoid.

Is there a way to produce Array{Float64,1} in the first case and Array{Union{Float64, Array{Float64,1}},1} in the second case?

tkluck · June 8, 2019, 9:32am

I don’t think there is a way. This uses promote_typejoin internally. It has a special case for nothing and missing to give a small union, but for all other things, it will fall back to typejoin, which gets you either a concrete type or Any in all cases that I can think of.
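To make the behavior described above concrete, here is a small sketch that queries the type join directly. It assumes a Julia 1.x session; note that `Base.promote_typejoin` is an internal, unexported function, so its behavior is not guaranteed to stay the same across versions.

```julia
# Base.promote_typejoin is internal (not exported); treat this as an
# illustration of current behavior, not a stable API.
join_mixed   = Base.promote_typejoin(Float64, Vector{Float64})
join_missing = Base.promote_typejoin(Float64, Missing)
join_nothing = Base.promote_typejoin(Float64, Nothing)

println(join_mixed)    # Any: no common supertype narrower than Any
println(join_missing)  # a small Union of Float64 and Missing
println(join_nothing)  # a small Union of Float64 and Nothing

# The same widening shows up in the element type of a comprehension.
xs = [x for x in (rand(2), 0.4)]
println(eltype(xs))    # Any
```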

What benefit are you hoping to get from this? If it is performance, could you show a benchmark where this makes a difference? We may be able to advise you on how to reach your objective another way.

Christopher_Fisher · June 8, 2019, 9:50am

Thanks for your reply. I was afraid that this might be the case. I have not run a benchmark, but since it is a critical part of my code, I am following the general advice to avoid abstract containers. My understanding is that a small union can mitigate this issue to some degree, but it’s still slower than a container of a single type. This is why I am trying to handle both cases.

I’m not quite sure how I can incorporate promote_typejoin into my code. Do you have an example? As an alternative, perhaps I can wrap the list comprehension in a function and use dispatch to deal with the two cases. That should work well enough. It would be nice if Julia created a small union (e.g. 2 or 3 types) by default and Any[] could be used to override the default. I suppose this may have been considered already and rejected due to some issue that is not apparent to me.
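One way to wrap the comprehension in a function, as suggested above, might look like the sketch below. The helper name `narrow_collect` is hypothetical (it is not part of Julia or of anyone's code in this thread). It evaluates the elements once, builds the Union of the concrete types it actually saw, and then uses that as the element type of a typed comprehension.

```julia
# Hypothetical helper, not a Base function: collect an iterable into a
# Vector whose element type is the Union of the concrete types encountered.
function narrow_collect(itr)
    vals = Any[x for x in itr]      # evaluate every element exactly once
    T = Union{}                     # empty union; Union{Union{}, S} == S
    for v in vals
        T = Union{T, typeof(v)}
    end
    return T[v for v in vals]       # typed comprehension with the narrow type
end

a = narrow_collect(x for x in (0.3, 0.4))
b = narrow_collect(x for x in (rand(2), 0.4))

println(typeof(a))
println(typeof(b))
```

The return type of such a helper depends on runtime values, so it is not inferable. Passing the result to a separate function (a function barrier) lets Julia dispatch and specialize on the concrete array type, which matches the "use dispatch to deal with the two cases" idea. If many distinct types can occur, the Union can grow large, so a cap on the number of types may be worth adding.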

Christopher_Fisher · June 8, 2019, 9:56am

This might be useful for someone who comes across this post.

tkluck · June 8, 2019, 10:08am

Measuring what you are doing is at the very top of the general performance advice, above avoiding abstract containers. I highly recommend it: in my experience, the first order of magnitude is usually cut by something completely silly that you wouldn’t have thought of without profiling.
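As a rough illustration of the kind of measurement meant here, the sketch below times the same loop over a Vector{Any} and over a small-Union vector using only `@elapsed` from Base. The workload (`total`, `total_all`) and the data are made up for this example, and the timings will vary by machine and Julia version; a package such as BenchmarkTools would give more reliable numbers.

```julia
# Hypothetical workload: add up every Float64, whether stored directly
# or inside a nested vector.
total(x::Float64) = x
total(x::Vector{Float64}) = sum(x)

function total_all(xs)
    s = 0.0
    for x in xs
        s += total(x)
    end
    return s
end

n = 100_000
data_any   = Any[isodd(i) ? rand() : rand(2) for i in 1:n]
data_union = Union{Float64, Vector{Float64}}[x for x in data_any]

# Run once first so compilation time is not included in the measurement.
total_all(data_any)
total_all(data_union)

t_any   = @elapsed total_all(data_any)
t_union = @elapsed total_all(data_union)
println("Vector{Any}: ", t_any, " s")
println("small Union: ", t_union, " s")
```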

Christopher_Fisher · June 8, 2019, 10:19am

You are right. I should have run some benchmarks prior to opening the thread, but unfortunately, I cannot share the code, at least for now, so it may not have been that helpful. I did just run some tentative benchmarks, which suggest Any will probably cause performance issues in my larger code base. Dispatching on the two cases should be feasible. So hopefully that helps someone else who encounters this issue. Thanks again for your help!
