Why does Array{Int}() produce Array{Int, 0} and not Array{Int, 1}?

I would expect a Union{Nothing, Array{T, N}} to be optimized out in v0.7, especially if you branch anyway. In cases where it isn’t, a function barrier should take care of it, and it also makes the code cleaner (although that is subjective; I happen to like small functions that do one thing only).
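To make the function barrier idea concrete, here is a minimal sketch. The names `Holder`, `kernel`, and `process` are made up for illustration and are not from any package. The outer function branches on `nothing` once and then hands a concretely typed array to the inner function, so the inner function never sees the Union.

```julia
# Hypothetical container whose payload may be absent.
struct Holder
    data::Union{Nothing, Vector{Float64}}
end

# Kernel behind the barrier: it only ever receives a concrete Vector{Float64}.
kernel(v::Vector{Float64}) = sum(abs2, v)

# Outer function: branch once, then call the kernel.
function process(h::Holder)
    d = h.data
    d === nothing && return 0.0
    return kernel(d)
end

process(Holder(nothing))      # takes the `nothing` branch
process(Holder([1.0, 2.0]))   # dispatches to the kernel
```

Whether the compiler would have handled the Union without the barrier depends on the version and the code, so treat this as a style sketch rather than a performance claim.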

It is not surprising, since any element of an N-dimensional array A can be accessed by A[a_1, a_2, ..., a_N, 1, 1, ..., 1]; note the arbitrary number of ones at the end.

A zero-dimensional array, then, has exactly one element, A[1, 1, ..., 1], and can be seen as the scalar case of multidimensional arrays.

In short, a 2-dimensional array is a matrix, a 1-dimensional array is a vector, and a 0-dimensional array is a scalar.
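The trailing-ones rule can be checked directly. This is a small sketch that uses `fill(7)` to build a zero-dimensional array (my choice for the example, since it gives the element a defined value).

```julia
z = fill(7)                        # zero-dimensional array holding one Int

@assert ndims(z) == 0
@assert size(z) == ()
@assert length(z) == 1             # exactly one element
@assert z[] == 7                   # index with no indices at all
@assert z[1, 1, 1] == 7            # any number of trailing ones also works

M = reshape(collect(1:6), 2, 3)    # ordinary 2-dimensional array
@assert M[2, 3] == M[2, 3, 1, 1]   # trailing ones are ignored here too
```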

The union is optimized out in 0.7 but I’m not sure I understand the objection. Even if there are other options, it is still a valid use case.

My objection is to introducing special constructors for this. The suggested use case here is a sentinel value that fits in the type. Once type stability is not a strict constraint, other sentinel values are fine too, and maybe more conventional.

Union{Missing, T} is not 100% as fast as just using T, causes more allocations, and is not compatible with C code:

julia> struct Foo
       x::Int
       end

julia> struct FooM
       x::Union{Missing,Int}
       end

julia> function fillfoo(T,n)
       ret = Array{T}(undef, n)
       for k = 1:n
       ret[k] = T(1)
       end
       ret
       end

julia> @time fillfoo(Foo, 100000);
  0.046550 seconds (199.50 k allocations: 3.808 MiB)

julia> @time fillfoo(FooM, 100000);
  0.049624 seconds (299.50 k allocations: 8.385 MiB)

What you are measuring is sample noise. Both produce identical code in v0.7.

Look at the allocations. They are not identical.
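For anyone who wants to compare the allocations without compilation overhead in the numbers, here is a rough sketch. It redefines the two structs from above so it runs on its own, calls the function once as a warm-up, and then uses `@allocated`. The warm-up step and the variable names are my additions, and I am deliberately not quoting byte counts since they depend on the version and platform.

```julia
struct Foo
    x::Int
end

struct FooM
    x::Union{Missing, Int}
end

# Fill an array of T with T(1) and return it.
function fillfoo(T, n)
    ret = Array{T}(undef, n)
    for k in 1:n
        ret[k] = T(1)
    end
    return ret
end

# Warm-up calls so compilation is not counted in the measurement.
fillfoo(Foo, 10)
fillfoo(FooM, 10)

bytes_foo  = @allocated fillfoo(Foo, 100_000)
bytes_foom = @allocated fillfoo(FooM, 100_000)

println("Foo:  ", bytes_foo, " bytes")
println("FooM: ", bytes_foom, " bytes")
```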

Naturally, since

julia> sizeof(Foo)
8

julia> sizeof(FooM)
16

as the extra information (the concrete type in the Union) needs to be represented.
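The same difference shows up when asking whether the types are plain data. A short sketch follows, again redefining the structs so it is self-contained. Note that `Base.isbitsunion` is an unexported internal function, so using it here is an assumption about what a given Julia version provides.

```julia
struct Foo
    x::Int
end

struct FooM
    x::Union{Missing, Int}
end

@assert isbitstype(Foo)                         # plain data, same layout as a C struct holding one integer
@assert !isbitstype(FooM)                       # the Union field prevents this
@assert Base.isbitsunion(Union{Missing, Int})   # still stored inline, with a type tag
@assert sizeof(FooM) > sizeof(Foo)              # room for the tag plus padding
```

As far as I understand, the hidden type tag is the reason a struct like FooM has no direct C equivalent, which is the compatibility concern raised above.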

So in high-performance settings an empty array is twice as good as using Missing.

I don’t quite understand what you mean by “twice as good” — while memory allocation does affect performance, it does not translate linearly.

Also, I don’t understand how this came up in this topic. In this example, one would not allocate a container at all, and in situations when one is needed, the memory cost would be a small fraction for the array (if it has elements).

Finally, in v0.7,

julia> struct A{T,N}
           inner::Array{T, N}
       end

julia> struct AM{T,N}
           inner::Union{Array{T, N}, Missing}
       end

julia> a = randn(10, 10, 10);

julia> Base.summarysize(A(a))
8056

julia> Base.summarysize(AM(a))
8056

so the compiler/memory management is now getting super-clever.

I don’t quite understand what you mean by “twice as good” — while memory allocation does affect performance, it does not translate linearly.

Performance isn’t everything: memory usage matters.

the memory cost would be a small fraction for the array

Depends on how many arrays there are.

Also, you ignored the other issues I raised (compatibility with C code). In any case, I think style-wise it’s a bad idea to use Union{Missing,T} when it’s only missing temporarily.

Probably this is where we disagree. I think that using nothing or missing instead of a sentinel value makes the code more readable (although I fully recognize this is subjective). The only exception is when the sentinel values are extremely well-established, such as NaN.

The issue is that the type is user-facing: it’s misleading to tell the user it may be of type Missing if it’s only temporarily missing during setup.

Another use case that comes up a lot is when your array is dynamically resized (I have a CachedArray type floating around that does this with lazy arrays). The natural initial size is 0 x 0: this is not to indicate it’s missing, just to indicate it’s currently empty.
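Here is a minimal sketch of that pattern. `GrowableMatrix` and `grow!` are hypothetical names made up for this post and are not the CachedArray type mentioned above. The field stays a concrete `Matrix{Float64}` the whole time, and a 0 x 0 matrix stands for "nothing cached yet".

```julia
# Hypothetical wrapper: the field type is always a concrete Matrix{Float64}.
mutable struct GrowableMatrix
    data::Matrix{Float64}
end

# Start empty: 0 x 0 means "currently empty", not "missing".
GrowableMatrix() = GrowableMatrix(Matrix{Float64}(undef, 0, 0))

# Grow the storage to at least m x n, keeping the existing entries.
function grow!(g::GrowableMatrix, m::Integer, n::Integer)
    old = g.data
    om, on = size(old)
    if m > om || n > on
        new = zeros(max(m, om), max(n, on))
        new[1:om, 1:on] = old   # a no-op copy when old is 0 x 0
        g.data = new
    end
    return g
end

g = GrowableMatrix()
grow!(g, 2, 3)
g.data[1, 1] = 5.0
grow!(g, 4, 3)    # existing entries survive the resize
```

The design choice is that callers never have to check for `missing` or `nothing`; they can check `isempty(g.data)` if they care.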

Also, while Union{T,Missing} might be efficient in many settings, this is a compiler optimization trick whose full implications we as users do not and cannot appreciate.