X::Vector{Union{Missing, Any}}


#1

I am trying to write a function that would work for any vector containing missings. However, it seems that

f(X::Vector{Union{Missing, Any}}) = print("hi")

doesn’t work. Could someone enlighten me on how to write functions that deal with missing values the Julian way :smile:? Thank you


#2

This does not completely answer your question, but note that

julia> Union{Missing, Any}
Any

and thus

julia> Vector{Union{Missing, Any}}
Array{Any,1}

so your function only accepts objects with type Array{Any,1} (Vector{Any}). Note also that

julia> Vector{Int} <: Vector{Any}
false

since Julia's type parameters are invariant (see the relevant manual section: https://docs.julialang.org/en/v1/manual/types/#Parametric-Composite-Types-1), and thus

julia> f(x::Vector{Any}) = println("hi");

julia> f(Any[1, 2]) # typeof(x) == Vector{Any} so this works
hi

julia> f(Int[1, 2]) # typeof(x) == Vector{Int} so this fails
ERROR: MethodError: no method matching f(::Array{Int64,1})
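
To accept vectors of any element type despite invariance, the type parameter can be left free with a subtype bound. A minimal sketch, where g is a hypothetical name chosen only to avoid clashing with f above:

```julia
# g is a hypothetical function used only for illustration.
# Vector{<:Any} is shorthand for Vector{T} where T<:Any,
# so it matches vectors of every element type.
g(x::Vector{<:Any}) = "hi"

g(Any[1, 2])       # element type Any
g(Int[1, 2])       # element type Int
g([1, missing])    # element type Union{Missing, Int}

# The subtype relation that fails for Vector{Any} holds here.
Vector{Int} <: Vector{<:Any}   # true
```

In practice the annotation could simply be x::Vector, which means the same thing.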

#3

I see, so if I understand correctly, there is no specific type for “missing” that would not be “any”? In other words, “any-with-missing” and “any-without-missing” are not distinct?


#4

Indeed, since (as its name indicates) a value of type Any can be anything, including missing.

In general, use ::AbstractArray{>:Missing} to define a method which should be called for any array which can contain missing. But note that in many cases you can just define a single generic method, knowing that the compiler will (often) optimize out ismissing(A[i]) calls when the array cannot contain missing.
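
As a rough illustration of dispatching on the lower bound, here is a sketch with a generic fallback and a more specific method for arrays whose element type allows missing. The name count_missing is hypothetical and not part of Julia or any package:

```julia
# count_missing is a hypothetical name used only for illustration.

# Generic fallback: the element type cannot hold missing.
count_missing(A::AbstractArray) = 0

# More specific method: the element type T satisfies T >: Missing,
# for example Union{Missing, Int}, Missing, or Any.
count_missing(A::AbstractArray{>:Missing}) = count(ismissing, A)

count_missing([1, 2, 3])          # uses the fallback
count_missing([1, missing, 3])    # uses the >:Missing method
count_missing(Any[1, missing])    # Any >: Missing, so also the second method
```

Note that a Vector{Any} is dispatched to the second method even when it holds no missing values, because the check is on the element type and not on the contents.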


#5

This aspect of Julia is still a bit unclear to me. What I have been doing so far is, for instance, defining one method for vectors of numbers:

function f(X::Vector{<:Number}; n::Int=10)
    X = collect(range(minimum(X), stop=maximum(X), length=n))
end

f([1,6, 4])
f([1,6, missing])

(which doesn’t work for the vector containing missing values.)

Thus, I create a second version:

function f(X::AbstractVector{>:Missing}; n::Int=10)
    X = f(collect(skipmissing(X)), n=n)
end

f([1,6, missing])

which removes the missing values and then calls the first method. However, I feel like it’s not the most efficient or Julian way of doing it…


#6

I’d recommend just writing one method:

function f(X; n::Integer=10)
    X = skipmissing(X)
    return collect(range(minimum(X), stop=maximum(X), length=n))
end
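
For reference, a short usage sketch of the single method above. The definition is repeated so the snippet runs on its own, and the inputs are made-up examples:

```julia
function f(X; n::Integer=10)
    # skipmissing returns a lazy wrapper; no copy of X is made here.
    X = skipmissing(X)
    return collect(range(minimum(X), stop=maximum(X), length=n))
end

a = f([1, 6, 4]; n=3)          # no missing values present
b = f([1, 6, missing]; n=3)    # the missing entry is skipped
a == b                         # both span the range from 1 to 6
```

Since skipmissing also works on inputs that contain no missing values, the same method covers both cases.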

#7

And not restrict it to Numbers? What if I want the same function to do something else for Strings, for instance?


#8

Sure, you can write f(X::AbstractVector{<:Real}; n::Integer=10) if you want, but often it is not necessary to specify types. (If you want to do something completely different with strings, then it is possible to do that with a more specific method, but maybe it would be better to create a new function entirely?)

Number includes complex numbers, which do not support maximum or minimum.

On the other hand, if somebody creates a non-number type which supports maximum, minimum and range, then why not allow it in your function?
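
The remark about complex numbers can be checked directly. A small sketch with made-up values:

```julia
# Complex numbers are Numbers but not Reals.
Complex{Int} <: Number    # true
Complex{Int} <: Real      # false

# Complex values have no ordering, so maximum raises an error.
threw = try
    maximum([1 + 2im, 3 + 0im])
    false
catch
    true
end
```

This is why a bound of <:Real describes the requirement better than <:Number, if a bound is wanted at all.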


#9

How about this?

julia> f(x::AbstractVector{<:Union{<:Number, Missing}}) = 1
f (generic function with 1 method)

julia> f([1,2,3])
1

julia> f([1,2,3,missing])
1

julia> f(["hello", "world"])
ERROR: MethodError: no method matching f(::Array{String,1})
Closest candidates are:
  f(::AbstractArray{#s12,1} where #s12<:(Union{Missing, #s13} where #s13<:Number)) at REPL[1]:1
Stacktrace:
 [1] top-level scope at none:0


#10

::AbstractVector{<:Union{Number, Missing}} is the way to go if you want numbers or missing values. But it can also make sense to define two methods instead, depending on what the function does.

Note that in the example you provide, the call to collect around skipmissing isn’t needed, since maximum accepts any iterable. If you define an internal method which takes any iterable too, and call it from the two public methods, you can avoid that unnecessary allocation.
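
One possible way to arrange the internal method and the two public methods is sketched below. The helper name _linspace is hypothetical, and the bound <:Real follows the earlier remark about complex numbers:

```julia
# _linspace is a hypothetical internal helper. It accepts any iterable
# of ordered values, including the lazy wrapper returned by skipmissing.
_linspace(itr, n) = collect(range(minimum(itr), stop=maximum(itr), length=n))

# Public method for vectors whose element type cannot hold missing.
f(X::AbstractVector{<:Real}; n::Integer=10) = _linspace(X, n)

# Public method for vectors whose element type allows missing.
# skipmissing is passed on directly, without an intermediate collect.
f(X::AbstractVector{>:Missing}; n::Integer=10) = _linspace(skipmissing(X), n)

f([1, 6, 4]; n=3)
f([1, 6, missing]; n=3)
```

The only allocation left is the final collect that builds the result.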


#11

I think I understand now :slight_smile:

Many thanks for your awesome explanations and clever suggestions, it helps a lot!