I have a function which takes data loaded from a file. Normally it’s all Matrix{Float64}, but sometimes there are some missing values, giving Matrix{Union{Missing, Float64}}. How would I write a function that takes a Number matrix with or without missing values?
Here is my failing attempt.
function missingOrNot(data::Array{T}) where T <: Union{Missing, Number}
    if isa(data,Array{Union{Missing, Number}})
        println("Numeric Matrix with missing values")
    elseif isa(data,Array{Number})
        println("Numeric matrix")
    end
end
julia> y = [NaN 2 3 4;5 6 NaN 8;9 10 11 12]
3×4 Matrix{Float64}:
NaN 2.0 3.0 4.0
5.0 6.0 NaN 8.0
9.0 10.0 11.0 12.0
julia> z = [NaN 2.0 3.0 4.0;5.0 6.0 missing 8.0;9.0 10.0 11.0 12.0]
3×4 Matrix{Union{Missing, Float64}}:
NaN 2.0 3.0 4.0
5.0 6.0 missing 8.0
9.0 10.0 11.0 12.0
julia> missingOrNot(y)
julia> missingOrNot(z)
The docs were my first destination, and that page specifically, but I must have missed the info.
The important warning there is that Array{Float64} is not a subtype of Array{Number}. It is, however, a subtype of Array{T} where {T <: Number}.
Just out of curiosity, is there a way to do that within the function? How would I fix the isa calls in my example, such as if isa(data,Array{Union{Missing, Number}})?
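To make the invariance point concrete, here is a small sketch of the subtype checks involved. It also shows why the isa tests in the first attempt never match. The alias M is just a name introduced for this illustration.

```julia
# Parametric types in Julia are invariant: Array{Float64} is not an Array{Number}.
@assert !(Array{Float64} <: Array{Number})

# A type with a bounded parameter does cover it.
@assert Array{Float64} <: (Array{T} where {T <: Number})
@assert Array{Float64} <: Array{<:Number}        # shorthand for the line above

# The same reasoning applies once Missing is involved.
M = Matrix{Union{Missing, Float64}}              # illustrative alias
@assert !(M <: Array{Union{Missing, Number}})    # invariance again
@assert !(M <: Array{<:Number})                  # Union{Missing, Float64} is not a Number
@assert M <: Array{<:Union{Missing, Number}}     # the bounded form matches
```

So an isa check against Array{<:Number} or Array{<:Union{Missing, Number}} can succeed, while one against Array{Number} or Array{Union{Missing, Number}} only matches arrays declared with exactly that element type.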
@pixel27 This is how I ended up changing the function definition in my example:
function missingOrNot(data::Matrix{T}) where T <: Union{S, Union{S, Missing}} where S <: Number
    if isa(data, Array{<:Number})
        println("Numeric matrix")
    elseif isa(data, Array{T} where {T <: Union{Missing, Number}})
        println("Numeric Matrix with missing values")
    end
end
Probably, but I think it’s broader than I want. My goal was to keep it just as specific as necessary without being overly specific. I use filter and !isnan to strip missing and NaN from my data sets.
This is also a great opportunity to refine my understanding of the topic.
Maybe I misunderstood the question. Is this useful:
function missingOrNot(data::AbstractArray{T}) where T <: Union{Missing, Number}
    if Missing <: T
        if any(ismissing, data)
            println("Numeric Matrix with missing values")
        else
            println("Numeric Matrix that could hold missing values, but doesn't.")
        end
    else
        println("Numeric matrix")
    end
end
That is a more nicely written alternative to an idea I had. I used multiple dispatch because I figured there would be less overhead: any presumably has to check every value, and I am working with a week of data at 10 Hz. In my case, if memory serves, I used CSV.File to load the data. If there are no missing values it returns a Matrix{Float64}; otherwise it returns a Matrix{Union{Missing,Float64}}, so it’s relatively simple to sort that way if I could get the dispatch correct.
I really appreciate the tips everyone is posting here though.
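For reference, a minimal sketch of what the dispatch-based version could look like. The name describe_matrix is hypothetical, and it returns a String instead of printing so the result is easy to check. The method is selected from the element type alone, so no values are scanned.

```julia
# Hypothetical helper: one method per element-type family.
# The first method is more specific, so plain numeric matrices land there.
describe_matrix(data::AbstractMatrix{<:Number}) = "Numeric matrix"
describe_matrix(data::AbstractMatrix{<:Union{Missing, Number}}) =
    "Numeric matrix that may hold missing values"

y = [NaN 2 3 4; 5 6 NaN 8; 9 10 11 12]                          # Matrix{Float64}
z = [NaN 2.0 3.0 4.0; 5.0 6.0 missing 8.0; 9.0 10.0 11.0 12.0]  # Matrix{Union{Missing, Float64}}

describe_matrix(y)   # picks the first method
describe_matrix(z)   # picks the second method
```

Note that a Matrix{Union{Missing, Float64}} that happens to contain no missing values still goes to the second method, since only the type is inspected.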
Now that is something I hadn’t realized: skipmissing returns an iterator, not an array with the missing values removed. It’s already in use in my code, but together with filter. That’s very handy.
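A short sketch of that behaviour, using a small made-up vector v rather than real data. It assumes Julia 1.2 or later, where filter accepts a skipmissing iterator directly.

```julia
v = [NaN, 2.0, missing, 4.0]      # Vector{Union{Missing, Float64}}, illustrative data

it = skipmissing(v)               # lazy wrapper around v, nothing is copied
@assert !(it isa AbstractArray)   # it is an iterator, not an array

collected = collect(it)           # materialize the non-missing values
@assert eltype(collected) == Float64

# Drop missing and NaN in one pass; the result is a plain Vector{Float64}.
cleaned = filter(!isnan, skipmissing(v))
@assert cleaned == [2.0, 4.0]
```

Reductions such as sum or maximum can consume the iterator directly, so collect is only needed when an actual array is required.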