Create histogram with missing values

I want to make a histogram with the data in a column of a data frame; the column has a lot of missing values. I don’t want to drop the missing values in the data frame or create a new array without missing values. How do I program a macro to create the histogram?

You can do so using skipmissing:

using StatsPlots

x = [4, 5, 2, 7, missing, 8]

histogram(collect(skipmissing(x)))
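
As a small sketch of what that line does, using only Base and the same x as above (the names itr and vals are just for illustration): skipmissing returns a lazy wrapper and leaves x untouched, and collect is the step that allocates a temporary vector.

```julia
x = [4, 5, 2, 7, missing, 8]

# skipmissing wraps x lazily; no elements are copied at this point
itr = skipmissing(x)

# collect materializes a temporary vector without the missing entry
vals = collect(itr)

println(length(x))           # x itself is unchanged
println(count(ismissing, x)) # the missing entry is still in x
println(vals)
println(sum(itr))            # reductions can consume the lazy iterator directly
```

So the data frame column is not modified, but a short-lived copy without the missing values is created for the plotting call.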

But does any of the above comply with the OP's request not to create a new array without the missing values?

The handling of missings is a real mystery to me. How are vectors and matrices containing them dealt with? We cannot have a matrix with missing nodes, and yet:

sizeof(missing)
0

which is obviously impossible. Even neutrinos have mass, and anything carrying information cannot be stored … and occupy no space.


Yes, maybe. But the addresses of the data's memory chunks must be stored somewhere, because down in the basement we cannot have an array stored as
[8bytes, 8bytes, 0, 8bytes, …].

And I confess that what puzzles me even more is how an image with missings can be stored as a 3D array in which some nodes occupy zero space.


Thanks

I guess sizeof doesn’t include the type tag. Since there is only one value of type Missing, there is no need to store any more information about it once we know the type.
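
A rough way to poke at this from the REPL, as a sketch only. Base.isbitsunion and Base.summarysize are unexported Base helpers, and the storage layout described in the comments is my understanding rather than something established in this thread.

```julia
# Missing is a singleton type: its only instance is `missing`
println(Missing.instance === missing)
println(sizeof(missing))   # the value itself carries no payload bytes

# Union{Missing, Float64} is a small union of bits types
T = Union{Missing, Float64}
println(Base.isbitsunion(T))

# As far as I understand, such an array keeps a full-width slot for every
# element plus a separate per-element type tag, so a missing entry still
# has a position in memory; it just has no payload of its own.
v = T[1.0, missing, 3.0, missing]
println(Base.summarysize(v))                            # total bytes reachable from v
println(Base.summarysize(Float64[1.0, 0.0, 3.0, 0.0]))  # plain Float64 vector for comparison
```

If that picture is right, the array never looks like [8bytes, 8bytes, 0, 8bytes, …]: every slot has the same width, and the tag records which slots hold a missing.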

Defining a Plots recipe can help with histograms of data containing missing values, for example a recipe that dispatches on the vector's type:

using DataFrames, Plots

df = DataFrame(a = [randn(99); missing; randn(99) .+ 2; missing; randn(99) .+ 4])

@recipe function f(v::Vector{Union{Missing, Float64}})
    seriestype --> :hist
    y = filter(!ismissing, v)
    1:length(y), y
end

histogram(df.a)   # it should now work for Vectors with missings