Kind of a newbie question, so if this isn’t the appropriate section I’ll move it.
I was trying to plot a column of a dataframe with some elements that cannot be calculated.
At first I thought of initializing them to 0 and then picking only the values different from 0, but that’s kind of contrived, so I was wondering whether Plots.jl wouldn’t just ignore y-values of missing type.
As it turns out it can’t, since plotting an array of the type Array{Union{Float64,Missing}} gives TypeError: non-boolean (Missing) used in boolean context.
Am I missing something? I guessed this was the kind of use case for the new missing type… Of course it’s not a big problem, I can use findall or similar functions to trim unwanted values, but if it’s possible, wouldn’t it make sense to be able to plot an array with missing elements?
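For reference, the trimming I have in mind looks roughly like this. It is a minimal sketch with made-up data; the plot call is commented out and assumes Plots.jl is installed.

```julia
# Made-up data: one value in the column could not be calculated.
x = 1:5
y = [1.0, 2.5, missing, 4.0, 3.2]

# Indices of the entries that are actually present.
keep = findall(!ismissing, y)

x_kept = x[keep]
# Narrow the element type from Union{Missing, Float64} to Float64.
y_kept = Float64.(y[keep])

# using Plots; plot(x_kept, y_kept)   # assumes Plots.jl is installed
```

This drops the point entirely, so the line is drawn straight across the gap rather than being interrupted there.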
Yes, replacing missing values with NaN works. The only reason that’s not just done internally is that Plots accepts many other input types apart from Float64.
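A minimal sketch of that replacement, using only Base functions. The data is made up, and the plot call is commented out and assumes Plots.jl is installed.

```julia
y = [1.0, 2.5, missing, 4.0, 3.2]   # Vector{Union{Missing, Float64}}

# coalesce returns its first non-missing argument, so each missing becomes NaN.
y_nan = coalesce.(y, NaN)            # Vector{Float64}

# using Plots; plot(y_nan)   # assumes Plots.jl is installed
```

Because every element ends up as a Float64, the result is a plain float vector with no Missing left in its element type.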
OTOH we could do the same as StatPlots’ @df does: replace missing with an appropriate value when we know how to replace it (string and float), and error on a missing otherwise.
But that wouldn’t work for any of the input passed as keyword arguments (the “attributes”). And frankly I think a cleaner approach is to provide first-class missing support by actually wrapping the internal calls in Plots that may return missing in skipmissing where relevant. It’s just a bit more work.
Plotly and Matlab, for example, break the line in line/scatter plots where NaNs are present. This makes the missing data visible without disturbing the visualisation of the good data too much. It can be useful, particularly when working with real (not simulated or modeled) data, which can have some error condition flagged. Therefore, yes, it would be good if missing/nothing data could be handled within Plots, taking this capability of the backend into account.
It’s just that it’s not straightforward to convert missing to NaN, since only floating-point vectors (such as Float64) support NaN values. It could be done, as @piever said above, by replacing with NaN on Float64 input, “” on String input, etc. But I’d say that, given that missing is defined in Base, the cleanest approach is to propagate the missing values on to the backend plotting package and let that decide.
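To make the type-dependent replacement concrete, here is a rough sketch. The names placeholder and replace_missing are hypothetical helpers made up for illustration; they are not part of Plots or StatPlots, and this is not how either package actually does it.

```julia
# Hypothetical helpers, for illustration only.
# Choose a placeholder from the non-missing element type of the input.
placeholder(::Type{T}) where {T<:AbstractFloat} = T(NaN)
placeholder(::Type{T}) where {T<:AbstractString} = ""
placeholder(::Type{T}) where {T} = error("no known replacement for missing in $T")

function replace_missing(v::AbstractVector)
    T = nonmissingtype(eltype(v))
    # Nothing to replace: just narrow the element type.
    any(ismissing, v) || return convert(Vector{T}, v)
    # Otherwise substitute the placeholder, or error for unsupported types.
    return coalesce.(v, placeholder(T))
end

floats  = replace_missing([1.0, missing, 3.0])   # missing becomes NaN
strings = replace_missing(["a", missing, "c"])   # missing becomes ""
```

An integer vector that contains a missing hits the fallback method and raises an error, which is the “error on a missing otherwise” behaviour described above.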
Plots ignores NaNs when plotting lines, but when I use scatter with the attribute marker_z = z to color the dots, any point whose z is NaN is still plotted, as a black dot.
One workaround is to set x or y to NaN at those points, but that doesn’t seem very elegant.
Is there a way to make a NaN in marker_z suppress the dot?
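For reference, the workaround I mean could look like this. It is a sketch with made-up data; the scatter calls are commented out and assume Plots.jl is installed.

```julia
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
z = [0.1, NaN, 0.7, 0.9]   # color values, one of them NaN

# Option 1: set y to NaN wherever the color value is NaN.
y_masked = ifelse.(isnan.(z), NaN, y)
# using Plots; scatter(x, y_masked, marker_z = z)

# Option 2: drop those points from all three vectors before plotting.
keep = .!isnan.(z)
x_kept, y_kept, z_kept = x[keep], y[keep], z[keep]
# using Plots; scatter(x_kept, y_kept, marker_z = z_kept)
```

Both options need an extra step before every scatter call, which is why I’m asking whether marker_z can handle this directly.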