Groupedhist with missing values

Hi,
I have a dataframe df with several columns, including :P1, whose eltype is Union{Missing,Float64}, and :Nome_Turma, whose eltype is String. I want to make a groupedhist with StatsPlots.jl (or otherwise), but I get:

@df df groupedhist(:P1, group=:Nome_Turma)
ERROR: TypeError: non-boolean (Missing) used in boolean context

I know my column df.P1 does have missing values. Might this be the problem? If so, how should I correct it? Shouldn’t the command groupedhist do this automatically?

Probably. Try dropping the rows where :P1 is missing before plotting:

@df dropmissing(df, :P1) groupedhist(:P1, group = :Nome_Turma)

Excellent and spot on!
Another, perhaps unrelated, question is: how could I generate the histograms in distinct (sub)plots?

p1 = histogram(df[df.Nome_Turma .== some_value, :P1])
p2 = histogram(df[df.Nome_Turma .== other_value, :P1])
plot(p1, p2)

or something like

plot([histogram(df[df.Nome_Turma .== x, :P1]) for x in unique(df.Nome_Turma)]...)

if there are lots of values.

Wow! Thank you so much!!!

Perhaps I should create another post, but here it goes: what if I wanted to pass the column (here :P1) as a variable inside a loop, e.g.

etapas = [:P1, :P2, :P3, :SC]
for etapa in etapas
  ghist = @df dropmissing(df, etapa) groupedhist(etapa, group = :Nome_Turma)
  savefig(ghist, "ghist_$etapa")
end

It issues an error:

ERROR: MethodError: no method matching groupedvec2mat(::Dict{Int64, Int64}, ::Vector{Int64}, ::String, ::RecipesPipeline.GroupBy, ::Float64)

Yeah, that won’t work, because @df is a macro: it only sees the code as written, not any runtime values. This has nothing to do with dropmissing, which is an ordinary function call and receives the value of etapa just fine. You just can’t write @df df groupedhist(etapa, group = :Nome_Turma), because the macro sees the bare name etapa rather than the Symbol stored in that variable, so it never treats it as a column of df.
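To make the "macro only sees the written code" point concrete, here is a minimal sketch with a toy macro. The macro @argkind is hypothetical and made up for illustration; it is not part of StatsPlots and does not reproduce how @df works internally. It only reports what kind of thing it was handed at expansion time.

```julia
# Toy macro (hypothetical, for illustration only): it inspects the code it
# receives and returns a description of it as a String.
macro argkind(ex)
    if ex isa QuoteNode
        # Written literally as :name in the source
        return "literal Symbol :$(ex.value)"
    elseif ex isa Symbol
        # A bare identifier; the macro cannot know what it will hold at runtime
        return "variable name $(ex)"
    else
        return "expression $(ex)"
    end
end

etapa = :P1

println(@argkind :P1)     # the macro sees the quoted symbol itself
println(@argkind etapa)   # the macro sees only the name `etapa`, not :P1
```

Even though etapa holds :P1 when the program runs, the two calls look different to the macro, which is the same situation @df is in.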

From the JuliaPlots/StatsPlots.jl site I thought

using StatsPlots
etapa = :P3
@df dropmissing(df, cols(etapa)) groupedhist(cols(etapa), group = :Nome_Turma)

should work but the following error was issued:

ERROR: UndefVarError: `cols` not defined in `Main`

Any suggestions?

That doesn’t change the fact that you want the macro to understand the runtime value of a variable, and a macro cannot see that. Macros are just a convenience for rewriting code, and the @df macro needs a literal Symbol in the written expression it transforms. Compare these two expansions (where I’ve replaced your column names P1 with b and Nome_Turma with group). In the first, :b is recognized as a column and extracted from the dataframe; in the second, etapa is passed straight through to groupedhist as a plain variable:

julia> prettify(@macroexpand(@df df groupedhist(:b, group = :group)))
:(((fly->begin
          ((penguin, locust), grasshopper) = (StatsPlots).extract_columns_and_names(fly, :b, :group)
          (StatsPlots).add_label(["b"], groupedhist, penguin, group = locust)
      end))(df))

julia> prettify(@macroexpand(@df df groupedhist(etapa, group = :group)))
:(((fly->begin
          ((penguin,), locust) = (StatsPlots).extract_columns_and_names(fly, :group)
          (StatsPlots).add_label(["etapa"], groupedhist, etapa, group = penguin)
      end))(df))
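
One way around this, sketched below, is to skip @df inside the loop and index the columns yourself, since ordinary indexing accepts a Symbol held in a variable. This assumes groupedhist can be called directly on a vector with a group keyword, as in the StatsPlots README examples. The small dataframe is made-up stand-in data; only the column names come from this thread, and the .png extension and title are my additions.

```julia
using DataFrames, StatsPlots

# Hypothetical stand-in for the real data (values invented for illustration).
df = DataFrame(
    Nome_Turma = repeat(["A", "B"], 10),
    P1 = [i % 5 == 0 ? missing : Float64(i) for i in 1:20],
    P2 = Float64.(1:20),
)

etapas = [:P1, :P2]
for etapa in etapas
    # dropmissing is a normal function, so a runtime Symbol works here
    d = dropmissing(df, etapa)
    # Index the column by the Symbol value instead of relying on @df
    ghist = groupedhist(d[!, etapa]; group = d.Nome_Turma, title = string(etapa))
    savefig(ghist, "ghist_$(etapa).png")
end
```

As for the cols(etapa) attempt: as far as I can tell, cols is only given meaning inside the plotting call that @df rewrites, which would explain the UndefVarError when it also appears inside the dropmissing argument. I have not checked that here, so treat it as a guess.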