Grouping in Julia plots

Hi there,

Let’s say I have a DataFrame I want to plot. I know I can pass a column to the keyword argument group to get one series per case. But what if I want to group cases together manually, e.g. cases A and B in one group and cases C and D in another?

Many thanks in advance.

using DataFrames, Plots
DF = DataFrame(x = 1:4, y = rand(4), cases = ["A", "B", "C", "D"])  # the data to plot
plot(DF.x, DF.y, group = DF.cases, seriestype = :scatter)

Welcome!

I think the easiest thing to do here is to make a new grouping column.

df.combined_cases = map(df.cases) do c
    if c in ["A", "B"]
        "A or B"
    elseif c in ["C", "D"]
        "C or D"
    else
        c
    end
end

The answer depends on what you are trying to show. Could you provide a link to an example plot?

You might consider using the layout functionality to generate two subplots. One with A and B, and the other with C and D, e.g.:

plot(plot_ab, plot_cd, layout=(1,2))

But this might not be what you are trying to achieve.
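For reference, here is a minimal sketch of how the two subplots could be built from the data in the question. The names is_ab and is_cd and the subplot titles are made up for illustration; plot_ab and plot_cd match the call above.

```julia
using DataFrames, Plots

# Same toy data as in the question.
DF = DataFrame(x = 1:4, y = rand(4), cases = ["A", "B", "C", "D"])

# Boolean row masks for the two hand-picked groups (hypothetical names).
is_ab = in.(DF.cases, Ref(["A", "B"]))
is_cd = in.(DF.cases, Ref(["C", "D"]))

# One scatter subplot per group; `group` still labels the cases inside each one.
plot_ab = scatter(DF.x[is_ab], DF.y[is_ab], group = DF.cases[is_ab], title = "A and B")
plot_cd = scatter(DF.x[is_cd], DF.y[is_cd], group = DF.cases[is_cd], title = "C and D")

# Place the two subplots side by side.
plot(plot_ab, plot_cd, layout = (1, 2))
```

Each subplot keeps its own per-case legend, so this separates the groups visually rather than merging A and B into a single series.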

Thanks, I guess that would be the sane option. But is subsetting directly in the group argument not advisable, or not possible? Something like…

plot(DF.x,DF.y, group=DF[:cases .=="A" || :cases .=="B",:] ,seriestype = :scatter)

…?
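A note on why the line above cannot work as written: group expects a vector with one label per plotted point, not a subset of the DataFrame, and || does not broadcast over columns. Below is a rough sketch of an inline alternative that relabels the column on the fly with a lookup table. The Dict merged_label and the variable groups are hypothetical names chosen for this example.

```julia
using DataFrames, Plots

# Same toy data as in the question.
DF = DataFrame(x = 1:4, y = rand(4), cases = ["A", "B", "C", "D"])

# Hypothetical lookup table: original case => merged label.
merged_label = Dict("A" => "A or B", "B" => "A or B",
                    "C" => "C or D", "D" => "C or D")

# Broadcast `get` over the column. Ref keeps the Dict from being iterated,
# and the third argument leaves any case missing from the table unchanged.
groups = get.(Ref(merged_label), DF.cases, DF.cases)

scatter(DF.x, DF.y, group = groups)
```

The same expression can be written directly in the call, as in group = get.(Ref(merged_label), DF.cases, DF.cases), and the legend then shows the merged labels.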

That’s certainly feasible. But you would lose nice automatic labeling.

Thanks. It’s an option, but I guess the least favorable one, as it arguably deviates from the intended use of layout…

Why not just write a little helper function? That way you can call it directly inside the plot call.

# Map each case to its merged label; cases not listed are kept as they are.
function combine_cases(cases)
    map(cases) do c
        if c in ["A", "B"]
            "A or B"
        elseif c in ["C", "D"]
            "C or D"
        else
            c
        end
    end
end

plot(df.x, df.y, group = combine_cases(df.cases))

Ah! Of course. Encapsulating. Feels like a good choice. Thanks again!

Had to make a few minor corrections to make it work:

function combine_cases(x)
    combined_cases = map(x.cases) do c
        if c in ["A", "B"]
            "A or B"
        elseif c in ["C", "D"]
            "C or D"
        else
            c
        end
    end
end

plot(DF.x, DF.y, group = combine_cases(DF))