Hi there,
Let’s say I have a dataframe I want to plot. I know that I can pass a column to the keyword argument group to get one series per case. But what if I would like to group cases together manually, e.g. cases A and B in one group and cases C and D in another?
Many thanks in advance.
using DataFrames, Plots
DF = DataFrame(x = 1:4, y = rand(4), cases = ["A", "B", "C", "D"]) # These are the plotting data
plot(DF.x, DF.y, group = DF.cases, seriestype = :scatter)
             
Welcome!
I think the easiest thing to do here is to make a new grouping column.
df.combined_cases = map(df.cases) do c
    if c in ["A", "B"]
        "A or B"
    elseif c in ["C", "D"]
        "C or D"
    else
        c
    end
end
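If the list of combinations grows, a lookup table may be easier to maintain than the if/elseif chain. The following is only a sketch: the names CASE_GROUPS and combined_label are made up for illustration, and the extra case "E" is there just to show the fallback.

```julia
# Hypothetical lookup table: original case => combined label
const CASE_GROUPS = Dict(
    "A" => "A or B",
    "B" => "A or B",
    "C" => "C or D",
    "D" => "C or D",
)

# Cases that are missing from the table keep their own label
combined_label(c) = get(CASE_GROUPS, c, c)

cases = ["A", "B", "C", "D", "E"]
labels = combined_label.(cases)   # broadcast over every case
```

The resulting vector can be stored as a new column or passed directly as the group argument of the plot call.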
             
The answer depends on what you are trying to show. Could you provide a link to an example plot?
You might consider using the layout functionality to generate two subplots, one with A and B and the other with C and D, e.g.:
plot(plot_ab, plot_cd, layout=(1,2))
But this might not be what you are trying to achieve.
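For completeness, here is one way the two subplots could be built from the data in the question. This is a sketch under assumptions: plot_ab and plot_cd are only placeholder names in the line above, and the masks in_ab and in_cd and the titles are my own additions.

```julia
using DataFrames, Plots

DF = DataFrame(x = 1:4, y = rand(4), cases = ["A", "B", "C", "D"])

# Boolean row masks for the two manual groups
in_ab = in.(DF.cases, Ref(["A", "B"]))
in_cd = in.(DF.cases, Ref(["C", "D"]))

# One scatter plot per group; within each, group still labels the individual cases
plot_ab = scatter(DF.x[in_ab], DF.y[in_ab], group = DF.cases[in_ab], title = "A and B")
plot_cd = scatter(DF.x[in_cd], DF.y[in_cd], group = DF.cases[in_cd], title = "C and D")

# Place the two subplots side by side
plot(plot_ab, plot_cd, layout = (1, 2))
```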
             
Thanks, I guess that would be the sane option. But is subsetting directly in the group argument not advisable, or not possible? Something like…
plot(DF.x,DF.y, group=DF[:cases .=="A" || :cases .=="B",:] ,seriestype = :scatter)
…?
             
That’s certainly feasible. But you would lose nice automatic labeling.
             
Thanks. It’s an option, but I guess the least favorable one, as it seems to deviate from the intended use of layout…
             
Why not just write a little helper function? That way you can call it directly in the plot call.
function combine_cases(x)
    # Map each case label in x to its combined group label
    map(x) do c
        if c in ["A", "B"]
            "A or B"
        elseif c in ["C", "D"]
            "C or D"
        else
            c  # any other case keeps its own label
        end
    end
end
plot(df.x, df.y, group = combine_cases(df.cases))
             
Ah! Of course. Encapsulating. Feels like a good choice. Thanks again!
Had to make a few minor corrections to make it work:
function combine_cases(x)
    combined_cases = map(x.cases) do c
        if c in ["A", "B"]
            "A or B"
        elseif c in ["C", "D"]
            "C or D"
        else
            c
        end
    end
end
plot(DF.x, DF.y, group = combine_cases(DF))