StatPlots groupedbar order x axis

Hi,

Is there a way to order the x-axis bars using groupedbar? Currently it is placing the individual bars alphabetically, but I would like to specify the order manually. I have tried using plot!(xticks=(1:10, labels)), but that only changes the actual labels and doesn’t move the bars…


No, and you can’t specify the order of the individual sub-bars either. But you should be able to do that IMHO. If you can think of a good set of keywords and interface to specify this (perhaps group could take a tuple where the second arg is the group order, and there could be an xorder argument? not sure) you should open an issue on StatPlots and we’ll implement this. Perhaps @piever has a good idea, he’s thought a lot about these things.

I thought categorical arrays used to work for this? Normally, to specify the order of your data, you can put your array into a CategoricalArray (see here). I think in that case Plots should sort by the custom order.

That works for the groups, but unfortunately not for the x argument. I guess the solution is to implement that in StatPlots and add it to the readme.

Yes, that needs fixing: if the user specifies an order with a CategoricalArray, we should respect it on the x axis. I'm a bit confused as to why it doesn't work already, but will investigate.

I see you added the issue, thanks!

Yes, perhaps this needs fixing in Plots rather than StatPlots? Even if it means taking on a dep on CategoricalArrays (but that shouldn’t be necessary, no?)

I also think this should be fixed on the Plots side, but it shouldn't need an extra dependency: we should instruct Plots to respect the order of the custom type before converting to string / float.

I’m actually no longer sure I completely understand what you are trying to achieve, could you post the code you’re using to generate the plot?

I’m a bit confused because I don’t think the order used by default for the x axis is alphabetical so I wanted to make sure I understand the issue.

Related to this, we could probably allow groupedbar(["a", "a", "b", "b"], [1,2,3,4]) to give the same plot as groupedbar(["a", "b"], [1 2; 3 4]). Currently it errors.
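To make the proposed equivalence concrete, here is a minimal sketch in plain Julia of how long-form input could be reshaped into the labels-plus-matrix form that groupedbar already accepts. The helper name long_to_wide is hypothetical and not part of StatPlots; it assumes every category has the same number of values.

```julia
# Hypothetical helper (not part of StatPlots): reshape long-form input into
# the labels + matrix form that groupedbar already accepts.
function long_to_wide(x::AbstractVector, y::AbstractVector)
    length(x) == length(y) || throw(ArgumentError("x and y must have the same length"))
    labels = unique(x)                       # categories in order of first appearance
    rows = [y[x .== lab] for lab in labels]  # values belonging to each category
    n = length(first(rows))
    all(r -> length(r) == n, rows) ||
        throw(ArgumentError("each category needs the same number of values"))
    # One row per category, one column per sub-bar.
    return labels, permutedims(reduce(hcat, rows))
end

labels, mat = long_to_wide(["a", "a", "b", "b"], [1, 2, 3, 4])
# labels == ["a", "b"] and mat == [1 2; 3 4]
```

With this, groupedbar(labels, mat) would be the call the long-form version maps onto.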

Agreed.

Sure, so in this example I would like to have the x-axis arranged B, C, A (as an arbitrary example). By using xtick=(…,…) it changes only the actual labels, it doesn’t change the columns. So the column should move with the label.

using Plots
using StatPlots
using GroupedErrors
using DataFrames
gr()

df = DataFrame([["A","B","A","B","A","B","C","C","C"], [1,2,3,2,3,4,6,6,8]], [:Observation, :Count])
@> df begin
    @across :all
    @x _.Observation :discrete
    @y _.Count
    @plot groupedbar(ylabel="Fraction of proteome", legend=true)
end

Does this make sense?

Minor remark: use the </> button to format your code, the " button makes a quotation and it looks a bit funky.

Interestingly enough GroupedErrors already does the correct thing and you can use CategoricalArrays:


using Plots
using StatPlots
using GroupedErrors
using DataFrames
using CategoricalArrays
gr()
v = CategoricalArray(["A","B","A","B","A","B","C","C","C"])
levels!(v, ["B", "C", "A"])
df = DataFrame([v, [1,2,3,2,3,4,6,6,8]], [:Observation, :Count])
@> df begin
    @across :all
    @x _.Observation :discrete

    @y _.Count
    @plot groupedbar(ylabel="Fraction of proteome", legend=true)
end

See the CategoricalArrays docs for more details. I think you should also be able to make a column categorical from the DataFrame directly, something like:

df = DataFrame([["A","B","A","B","A","B","C","C","C"], [1,2,3,2,3,4,6,6,8]], [:Observation, :Count])
categorical!(df, :Observation)
levels!(df[:Observation], ["B", "C", "A"])

Solving the thing in general for StatPlots is a bit trickier: should groupedbar(x, y, group=z) return the x axis in the order it is encountered or should it sort it, like boxplot and violinplot do? Maybe we could sort it.

I almost think plotting commands where sorting is possible should take the same keyword arguments as sort (rev, by) to allow reversing the order or some custom sorting.
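As a rough sketch of what sort-style keywords could look like, the following plain-Julia function picks the x-axis category order either as encountered or sorted, forwarding by and rev to sort. The name category_order and its sorted keyword are hypothetical, not an existing StatPlots or Plots API.

```julia
# Hypothetical helper: choose the category order the way `sort` would.
function category_order(x; sorted::Bool = true, by = identity, rev::Bool = false)
    cats = unique(x)  # order of first appearance
    return sorted ? sort(cats; by = by, rev = rev) : cats
end

obs = ["A", "B", "A", "B", "A", "B", "C", "C", "C"]
wanted = ["B", "C", "A"]  # the arbitrary order asked for earlier in the thread

default_order = category_order(obs)               # ["A", "B", "C"]
reversed_order = category_order(obs; rev = true)  # ["C", "B", "A"]
# Custom order: sort each category by its position in `wanted`.
custom_order = category_order(obs; by = c -> findfirst(==(c), wanted))  # ["B", "C", "A"]
as_encountered = category_order(["b", "a", "b"]; sorted = false)        # ["b", "a"]
```

The custom-order call errors if a category is missing from wanted, because findfirst then returns nothing; a real implementation would need to decide how to handle that case.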

Okay, this fix works for GR, but when using PyPlot it doesn’t show the actual x-axis labels (see the image, although I can manually fix this using the xticks attribute). For both Plotly and PlotlyJS it just crashes. Thanks for all your help so far!

This sounds like a bug on Plots side, could you open an issue over there?

Will do!

Is there an update on this? I'm having the same x-axis sorting issue with grouped bars in StatsPlots. It doesn't seem to be an issue in Plots itself, but I couldn't figure out how to plot grouped bars with Plots using any of the backends.

Hi,

so I just figured out how to do it after a lot of digging.

The issue is that RecipesPipeline.jl contains this line:

group_labels = sort(collect(unique(v)))

In theory CategoricalArrays exist for this purpose. However, unique(ctg::CategoricalArray) returns an ordinary Vector{String} instead of a new CategoricalArray, so the custom level order is lost before the sort.
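The effect can be reproduced without any plotting packages. The sketch below uses a small stand-in type, Leveled (hypothetical, defined only for this illustration), whose isless compares a rank instead of the string. It is meant to mimic how categorical values sort by level position, not to reproduce the CategoricalArrays implementation.

```julia
levels_order = ["c", "b", "a"]  # the order the user asked for
v = ["a", "b", "c", "a", "b", "c"]

# Plain strings: unique followed by sort is alphabetical, whatever order was requested.
plain = sort(collect(unique(v)))  # ["a", "b", "c"]

# Stand-in for a value that carries its level position (hypothetical type).
struct Leveled
    value::String
    rank::Int
end
Base.isless(x::Leveled, y::Leveled) = isless(x.rank, y.rank)

# If the unique values keep their rank, the same sort respects the custom order.
wrapped = [Leveled(s, findfirst(==(s), levels_order)) for s in unique(v)]
leveled = [w.value for w in sort(wrapped)]  # ["c", "b", "a"]
```

The sort call itself is unchanged in both cases; only the element type decides which order comes out.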

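Defining this solves the problem in this case:

using StatsPlots, CategoricalArrays

# Make unique return a CategoricalArray so the level order survives the sort in RecipesPipeline.
# Note: this overrides a method on a type owned by CategoricalArrays (type piracy).
function Base.unique(ctg::CategoricalArray)
    l = levels(ctg)
    newctg = CategoricalArray(l)
    levels!(newctg, l)  # returns newctg with its levels in the custom order
end

ctg = CategoricalArray(["a", "b", "c", "a", "b", "c"])
levels!(ctg, ["c", "b", "a"])  # desired group order
groupedbar(1:6, group=ctg)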



Thank you!

@JonasIsensee Just so you know, in response to a similar question on GitHub for StatsPlots.jl, I quickly wrote up your method as a function that adapts the usual inputs to StatsPlots.groupedbar so that the ordering of the names and categories is preserved.

I credited you directly but I wasn’t able to tag you on GitHub, so I thought it would be kind to let you know.

Have a nice day!