I am trying to plot boxplots of values with respect to 4 categories represented as numbers. The four categories are powers of two: 16384, 32768, 65536, 131072.
Let’s use this MWE dataframe:
using DataFrames, StatsPlots
df= DataFrame(cat = [16384, 32768, 65536, 131072, 16384, 32768, 65536, 131072, 16384, 32768, 65536, 131072], val = rand(12))
There are two possibilities. First, I can plot the data as-is:
boxplot(df.cat, df.val)
But since the categories are treated as numbers, the boxes are not evenly spaced.
If I convert the categories to strings instead, I get evenly spaced boxes, but then they are sorted alphabetically rather than numerically:
boxplot(string.(df.cat), df.val)
I can’t find a way to order them correctly. Does anyone know how?
I think you are looking for the xdiscrete_values keyword, e.g.:
using DataFrames, StatsPlots
df= DataFrame(cat = [16384, 32768, 65536, 131072, 16384, 32768, 65536, 131072, 16384, 32768, 65536, 131072], val = rand(12))
boxplot(string.(df.cat), df.val, xdiscrete_values=string.(df.cat))
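A variation on the same idea, as a minimal sketch: the call above appears to rely on the rows already being in increasing order. Assuming xdiscrete_values takes the labels in the order they should appear on the axis, building that list from the sorted unique categories should keep the order numerical even when the rows are shuffled. The names shuffled and ordered_labels are only for illustration.

```julia
using DataFrames, StatsPlots
using Random: shuffle

df = DataFrame(cat = repeat([16384, 32768, 65536, 131072], 3), val = rand(12))
shuffled = df[shuffle(1:nrow(df)), :]   # row order no longer matches numeric order

# Sort the numeric categories first, then convert them to strings for the axis.
ordered_labels = string.(sort(unique(shuffled.cat)))

# Assumption: the axis keeps the order given in xdiscrete_values.
boxplot(string.(shuffled.cat), shuffled.val, xdiscrete_values = ordered_labels, legend = false)
```

Sorting before the string conversion matters here: sorting the strings themselves would give the alphabetical order again.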
Yes, that’s exactly what I needed, thank you! There are a lot of keywords and aliases in the package, and it’s sometimes hard to find what you need in the docs.
Where did you find it, by the way? I don’t see it in the attributes section of the documentation.
Yes, I can sympathize with that sentiment; some of the Plots.plot() arguments could be better documented.
The xdiscrete_values keyword argument is at the bottom of the gr() backend documentation page. While this page does not describe how each argument affects the plot, I use it as a starting point if I need to explore new functionality.
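For exploring attributes from the REPL, Plots also has a plotattr function. A minimal sketch, assuming a recent Plots version; I have not checked what description, if any, it prints for discrete_values:

```julia
using Plots

plotattr(:Axis)               # list the attributes that apply to axes
plotattr("discrete_values")   # print the stored description for one attribute
```

Since x, y and z variants are formed by prefixing the axis letter, xdiscrete_values is looked up here without the leading x.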