I want to annotate the statistical significance of the difference between a pair of boxplots on my figure, as one of 'n.s.', '*', '**', '***'. An example image of what I want is below. The best analogue I can find is ggsignif in R: https://github.com/const-ae/ggsignif.
Does functionality like this exist in StatsPlots.boxplot?
If no such functionality exists, I can easily get one of 'n.s.', '*', '**', '***' from the p-value of a given hypothesis test in HypothesisTests. However, I’m struggling to annotate the plot correctly. I could use annotate!(x, y, text("*", :center, 8)), but I’m not sure how to pick x and y so that the text sits correctly above the box. Does anyone have any suggestions?
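For the p-value to symbol step, a minimal sketch might look like the following. The function name `signif_label` is made up for illustration, and the cutoffs 0.05, 0.01 and 0.001 are the conventional ones rather than anything taken from this thread, so adjust them to your field's convention.

```julia
# Map a p-value to a significance label.
# The thresholds 0.001 / 0.01 / 0.05 are an assumption (the usual convention).
function signif_label(p::Real)
    p < 0.001 && return "***"
    p < 0.01  && return "**"
    p < 0.05  && return "*"
    return "n.s."
end
```

With HypothesisTests, the input would be something like `pvalue(MannWhitneyUTest(a, b))` for two samples `a` and `b`; which test is appropriate depends on your data.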
Wouldn’t a color with a legend be more intuitive for people who have never come across this “symbol jargon”? I have used boxplots for a while and have never come across ***, n.s., etc. in papers.
Yeah colour’s a really nice idea, I’ll look at that. Unfortunately this notation is the standard in my field (developmental biology), so I can’t really dismiss it.
Thanks, and how could I get a second bar like this one in red?
Your solution’s great for annotating a comparison with the highest box, but I’d like to be able to annotate these bars at a consistent height above each box. Can I access the y values for the whiskers in each box?
Not really. If boxes virginica and setosa are adjacent, I’d like the line to sit at the same height above the taller of their two whiskers as it would above the whisker for versicolor.
Basically, for any pair of boxes b1, b2, I want the line to be at height max{whiskers(b1), whiskers(b2)} + dy, for some dy constant across the plot.
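If reading the whiskers back out of the plot proves awkward, the rule above can also be sketched directly from the raw data. This assumes the whiskers follow the usual 1.5 × IQR rule (I believe this is the StatsPlots default, but check your `whisker_range` setting); `upper_whisker` and `bar_height` are hypothetical helper names.

```julia
using Statistics

# Top of the upper whisker: the largest observation that lies within
# k interquartile ranges above the third quartile (k = 1.5 is an assumption).
function upper_whisker(v; k = 1.5)
    q1, q3 = quantile(v, [0.25, 0.75])
    limit = q3 + k * (q3 - q1)
    return maximum(x for x in v if x <= limit)
end

# Height of the significance bar over the boxes for samples a and b:
# max{whiskers(a), whiskers(b)} + dy, with dy constant across the plot.
bar_height(a, b, dy) = max(upper_whisker(a), upper_whisker(b)) + dy
```

Outliers above the whisker are ignored here, so a bar could overlap an outlier marker; using `maximum(v)` instead of `upper_whisker(v)` would avoid that at the cost of less consistent spacing.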
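using StatsPlots  # provides @df and boxplot, and re-exports Plots
# Draw one box per group in column :X of the DataFrame df
p = @df df boxplot(:X, :Y, c=:black, fillcolor=:white, legend=false)
xt = xticks(p[1])[1]  # x positions of the boxes (tick locations)
# Per-box (min, max) of the plotted y values; assumes each box contributes three series
yminmax = [extrema(filter(!isnan, p[1][3(i-1)+1][:y])) for i in axes(xt,1)]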
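From `xt` and `yminmax`, the coordinates of a bar between boxes i and j could be computed roughly as below. This is only a sketch: `bracket`, `dy` and `tick` are hypothetical names, and it assumes `yminmax[i][2]` is the top of the whisker for box i.

```julia
# Coordinates for a significance bracket between boxes i and j.
# xt: x positions of the boxes; yminmax: (ymin, ymax) per box, as computed above.
# dy: gap above the taller of the two boxes; tick: length of the bracket's end ticks.
function bracket(xt, yminmax, i, j; dy = 0.2, tick = 0.05)
    y = max(yminmax[i][2], yminmax[j][2]) + dy
    xs = [xt[i], xt[i], xt[j], xt[j]]
    ys = [y - tick, y, y, y - tick]
    label_pos = ((xt[i] + xt[j]) / 2, y)  # midpoint of the bar, for the text
    return xs, ys, label_pos
end
```

The result can then be drawn with something like `plot!(xs, ys, c=:black)` followed by `annotate!(label_pos[1], label_pos[2], text("*", :bottom, 8))`. Because `dy` is the same for every pair, each bar sits at a consistent height above the taller of its two boxes.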