Adding N to each box

I’m trying to show the sample size of each box in a boxplot using AoG. I’m looking for a robust solution that makes it easy to see the number of elements in each box regardless of whether we use faceting, colors, dodge, etc. Is this an impossible task?

You want the categories keyword of scales(), which you pass to draw():

using AlgebraOfGraphics
using CairoMakie
using Random

Random.seed!(42)

# Data setup
counts = Dict("A" => 30, "C" => 22, "D" => 45)
groups = vcat([fill(g, counts[g]) for g in ["A", "C", "D"]]...)
# Named `vals` rather than `values` to avoid clashing with Base.values
vals = vcat(
    randn(counts["A"]) .* 1.2 .+ 1.0,
    randn(counts["C"]) .* 0.8 .+ 3.5,
    randn(counts["D"]) .* 1.5 .+ 4.5,
)

df = (; group = groups, value = vals)
spec = data(df) * mapping(:group, :value) * visual(BoxPlot)

f = Figure(size = (1200, 400))

# Helper to create the pair mappings for AoG scales
# Returns something like: ["A" => "A\nN=30", ...]
function label_with_n(keys)
    return [k => "$k\nN=$(counts[k])" for k in keys]
end

# ae1: Default order (Alphabetical)
ae1 = draw!(f[1, 1], spec, scales(X = (; categories = label_with_n(["A", "C", "D"]))))

# ae2: Custom order ("D", "A", "C")
ae2 = draw!(f[1, 2], spec, scales(X = (; categories = label_with_n(["D", "A", "C"]))))

# ae3: Using your[1] explicit dictionary-style mapping
# Note: AoG expects an AbstractVector of Pairs for the categories scale
ae3_labels = ["A" => "A\nN=$(counts["A"])", "C" => "C\nN=$(counts["C"])", "D" => "D\nN=$(counts["D"])"]
ae3 = draw!(f[1, 3], spec, scales(X = (; categories = ae3_labels)))

f


[1] Yes, I got help with this. I took a swing at it, missed, thought about it, and properly presented the question to Gemini.
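
The counts in the example above are typed in by hand, but they can also be derived from the data. Below is a minimal sketch in plain Julia; `labels_with_n` is a hypothetical helper (not part of AlgebraOfGraphics) that builds the same kind of vector of pairs used for the categories scale above. It only covers one count per x category, so it does not handle color grouping or dodge.

```julia
# Hypothetical helper (not an AoG function): build "level => label" pairs
# from the data, so the counts never have to be written out by hand.
# `order` defaults to the sorted unique levels; pass a vector to reorder.
function labels_with_n(groups; order = sort(unique(groups)))
    return [g => "$g\nN=$(count(==(g), groups))" for g in order]
end

# Same group sizes as in the example above
groups = [fill("A", 30); fill("C", 22); fill("D", 45)]

default_labels = labels_with_n(groups)                           # alphabetical order
custom_labels = labels_with_n(groups; order = ["D", "A", "C"])   # custom order
```

The result has the same shape as the `label_with_n` output above, so it should be usable in the same place, e.g. `scales(X = (; categories = labels_with_n(df.group)))`.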

Thanks! This might help, but what if we have this?

Also, I guess the point of AoG is that there should be no need to write factor levels, right?

I had a PR for aggregations (Add flexible `aggregate` analysis layer by jkrumbiegel, Pull Request #696, MakieOrg/AlgebraOfGraphics.jl), but after already merging it into a 0.12 branch, I didn’t use it after all. I found it had too many edge cases and not enough flexibility. Until I can come up with a better solution, I decided to leave aggregations to data wrangling packages, but your example is a good illustration of why that can be annoying. As you say, you’d like this to work with color grouping, dodge, whatever. Making a custom AoG analysis type just for this would work but is a bit heavy-handed. Also, the one below only works with a small fix to the dodging resolution mechanism in AoG that I still need to make (another annoyance here is that boxplot needs dodge while Text needs the generic dodge_x):

using AlgebraOfGraphics, CairoMakie
using AlgebraOfGraphics: transformation, ProcessedLayer, Verbatim

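# Custom analysis: replaces each group's data with one "n=…" text label per x category.
struct GroupCount end

function (::GroupCount)(input::ProcessedLayer)
    # `map` visits every group (e.g. every color level) of the processed layer
    output = map(input) do p, n
        xs, ys = p
        uxs = sort(unique(xs))
        # number of observations per x category
        counts = [count(==(x), xs) for x in uxs]
        # largest y per x category, used as the label position
        maxys = [maximum(ys[xs .== x]) for x in uxs]
        return (uxs, maxys), Dict(:text => [Verbatim("n=$c") for c in counts])
    end
    return ProcessedLayer(output; plottype = Makie.Text)
end

# Layer constructor, used like any other AoG analysis
groupcount() = transformation(GroupCount())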

df = (;
    group = [fill("A", 10); fill("A", 20); fill("B", 30); fill("B", 50)],
    color = [fill("x", 10); fill("y", 20); fill("x", 30); fill("y", 50)],
    value = [randn(10) .+ 1; randn(20) .+ 2; randn(30) .+ 2; randn(50) .+ 3],
)
draw(data(df) * mapping(:group, :value, color = :color) *
    (visual(BoxPlot) * mapping(dodge = :color) + groupcount() * mapping(dodge_x = :color)))
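
For comparison, the data-wrangling route mentioned above (computing the counts outside AoG) could look roughly like this. It is a minimal sketch in plain Julia under stated assumptions: `count_table` is a hypothetical helper, not an AoG function, and it only produces the label table. How that table is layered as text on top of the dodged boxes is left out, since that depends on the dodge fix described above.

```julia
# Hypothetical helper: one row per (group, color) combination, holding the
# sample size label and the largest value (a natural anchor above the box).
function count_table(group, color, value)
    combos = sort(unique(collect(zip(group, color))))
    # number of observations in each (group, color) combination
    n = [count(i -> (group[i], color[i]) == c, eachindex(value)) for c in combos]
    # largest value in each combination
    ymax = [
        maximum(value[i] for i in eachindex(value) if (group[i], color[i]) == c)
        for c in combos
    ]
    return (;
        group = first.(combos),
        color = last.(combos),
        ymax,
        label = ["n=$k" for k in n],
    )
end

# Same layout as the example data above
group = [fill("A", 10); fill("A", 20); fill("B", 30); fill("B", 50)]
color = [fill("x", 10); fill("y", 20); fill("x", 30); fill("y", 50)]
value = [randn(10) .+ 1; randn(20) .+ 2; randn(30) .+ 2; randn(50) .+ 3]

tbl = count_table(group, color, value)
```

The trade-off is the one described above: the aggregation has to be repeated by hand for every grouping variable, whereas an analysis layer picks the groups up from the mapping.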

As an alternative to using the categories keyword, @jules's solution seems to address this, except for the notch, which can be added with show_notch = true. Not sure what you mean by writing factor levels.

Thanks, wow! I tried it with the latest version of AoG but I get this error:

ERROR: Dictionaries.IndexError("Dictionary does not contain index: text")
Stacktrace:
  [1] getindex
    @ ~/.julia/packages/Dictionaries/P05gc/src/AbstractDictionary.jl:46 [inlined]
  [2] (::AlgebraOfGraphics.var"#to_entry##8#to_entry##9"{Dictionaries.Dictionary{…}, Dictionaries.Dictionary{…}, Dictionaries.Dictionary{…}, Dictionaries.Dictionary{…}, Symbol, Type{…}, Type{…}})(::Pair{Symbol, Vector{…}})
    @ AlgebraOfGraphics ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/algebra/layers.jl:546
  [3] map!(f::AlgebraOfGraphics.var"#to_entry##8#to_entry##9"{…}, out::Dictionaries.Dictionary{…}, d::Dictionaries.PairDictionary{…})
    @ Dictionaries ~/.julia/packages/Dictionaries/P05gc/src/map.jl:66
  [4] map
    @ ~/.julia/packages/Dictionaries/P05gc/src/map.jl:106 [inlined]
  [5] to_entry(P::Type, p::ProcessedLayer, categoricalscales::Dictionaries.Dictionary{Type{…}, Dictionaries.Dictionary{…}}, continuousscales::Dictionaries.Dictionary{Type{…}, Dictionaries.Dictionary{…}})
    @ AlgebraOfGraphics ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/algebra/layers.jl:545
  [6] to_entry(p::ProcessedLayer, categoricalscales::Dictionaries.Dictionary{Type{…}, Dictionaries.Dictionary{…}}, continuousscales::Dictionaries.Dictionary{Type{…}, Dictionaries.Dictionary{…}})
    @ AlgebraOfGraphics ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/algebra/layers.jl:502
  [7] #compute_entries_continuousscales##6
    @ ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/algebra/layers.jl:185 [inlined]
  [8] iterate
    @ ./generator.jl:48 [inlined]
  [9] collect_to!
    @ ./array.jl:848 [inlined]
 [10] collect_to_with_first!
    @ ./array.jl:826 [inlined]
 [11] _collect(c::Vector{ProcessedLayer}, itr::Base.Generator{Vector{…}, AlgebraOfGraphics.var"#compute_entries_continuousscales##6#compute_entries_continuousscales##7"{…}}, ::Base.EltypeUnknown, isz::Base.HasShape{1})
    @ Base ./array.jl:820
 [12] collect_similar
    @ ./array.jl:732 [inlined]
 [13] map
    @ ./abstractarray.jl:3372 [inlined]
 [14] #compute_entries_continuousscales##4
    @ ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/algebra/layers.jl:184 [inlined]
 [15] #Generator##0
    @ ./generator.jl:37 [inlined]
 [16] iterate
    @ ./generator.jl:48 [inlined]
 [17] collect(itr::Base.Generator{Base.Iterators.Zip{Tuple{…}}, Base.var"#Generator##0#Generator##1"{AlgebraOfGraphics.var"#compute_entries_continuousscales##4#compute_entries_continuousscales##5"{…}}})
    @ Base ./array.jl:790
 [18] map
    @ ./abstractarray.jl:3526 [inlined]
 [19] compute_entries_continuousscales(pls_grid::Matrix{Vector{…}}, categoricalscales::Dictionaries.Dictionary{Type{…}, Dictionaries.Dictionary{…}}, scale_props::Dictionaries.Dictionary{Type{…}, Dictionaries.Dictionary{…}})
    @ AlgebraOfGraphics ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/algebra/layers.jl:183
 [20] compute_axes_grid(d::Layers, scales::AlgebraOfGraphics.Scales; axis::Dictionaries.Dictionary{Symbol, Any})
    @ AlgebraOfGraphics ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/algebra/layers.jl:349
 [21] compute_axes_grid
    @ ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/algebra/layers.jl:314 [inlined]
 [22] _draw(d::Layers, scales::AlgebraOfGraphics.Scales; axis::Dictionaries.Dictionary{…}, figure::Dictionaries.Dictionary{…}, facet::Dictionaries.Dictionary{…}, legend::Dictionaries.Dictionary{…}, colorbar::Dictionaries.Dictionary{…})
    @ AlgebraOfGraphics ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/draw.jl:186
 [23] _draw
    @ ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/draw.jl:181 [inlined]
 [24] draw(d::Layers, scales::AlgebraOfGraphics.Scales; axis::@NamedTuple{}, figure::@NamedTuple{}, facet::@NamedTuple{}, legend::@NamedTuple{}, colorbar::@NamedTuple{}, palette::Nothing)
    @ AlgebraOfGraphics ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/draw.jl:112
 [25] draw
    @ ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/draw.jl:100 [inlined]
 [26] draw(d::Layers)
    @ AlgebraOfGraphics ~/.julia/packages/AlgebraOfGraphics/T2K4z/src/draw.jl:100
 [27] top-level scope
    @ REPL[29]:1
 [28] top-level scope
    @ REPL:1
Some type information was truncated. Use `show(err)` to see complete types.

In the plot made by jules, A and B would be factors (or groups) and x and y their levels.

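That’s why I wrote above that it only works with a small fix to the dodging resolution mechanism in AoG, which I still need to make :slight_smile: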

One thing I’ve thought about adding is letting dodge_x also be used with recipes that already have their own dodge; it would work as long as the dodge is actually in x. That would make it easier to layer recipes that have dodge with those that don’t.

Oh sorry! Anyway, if you can let me know when this can be used at least from a devel version, it would be great! :blush:
In the meantime I’ll try to study the solution.