Ordering the y-axis values in a dotplot

Hi, I’m trying to make a “dotplot”, e.g. see here, with Plots.jl. It mostly works, but I would like to order the categorical values that are shown on the y-axis.

Here is what I have so far,

[image: newplot (9)]

It is produced by

@df d2d scatter(:when, :volume, xrotation=30, markersize=4, markeralpha=0.01, legend=nothing, group=:mover)

:volume is a categorical column. I have sorted the levels, …

> levels(d2d[:volume])
14-element Array{String,1}:
 "VPD430(GM2)"
 "VPD434(GM2)"
 "VPD435(GM2)"
 "VPD436(GM2)"
 "VPD440(GM2)"
 "VPD441(GM2)"
 "VPD442(GM2)"
...

But the plot does not honor this sorting. I even sorted the whole DataFrame by :volume and that didn’t change the ordering of the values on the y-axis. Does anyone know how to order the categorical values on the y-axis? Thanks! – Adam

An easy way to do it would be to replace :volume with a number specifying the y location, then override yticks to show the names, e.g. scatter(:when, :numvolume, yticks = (1:14, levels(:volume))). But agreed, it would make sense if StatPlots respected the level order of ordered categorical columns.
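For what it's worth, here is a minimal sketch of that workaround using only Base Julia. The data and the names `volume`, `lvls`, `ypos`, `numvolume` and `tickspec` are made up for illustration; in the real case the levels would come from levels(d2d[:volume]).

```julia
# Hypothetical stand-ins for the :volume column and its desired level order.
volume = ["VPD434(GM2)", "VPD430(GM2)", "VPD434(GM2)", "VPD440(GM2)"]
lvls   = ["VPD430(GM2)", "VPD434(GM2)", "VPD440(GM2)"]

# Map each level name to its position in the desired order.
ypos = Dict(name => i for (i, name) in enumerate(lvls))

# Numeric y location for every row.
numvolume = [ypos[v] for v in volume]

# Tick specification: positions paired with the labels to show there.
tickspec = (1:length(lvls), lvls)

# With Plots.jl loaded (assumption, not run here), something like
#   scatter(when, numvolume, yticks = tickspec)
# would then place the points and label the y-axis in level order.
```

The plot gets plain numbers, so the axis order is whatever `lvls` says, independent of how Plots treats strings.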

Thanks! Along what you suggested, this works nicely (though is very verbose)…

@df d2d scatter(:when, d2d[:volume].refs, ticks=(1:length(levels(:volume)), levels(:volume)), xrotation=30, markersize=4, markeralpha=0.01, group=:mover)

I wonder if it is an issue with categorical arrays or with @df (see discussion here).

Does

scatter(d2d[:when], d2d[:volume])

produce the correct order?

Doesn’t look good…

> scatter(d2d[:when], d2d[:volume])
No user recipe defined for CategoricalArrays.CategoricalString{UInt32}

Looks like a bug; ideally we should support any AbstractArray. Could you open an issue in the Plots.jl repository?

julia> CategoricalArrays.CategoricalString{UInt32} <: AbstractArray
false

Oops, my bad. So the problem is not supporting the container type but supporting categorical values. In particular, it seems we should allow categorical values in Plots, using their own ordering. I'm not sure why the @df macro would work at all then; I will investigate.

EDIT: actually at least in my hands it does not work:

julia> @df df scatter(:x, :x)
ERROR: No user recipe defined for CategoricalArrays.CategoricalValue{Int64,UInt32}

I wonder if this is something the @df macro should take care of or if it’s actually simpler to fix it in Plots (may be as simple as widening some method signatures to accept categorical values).

Would be best to fix in Plots IMHO. Maybe just having a type recipe on them? I guess the ordering means they need some special treatment. Essentially they should be treated more like symbols/strings in the code I guess.

Yes, I think they should be treated like strings, and comparing with isless or sorting with sort will automatically give the correct order.
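To illustrate the point about isless and sort, here is a small sketch. It assumes a reasonably recent CategoricalArrays.jl (one that provides `levelcode`); the vector `v` and the level order are invented for the example.

```julia
using CategoricalArrays

# Hypothetical values; the levels are deliberately put in a non-alphabetical order.
v = categorical(["VPD434(GM2)", "VPD430(GM2)", "VPD440(GM2)"])
levels!(v, ["VPD440(GM2)", "VPD430(GM2)", "VPD434(GM2)"])

# levelcode gives each value's integer position in the level order.
codes = levelcode.(v)

# sort compares categorical values by level order, not alphabetically.
sorted = sort(v)

# isless follows the same ordering: "VPD440(GM2)" is the first level here.
first_is_smaller = isless(v[3], v[1])
```

So if Plots sorted the unique values with sort or isless, as it presumably would for strings, the axis should come out in level order.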