How do you modify series attributes independently?
For example, given data with columns [x, y, color, shape], I would like a 2D scatter plot in which each data point is drawn with the color and shape from its own row.
I have tried
@df df scatter(:x, :y, group=:color, color=[:red, :blue, :green])
@df df scatter!(:x, :y, group=:shape, shape=[:circle, :rect, :star5])
but this just overlays another group of points on top of the preexisting ones.
I am also familiar with
@df df scatter(:x, :y, group=(:color,:shape))
as in the StatsPlots documentation, but this is not what I want: it expands the two columns into all their permutations, instead of letting me modify each attribute individually.
The issue is that the attributes are being passed as vectors instead of row matrices. Try it like this:
using DataFrames, StatsPlots
df = DataFrame(x=1:10, y=rand(10), bin=rand(0:2,10))
@df df scatter(:x, :y, group=:bin, color=[:red :blue :green], shape=[:circle :rect :star5])
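To see why the bracket style matters, here is a minimal plain-Julia sketch (no plotting involved) of the difference between the two literals. As far as I understand Plots, a row matrix supplies one attribute value per series (one column per group), while a vector is read as values within a single series. The variable names below are only for illustration.

```julia
# Commas build a column vector; spaces build a 1×N row matrix.
colors_vec = [:red, :blue, :green]   # 3-element Vector{Symbol}
colors_row = [:red :blue :green]     # 1×3 Matrix{Symbol}

println(size(colors_vec))  # (3,)
println(size(colors_row))  # (1, 3)

# permutedims converts an existing vector into the row-matrix form,
# which is handy when the attribute values are built programmatically.
println(permutedims(colors_vec) == colors_row)  # true
```

So if the colors already live in a vector, `color=permutedims(colors_vec)` should be equivalent to writing the row matrix by hand.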
Hi, thank you for your reply. However, this is not the problem I have. In your example, you are just assigning a unique color & shape combination to each element in the range of “bin”.
Instead of having just one extra variable in addition to x-y space, like “bin”, I have two: “color” and “shape”. For each (x,y) data point, I would like to assign its color based on the “color” column and its shape based on the “shape” column.
As I mentioned, I am familiar with
@df df scatter(:x, :y, group=(:color,:shape))
which lets me group by the tuple (:color, :shape), but this requires me to set a color-shape combination for every possible permutation of the two columns' values.
I would much prefer to set the color based on just :color, and then set the shape based on just :shape. This would have the additional benefit of having two independent legends. Below is an example of such a plot done in R, to aid understanding of what my question is about.
You may try the code below, which works around StatsPlots to produce this kind of plot:
Plots / StatsPlots code
using DataFrames, StatsPlots; gr(dpi=600, size=(600,400))
stream = ["Glide", "Pool", "Riffle"]
colors = [:red, :blue, :green]
reach = ["Reach A", "Reach B", "Reach C"]
shapes = [:circle, :rect, :star5]
df = DataFrame(x=1:20, y=rand(20), reach=rand(reach,20), stream=rand(stream,20))
dshapes = Dict(reach .=> shapes)
dcolors = Dict(stream .=> colors)
layout = @layout [a b{0.1w}]
p1 = @df df scatter(:x, :y, color=getindex.((dcolors,), :stream), shape=getindex.((dshapes,), :reach), ms=4)
v = [[0] for _ in 1:length(reach)];
p2 = scatter(v, v, lims=(-2,-2), framestyle=:none, mc=:white, label=" " .* permutedims(reach), shape=permutedims(shapes), legend=:top, legendfontsize=3, ms=3, msw=0.5)
v = [[0] for _ in 1:length(stream)];
scatter!(v, v, lims=(-2,-2), framestyle=:none, mc=permutedims(colors), label=" " .* permutedims(stream), ms=4, msw=0, fg_legend = :transparent)
plot(p1, p2, layout=layout)
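The key step in the p1 line is the per-point lookup `getindex.((dcolors,), :stream)`. It can be checked on its own without plotting. The sketch below is plain Julia and uses a small hypothetical column `col` in place of the `:stream` column from the DataFrame.

```julia
stream = ["Glide", "Pool", "Riffle"]
colors = [:red, :blue, :green]
dcolors = Dict(stream .=> colors)   # "Glide" => :red, "Pool" => :blue, "Riffle" => :green

# Hypothetical column values, standing in for df.stream
col = ["Pool", "Glide", "Pool", "Riffle"]

# Wrapping the Dict in a 1-tuple makes broadcasting treat it as a scalar,
# so each element of col is looked up in the same Dict.
percolor = getindex.((dcolors,), col)
println(percolor)  # [:blue, :red, :blue, :green]

# An equivalent comprehension, which some may find easier to read
percolor2 = [dcolors[s] for s in col]
```

The result is one color per row, which is what gets passed as the `color` attribute of a single ungrouped series. `Ref(dcolors)` can be used in place of the 1-tuple with the same effect.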