Stacked bar graphs

svpillai · December 11, 2018, 9:08am

Hello all, I’m quite new to Julia and was a bit lost on how to create stacked bar graphs. Ideally I’d like to have 8 bars, grouped in pairs. Each bar would have 5 stacks that add up to 100%; this is basically plotting a variance decomposition. So it would look like 4 sets of 2 bars each. Each pair would have the same repeating labels, “Conditional” and “Unconditional Variance”, and the stacks would represent shocks/errors that have their own titles. Each pair would represent testing a certain model, i.e. 4 models being tested in total.

I’d appreciate any and all help, thank you!

Mattriks · December 11, 2018, 10:06am

In Gadfly:

import Cairo
using DataFrames, Gadfly, RDatasets
D = dataset("datasets","HairEyeColor")
palette = ["brown","blue","tan","green"]

p = plot(D, x=:Sex, y=:Freq, color=:Eye, xgroup=:Hair,
    Geom.subplot_grid(Geom.bar(position=:stack)),
    Scale.color_discrete_manual(palette...), 
    Guide.xlabel("Hair color"))

# draw(PNG("haireyecolor.png", 6.6inch, 4inch), p)
draw(PNG(6.6inch, 4inch), p)

[Image: haireyecolor]

For more Gadfly, see the plot gallery.

mkborregaard · December 11, 2018, 10:09am

An alternative is to use StatPlots: https://github.com/JuliaPlots/StatPlots.jl#grouped-bar-plots

svpillai · December 11, 2018, 1:03pm

I’ve seen this and attempted it with StatPlots, but I’m not sure where to input the values (manually) for the stacks? And to group them in pairs?

svpillai · December 11, 2018, 1:03pm

great, thank you! If I wanted to input the values manually, how would this be arranged? I.e. in what order?

svpillai · December 11, 2018, 1:03pm

I was looking at attempting what you did in this: https://discourse.julialang.org/t/plots-bars-side-by-side but with manually adding stacks and putting in these values for each bar having 5 stacks.

piever · December 11, 2018, 1:36pm

Interestingly enough I think StatPlots and StatsMakie are both a bit ill-equipped to both stack and dodge bars at the same time. It should be possible to add this feature to StatsMakie with the new grouping API though.

Mattriks · December 11, 2018, 4:45pm

@svpillai The easiest way to enter data is to make a DataFrame. e.g.

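# Set up the DataFrame
D = [[x model prop] for x in ["Conditional", "Unconditional"],
        model in "Model".*["1","2","3","4"], prop in 1:5]
D = DataFrame(vcat(D...), [:Variance, :Model, :Prop])

# Now enter the corresponding values in a new column like this
# (one value per row of D, 40 in total; the vector below is left incomplete).
# On current DataFrames.jl versions, write D[!, :values] = ... instead.
D[:values] = [20, 40, ]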

For more DataFrames, see the DataFrames docs.
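On the question of what order the values go in: here is a minimal sketch, in plain Julia with no packages, that lists the row order produced by the comprehension above. The names `variances`, `models` and `rows` are only for illustration. The first iterator in the comprehension varies fastest, so the values alternate Conditional/Unconditional, then step through the models, then through the 5 stacks.

```julia
# Same three iterators as in the DataFrame setup above, collected as tuples
variances = ["Conditional", "Unconditional"]
models = "Model" .* ["1", "2", "3", "4"]

# A 2x4x5 array flattened in column-major order: the first iterator varies fastest
rows = vec([(v, m, p) for v in variances, m in models, p in 1:5])

# Print the first few rows to see the order the values column must follow
for r in rows[1:4]
    println(r)
end
# ("Conditional", "Model1", 1)
# ("Unconditional", "Model1", 1)
# ("Conditional", "Model2", 1)
# ("Unconditional", "Model2", 1)
```

So the first 8 entries of the values column belong to stack 1 of each bar, the next 8 to stack 2, and so on.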

mkborregaard · December 11, 2018, 9:15pm

Yes, but you can do it manually by defining the matrix values for groupedbars and passing the x positions to x. In this case I find the gadfly way nicer, though.
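A rough sketch of what that manual approach could look like, assuming the StatsPlots package (StatPlots was later renamed StatsPlots). The numbers are placeholders, not real data: every stack is set to 20 so each bar sums to 100. The x positions, tick labels and shock names are also made up for illustration.

```julia
using StatsPlots  # assumption: the package is installed under its current name

# One row per bar (8 bars), one column per stack (5 shocks).
# Placeholder values only: equal shares of 20% so each row sums to 100.
vals = fill(20.0, 8, 5)

# Hypothetical x positions: leave a gap after every second bar to form 4 pairs
xpos = [1, 2, 4, 5, 7, 8, 10, 11]

# Repeating tick labels for the two bars in each pair (illustrative names)
barlabels = repeat(["Cond.", "Uncond."], 4)

groupedbar(xpos, vals;
    bar_position = :stack,            # stack the columns instead of dodging them
    bar_width = 0.8,
    xticks = (xpos, barlabels),
    label = ["Shock 1" "Shock 2" "Shock 3" "Shock 4" "Shock 5"],
    ylabel = "Share of variance (%)")
```

Replace `vals` with your own 8x5 matrix, where row i holds the five shares of bar i.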

davidanthoff · December 13, 2018, 2:06am

Here is the VegaLite.jl way:

using VegaLite, RDatasets

df = dataset("datasets", "HairEyeColor")

df |> @vlplot(:bar, x=:Sex, y=:Freq, color=:Eye, column=:Hair)

and you get:
[Image: visualization]

svpillai · December 17, 2018, 9:54am

this is perfect, thank you!

svpillai · December 21, 2018, 11:05am

Hi @davidanthoff, I tried this exact same code (all packages added), and the code ran without error messages, but Atom/Juno’s Plots pane just displayed an empty black screen. I tried creating my own data frame as well, but it showed an empty box in the Plots pane and displayed messages saying “Invalid field type “undefined” for channel x” and the same for y, color, and column. I’d appreciate any help on this.

davidanthoff · December 21, 2018, 9:16pm

So when I run exactly the code I posted above in Juno/Atom, I get a plot with a black background that is hard to read. When I change things to df |> @vlplot(:bar, x=:Sex, y=:Freq, color=:Eye, column=:Hair, background=:white), all looks as it should. Does that work for you? Or are you seeing a completely black plot pane?

svpillai · December 24, 2018, 3:36pm

This worked perfectly somehow once I restarted Juno. However, when I use my own data frame with @Mattriks’s method above, I get this warning (and then an empty plot):
WARN Invalid field type "undefined" for channel "x", using "quantitative" instead.
WARN Invalid field type "undefined" for channel "color", using "nominal" instead.
WARN Invalid field type "undefined" for channel "y", using "quantitative" instead.

What am I doing wrong here?

svpillai · December 24, 2018, 4:37pm

@Mattriks every time I try this code, and/or versions of it with other datasets, everything runs fine until the very end with the plot command, I get the following error:
ERROR: UndefVarError: plot not defined
Stacktrace:
[1] top-level scope at none:0

Is this a bug? I’m using Juno/Atom Julia v 1.0.2.

davidanthoff · December 24, 2018, 5:05pm

The code from @Mattriks creates a DataFrame where the columns have an element type of Any. That is not ideal, because it means that VegaLite.jl can’t determine what type of values are in each column, and the warning you are seeing essentially means “I’m going to guess a type for each column”. There is in general nothing wrong with that, i.e. it seems to guess the right column formats here.

If you want to get rid of these messages, there are two ways: 1) create a DataFrame with typed columns, or 2) manually tell VegaLite.jl what type of encoding you want to use for each channel. That would look like this:

df |>
  @vlplot(:bar, 
    x="col1:q", 
    y="col2:q", 
    color="col3:n", 
    column="col4:n")

Note how I’m now passing strings instead of symbols, and in addition to the name of the column, I’m also passing what type of encoding this should be, after the colon (:). The details of that are documented here. So I think you need to make sure that you pick the right type there for each encoding.

But overall, I would try to go with option 1). A DataFrame with Any columns is generally very inefficient.
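A minimal sketch of option 1), assuming DataFrames.jl is installed. Building each column as its own vector gives concrete element types instead of Any. The names `combos` and `df` are illustrative, and the `values` column is filled with a placeholder of 20.0, not real data.

```julia
using DataFrames

variances = ["Conditional", "Unconditional"]
models = "Model" .* string.(1:4)
props = 1:5

# All 40 combinations; the first argument varies fastest,
# matching the row order of the comprehension-based setup above
combos = vec(collect(Iterators.product(variances, models, props)))

# Each column is built as its own vector, so it keeps a concrete element type
df = DataFrame(
    Variance = getindex.(combos, 1),      # Vector{String}
    Model = getindex.(combos, 2),         # Vector{String}
    Prop = getindex.(combos, 3),          # Vector{Int}
    values = fill(20.0, length(combos)),  # placeholder shares, replace with real data
)

println(eltype.(eachcol(df)))
```

With typed columns like these, VegaLite.jl should be able to infer the encoding types without the warnings.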

pdeffebach · December 24, 2018, 6:40pm

Should VegaLite.jl do something like identity.(x) as an intermediate step so the eltype becomes narrower?
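For readers unfamiliar with the trick, a small plain-Julia sketch of what `identity.(x)` does: broadcasting rebuilds the vector and picks the element type from the actual values. The vectors here are made up for illustration.

```julia
# A vector whose declared element type is Any, although every value is a Float64
x = Any[20.0, 40.0, 15.0]

# Broadcasting identity allocates a new vector with the narrowest common type
y = identity.(x)

println(eltype(x))  # Any
println(eltype(y))  # Float64

# If the values really are of mixed types, the result stays Any
mixed = identity.(Any[1, "a"])
println(eltype(mixed))  # Any
```

Note that this makes a full copy of the data, which is the extra pass over the input mentioned in the reply below.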

davidanthoff · December 25, 2018, 7:35pm

That would add another pass over all the input data, and in some cases that might end up being very inefficient… There are probably some things one could do that would be safe and would help a bit, but my gut feeling is that really we should steer folks towards not using Any columns, as really everything will be inefficient with them…

svpillai · December 26, 2018, 10:01pm

@davidanthoff This worked perfectly, thank you so much!

davidanthoff · January 3, 2019, 6:10pm

Like this:

df |>
  @vlplot(:bar,
    x={:Sex, title=nothing},
    y={:Freq, title=nothing}, 
    color={:Eye, title=nothing}, 
    column={:Hair, title=nothing}
  )

The trick is that you now pass a composite value in curly brackets {} to x etc., and then you can configure more details there. All the config options for axis title are described here.

I think you need to configure your scales a bit for that. It might be enough to configure the scale to not try to come up with nice numbers on the axis like this (documented here):

df |>
  @vlplot(:bar, x=:Sex, y={:Freq, scale={nice=false}}, color=:Eye, column=:Hair)

But you might also have to additionally configure your domain:

df |>
  @vlplot(:bar,
    x=:Sex,
    y={:Freq, scale={domain=[0,100], nice=false}},
    color=:Eye, column=:Hair
  )

I’m not entirely sure why this doesn’t work automatically. I suspect that there is a small numerical rounding error that pushes some value just slightly above 100, and then vega-lite thinks it needs to extend the axis or something like that…

You can specify a sort property for the encoding, documented here:

df |>
  @vlplot(:bar,
    x=:Sex,
    y=:Freq,
    color=:Eye,
    column={:Hair, sort=["Red", "Blond", "Brown", "Black"]}
  )
