VegaLite.jl: simple grouped bar chart

I’ve been fighting with this on and off and haven’t gotten it to work. I want to make a simple grouped bar chart out of a simple DataFrame, but I haven’t seen an example that gets to the point quickly.

Suppose the DataFrame is a simple one with a few columns like Year, AmountOfPeople and AmountOfPeopleCasualty. I simply want a bar chart with the year on x and, on y, both the number of people and the number of casualties (out of these people), with the two bars placed side by side for each year.

Here is an example of a grouped bar chart. Currently, you essentially have to create a faceted figure.

I believe the vega-lite team is also working on something better, see here.

An MWE based on that example:

julia> using DataFrames,VegaLite

julia> df=DataFrame(year=["90","91","92"],people=[10,12,14],cas=[5,6,7])
3×3 DataFrame
│ Row │ year   │ people │ cas   │
│     │ String │ Int64  │ Int64 │
├─────┼────────┼────────┼───────┤
│ 1   │ 90     │ 10     │ 5     │
│ 2   │ 91     │ 12     │ 6     │
│ 3   │ 92     │ 14     │ 7     │

julia> year=Array{String,1}()
0-element Array{String,1}

julia> name=Array{String,1}()
0-element Array{String,1}

julia> values=Array{Int,1}()
0-element Array{Int64,1}

julia> for index in 1:length(df.year)
           push!(year,df.year[index])
           push!(year,df.year[index])
           push!(name,"p")
           push!(name,"c")
           push!(values,df.people[index])
           push!(values,df.cas[index])
       end

julia> new_df=DataFrame(year=year,name=name,values=values)
6×3 DataFrame
│ Row │ year   │ name   │ values │
│     │ String │ String │ Int64  │
├─────┼────────┼────────┼────────┤
│ 1   │ 90     │ p      │ 10     │
│ 2   │ 90     │ c      │ 5      │
│ 3   │ 91     │ p      │ 12     │
│ 4   │ 91     │ c      │ 6      │
│ 5   │ 92     │ p      │ 14     │
│ 6   │ 92     │ c      │ 7      │

julia> new_df |> @vlplot(
                  :bar,
                  column={"year:o"},
                  y={"values", axis={grid=false} },
                  x={"name:n", axis={title=""} },
                  color={"name:n"},
                  spacing=0,
                  )

This example uses VegaLite.jl's support for Vega specs.

Compared with the example above, I have renamed the columns so they are more distinct from the keywords used in Vega (e.g. “values” and “value”).

This is the starting point, based on the OP's description:

using DataFrames,VegaLite
df=DataFrame(Year=["90","91","92"],AmountOfPeople=[10,12,14],AmountOfPeopleCasualty=[5,6,7])

julia> df
3×3 DataFrame
│ Row │ Year   │ AmountOfPeople │ AmountOfPeopleCasualty │
│     │ String │ Int64          │ Int64                  │
├─────┼────────┼────────────────┼────────────────────────┤
│ 1   │ 90     │ 10             │ 5                      │
│ 2   │ 91     │ 12             │ 6                      │
│ 3   │ 92     │ 14             │ 7                      │

First we have to restructure the data into long format, with one row per combination of Year and AmountOf:

# Columns of the long-format table
Year=Array{String,1}()
AmountOf=Array{String,1}()
Amount=Array{Int,1}()
for index in 1:length(df.Year)
	# Each source row yields two rows: one for people, one for casualties
	push!(Year,df.Year[index])
	push!(Year,df.Year[index])
	push!(AmountOf,"People")
	push!(AmountOf,"PeopleCasualty")
	push!(Amount,df.AmountOfPeople[index])
	push!(Amount,df.AmountOfPeopleCasualty[index])
end
new_df=DataFrame(Year=Year,AmountOf=AmountOf,Amount=Amount)

julia> new_df
6×3 DataFrame
│ Row │ Year   │ AmountOf       │ Amount │
│     │ String │ String         │ Int64  │
├─────┼────────┼────────────────┼────────┤
│ 1   │ 90     │ People         │ 10     │
│ 2   │ 90     │ PeopleCasualty │ 5      │
│ 3   │ 91     │ People         │ 12     │
│ 4   │ 91     │ PeopleCasualty │ 6      │
│ 5   │ 92     │ People         │ 14     │
│ 6   │ 92     │ PeopleCasualty │ 7      │

Now we create the grouped bar chart using a Vega JSON string:

julia> vg"""{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "width": 300,
  "height": 240,
  "padding": 5,

  "data": [
    {
      "name": "table"
    }
  ],

  "scales": [
    {
      "name": "xscale",
      "type": "band",
      "domain": {"data": "table", "field": "Year"},
      "range": "width",
      "padding": 0.2
    },
    {
      "name": "yscale",
      "type": "linear",
      "domain": {"data": "table", "field": "Amount"},
      "range": "height",
      "round": true,
      "zero": true,
      "nice": true
    },
    {
      "name": "color",
      "type": "ordinal",
      "domain": {"data": "table", "field": "AmountOf"},
      "range": {"scheme": "category20"}
    }
  ],

  "axes": [
    {"orient": "left", "scale": "yscale", "labelPadding": 4, "zindex": 1},
    {"orient": "bottom", "scale": "xscale"}
  ],

  "marks": [
    {
      "type": "group",

      "from": {
        "facet": {
          "data": "table",
          "name": "facet",
          "groupby": "Year"
        }
      },

      "encode": {
        "enter": {
          "x": {"scale": "xscale", "field": "Year"}
        }
      },

      "signals": [
        {"name": "width", "update": "bandwidth('xscale')"}
      ],

      "scales": [
        {
          "name": "pos",
          "type": "band",
          "range": "width",
          "domain": {"data": "facet", "field": "AmountOf"}
        }
      ],

      "marks": [
        {
          "name": "bars",
          "from": {"data": "facet"},
          "type": "rect",
          "encode": {
            "enter": {
              "x": {"scale": "pos", "field": "AmountOf"},
              "width": {"scale": "pos", "band": 1},
              "y": {"scale": "yscale", "field": "Amount"},
              "y2": {"scale": "yscale", "value": 0},
              "fill": {"scale": "color", "field": "AmountOf"}
            }
          }
        },
        {
          "type": "text",
          "from": {"data": "bars"},
          "encode": {
            "enter": {
              "x": {"field": "x", "offset": {"field": "width", "mult": 0.5}},
              "y": {"field": "y", "offset": 8},
              "fill": [
                {"value": "black"}
              ],
              "align": {"value": "left"},
              "baseline": {"value": "middle"},
              "text": {"field": "datum.AmountOf"},
              "angle": {"value": 90}
            }
          }
        }
      ]
    }
  ]
}"""(new_df, "table")

The following documentation was consulted:
https://www.queryverse.org/VegaLite.jl/stable/userguide/vega/

Just yesterday I merged a lot of new support for Vega specs! All of it is currently only available on VegaLite#master. In particular, there is now a @vgplot macro that works the same way as the @vlplot macro, just for Vega specs. The main benefit is that one no longer needs these literal JSON strings. The documentation for the new macro is here.

The grouped bar chart example looks like this with the new syntax. It is a little less nice than what @oheil posted because it has the data inline instead of using a DataFrame. It would probably make sense to update the example in the docs to use a DataFrame as well (this section in the docs explains how to pass external data to the @vgplot macro).

Oh, and one more small thing: try stack(df, Not(:Year), variable_name=:AmountOf, value_name=:Amount) for the DataFrame reshaping stuff.
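
For reference, here is a minimal sketch of how that stack call could replace the manual loop, using the OP-style DataFrame from earlier in the thread. The names long_df and plt are just illustrative, and the prefix stripping and sorting are optional steps added here so the result lines up with the hand-built new_df. The final @vlplot call is the faceted one from the MWE above with the column names swapped in; it has not been checked against every VegaLite.jl version.

```julia
using DataFrames, VegaLite

df = DataFrame(Year=["90","91","92"],
               AmountOfPeople=[10,12,14],
               AmountOfPeopleCasualty=[5,6,7])

# Wide -> long: every column except Year becomes one (AmountOf, Amount) row
long_df = stack(df, Not(:Year), variable_name=:AmountOf, value_name=:Amount)

# stack uses the original column names as labels; strip the shared prefix
# so the labels read "People" and "PeopleCasualty" (optional)
long_df.AmountOf = replace.(String.(long_df.AmountOf), "AmountOf" => "")

# Sort so the rows are interleaved per year, as in the loop version (optional)
sort!(long_df, [:Year, :AmountOf])

# Same faceted bar chart as in the MWE, with the renamed columns
plt = long_df |> @vlplot(
    :bar,
    column="Year:o",
    y={"Amount:q", axis={grid=false}},
    x={"AmountOf:n", axis={title=""}},
    color="AmountOf:n",
    spacing=0,
)
```

The row order does not matter for the plot itself; the sort is only there to make the table easier to compare with new_df.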

Thanks, both.

It all looks very good. I hope to get to it soon.