A preview of `StatisticalGraphics`, a new package for data visualisation in Julia

I am very excited to share a preview of StatisticalGraphics, a package that I have developed for statistical data visualisation in Julia. The goal of the package is a simple-to-use yet powerful tool for statistical graphics that lets users customise every part of the output.

The main function exported by the package is sgplot. Users pass an AbstractDataset (see InMemoryDatasets.jl) as the first positional argument and a plot type or a vector of plot types (like Scatter, Line, Bar, …) as the second positional argument. The sgplot function produces the requested plots from the values in the data set and overlays all of them on a single graph.

Interestingly, when users pass a grouped data set (created by the groupby, groupby!, or gatherby functions - see Group observations), sgplot produces a separate graph for each group of observations and puts each graph in a separate panel.

The following examples illustrate the basic ideas of the package:

  • Load packages and generate a sample data set
using StatisticalGraphics
using InMemoryDatasets

ds = Dataset(x=1:100, y=rand(100), group=rand(["g1", "g2"], 100));

  • Create a line plot
sgplot(ds, Line(x=:x, y=:y))

graph1

  • More than one plot type can be passed as the second positional argument:
sgplot(ds, [
              Line(x=:x, y=:y),
              Scatter(x=:x, y=:y)
           ])

graph2

  • Most plot types in StatisticalGraphics support the group keyword argument, which creates the same plot for each group of observations. The final graph is an overlay plot in which the groups are distinguished by colour:
sgplot(ds, [
              Line(x=:x, y=:y, group=:group),
              Scatter(x=:x, y=:y, group=:group)
           ])

graph3

  • To produce a panel of graphs, users can pass a grouped data set as the first positional argument:
sgplot(groupby(ds, :group), [
              Line(x=:x, y=:y, group=:group),
              Scatter(x=:x, y=:y, group=:group)
           ])

graph4
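
The text above also lists gatherby as a way to create a grouped data set. The following is a minimal sketch of the same panel example using gatherby in place of groupby; it assumes that gatherby accepts the same (data set, column) arguments as groupby and that sgplot treats its result the same way, as the description above suggests.

```julia
using StatisticalGraphics
using InMemoryDatasets

# Same sample data as in the examples above
ds = Dataset(x=1:100, y=rand(100), group=rand(["g1", "g2"], 100));

# Assumption: gatherby(ds, :group) can be passed wherever groupby(ds, :group) is used
sgplot(gatherby(ds, :group), [
              Line(x=:x, y=:y),
              Scatter(x=:x, y=:y)
           ])
```

Each panel should then contain the line and scatter plots for one value of :group.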

Bar Chart Examples

The following examples demonstrate customised bar charts with non-English axis information.

fun_example = Dataset(rand(1:5, 1000, 4), :auto)

sgplot(
        groupby(fun_example, [:x3, :x4]),
        Bar(x=:x1, group=:x2, grouporder=:data, barwidth=:nest, barcorner=15),
        nominal=:x2,
        xaxis=Axis(order=:ascending, titlecolor=:steelblue, values=([1,2,3,4,5], ["一","二","三","四","五"])),
        yaxis=Axis(title = "频率", titlecolor=:darkred),
        layout=:lattice,
        columnspace=5,
        rowspace=5,
        headercolor=:darkgreen,
        wallcolor=:lightgray,
        height=100,
        width=100,
        font="Euclid", # graph default font
        italic=true
    )
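
To make the role of each keyword in the call above easier to see, here is a stripped-down sketch of the same chart with every customisation removed. It assumes that Bar with only the x keyword plots the frequency of each value of :x1, which is what the y-axis title in the full example suggests.

```julia
using StatisticalGraphics
using InMemoryDatasets

# Same data as above: 1000 rows, columns :x1 to :x4 with values in 1:5
fun_example = Dataset(rand(1:5, 1000, 4), :auto)

# Assumption: with no other keywords, Bar shows the frequency of each :x1 value
sgplot(fun_example, Bar(x=:x1))
```

Adding group=:x2 and passing groupby(fun_example, [:x3, :x4]) as the first argument then leads back to the panelled version shown above.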

right to left characters,

The package is registered; however, there is no official release yet, so users should clone the package to access the latest features. The following items should probably be resolved before the first release:

  • The package should work without an internet connection
  • Currently, parsing of the DateTime type is not consistent
  • Bar charts must allow bar labelling
  • A few plot types are missing from the package and should be added, e.g. Text, …
  • Documentation and user guides are missing
  • There are no tests
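
For readers who want the latest features mentioned above, one possible way to get the development version without a manual clone is to ask Pkg to track a branch of the registered package. This is only a sketch: the branch name "main" is an assumption and may need to be replaced with the repository's actual default branch.

```julia
using Pkg

# Track the development branch of the registered package.
# Assumption: the default branch is called "main".
Pkg.add(name="StatisticalGraphics", rev="main")

# InMemoryDatasets provides the Dataset type used in the examples
Pkg.add("InMemoryDatasets")
```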