Suggestion: move DataFrames, plotting into standard distribution

I see. So far the VegaLite syntax and functionality seem very similar to the StatPlots @df macro. What I was toying with (only in my mind, no code written) was whether it would make sense to “merge” the @select (now @map) step and the plot statement: every selected column would be passed to the plot as a keyword named after the column (and grouping columns would somehow become groups in the plot as well). So for example:

load("mydata.csv") |> @filter(_.age>20) |> @map({x = _.colA, y = _.colB}) |> vlplot(:circle) |> save("figure.pdf")

The main advantage would be that one can do extra processing in the @map step, for example:

load("mydata.csv") |> @filter(_.age>20) |> @map({x = _.colA, y = log(_.colB / _.colA)}) |> vlplot(:circle) |> save("figure.pdf")

(In StatPlots we allow this with the macro trick and dot broadcasting, but this new design has less duplication of work and lets Query take care of all the data-related things.)
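
For comparison, here is a rough sketch of how the same plot looks with the @df macro. The data is made up to stand in for mydata.csv, and I'm assuming the package's later name (StatsPlots) and a recent DataFrames, so treat it as illustrative rather than exact:

```julia
using DataFrames, StatsPlots   # assumption: StatPlots under its later name, StatsPlots

# Made-up data standing in for mydata.csv
df = DataFrame(age  = [18, 25, 31, 47],
               colA = [1.0, 2.0, 4.0, 8.0],
               colB = [2.0, 6.0, 8.0, 32.0])

# The filtering happens by hand, outside the plotting macro
sub = filter(row -> row.age > 20, df)

# @df swaps each symbol for the matching column vector,
# so the transformation has to be written with dot broadcasting
@df sub scatter(:colA, log.(:colB ./ :colA))
savefig("figure.pdf")
```

Here the filtering and the column transformation live in two different places, which is the duplication the Query-based pipeline would avoid.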

In GroupedErrors, by contrast, I have explicit @x and @y steps to choose the variables from the iterable tables, and a @set_attr macro to set attribute values according to the group (for example, whether a line should be dashed or solid depending on some grouping variable, or any other attribute).

Glad to know you’re working on VegaLite syntax: it’d be really nice if that could be made more concise.
