Upcoming AlgebraOfGraphics release

It’s planned but not yet implemented. It would require a bit of “clean up” of the API for that in AbstractPlotting. It’s mostly a matter of consistency, so that AlgebraOfGraphics can look for the correct attribute and manage the axes correctly.

For those who want to participate in the API discussion, the relevant issue is “orientation versus direction” (Issue #926 in JuliaPlots/Makie.jl on GitHub).


It would be really cool if the millimeter/centimeter handling could be done by Unitful.jl, and AoG could just handle it natively, instead of needing to manually add " (mm)" etc.


A comparison of AoG with ggplot2 in R would be useful. I might even do one myself, but I don’t have time in the coming days at least. If such a comparison existed, what would it look like? Would it simply be the AoG tutorial, but coded in R/ggplot2?

I did some learning of R and ggplot2 (+ statistics) last year, and I really liked ggplot2. For some reason I don’t find myself liking Gadfly.jl much, but I can’t explain why. It’s probably the way the documentation looks? I haven’t really used it either, for that reason.


On this topic, there’s an important point to keep in mind: AoG is not “ggplot2 in julia”. I confess I’ve actually never used ggplot2, so I don’t really know what that would look like. AoG had its own design history, originating in Plots.jl grouping capabilities.

My concern is that people reading this comparison might think that the same coding style of ggplot2 would work in AoG. While AoG borrows heavily from the layered grammar of graphics, I suspect knowing ggplot2 does not make it easier to learn AoG and vice versa.

Just to point out the first few key differences that come to mind:

  • One of AoG’s design principles is to be “magic-free” and composable, which makes things slightly more verbose but allows you to create reusable building blocks. Ggplot2 is a bit less verbose by default, but IMO much less composable. Even refactoring some code into a function seems pretty awkward.

  • AoG requires the users to figure out how the two operations + and * work together, which is a bit more complex than just one operation as in ggplot2. On the flip side, AoG does not resort to passing some arguments where they don’t belong to obviate the lack of a second operation.

  • From what I understand, ggplot2 has many helper functions used to customize it. AoG tries to mostly “get out of the way” and allow you to customize things using Makie’s capabilities.
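
To make the point about + and * a bit more concrete, here is a minimal sketch. The data and column names are made up for illustration, and it assumes a version of AoG where linear() and visual(Scatter) are available:

```julia
using AlgebraOfGraphics, CairoMakie

# Made-up data, purely for illustration.
df = (x = rand(100), y = rand(100), g = rand(["a", "b"], 100))

# `*` merges specifications: the data source and how columns map to the plot.
base = data(df) * mapping(:x, :y, color = :g)

# `+` superimposes layers; `*` distributes over `+`, so the scatter layer
# and the linear-fit layer both reuse the data and mapping from `base`.
plt = base * (visual(Scatter) + linear())

fig = draw(plt)
```

Here base is the kind of reusable building block mentioned above: it can be combined with other layers without repeating the data or the mapping.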

OTOH, your question (how does it compare to ggplot2?) is a natural one, so maybe this is another good item for a FAQ section in the doc.

Important clarification: while the above bullet points may seem negatively biased against ggplot2, I have nothing against it and think it’s a very interesting library. Still, I thought it was important to discuss why AoG differs from ggplot2 in some key design decisions.


+1 on the “magic-free”. DataFrames made the decision to be “magic-free” and this allows for meta-programming tools to simplify syntax. Having a base be magic free is always a good decision.


I’m no expert on ggplot2. I know enough to be able to do plots like those in the AoG tutorial, so when I have time I’ll see how it looks in ggplot2.

Comparing libraries like this has some intrinsic value for me. It helps me see how what I’ve learned before in one area can help in the new area. Such examples also help me decide between technologies. There’s not really a question for me here, I’m not going to start coding lots of R, but if the question existed, the examples would help.


Quick update: I’ve just tagged v0.4 of AlgebraOfGraphics, so as soon as the registry PR goes through it’ll be possible to install it with just ] add AlgebraOfGraphics.

Recent changes:

  • Switch to new Makie release (Makie 0.13)
  • Machine Learning section in the tutorial (very simple, but shows some nice tricks that can be done when some columns come from ML models)
  • Geo data support
  • Restored support for pre-grouped data (not coming from a table)

I’ve just read the philosophy section of the docs and I’m really blown away! This seems like an amazing package. Thanks for all your work.


Unfortunately I can’t update past version 0.2.1 due to “Unsatisfiable requirements detected for package JLD2”, which clashes with a whole host of packages.

Not sure if this helps, but in case your conflict happens because you are working in your default environment with a lot of packages, you could try creating a new environment, see Pkg docs.
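
In case it is useful, a minimal sketch of that suggestion using the Pkg API. It assumes Julia 1.5 or later for the temp keyword, and the install step is left commented out because it needs network access:

```julia
using Pkg

# Create and activate a throwaway environment, separate from the default one.
# Use Pkg.activate("some/path") instead to keep the environment around
# (the path is a placeholder).
Pkg.activate(temp = true)

# Installing into a fresh environment avoids version clashes with
# packages already present in the default environment.
# Pkg.add("AlgebraOfGraphics")

# Show which Project.toml is currently active.
println(Base.active_project())
```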


I’ve been trying out AlgebraOfGraphics and really like it so far. Sometimes I can plot something with a one-line command that used to take dozens of lines of code, which is awesome!

I have a simple question or a feature request if this is not possible.

I really like the layout option (and related col/row option) for plotting many plots in a grid. Here’s a code example:

using AlgebraOfGraphics, CairoMakie
CairoMakie.activate!(type = "svg")

df = (x=rand(100), y=rand(100), i=rand(["a", "b", "c", "d", "e", "f"], 100))
plt = data(df) * mapping(:x, :y, layout=:i)
fig = draw(plt)

I also like how it’s possible to change various things about each of the plots in the layout like title, tick rotations etc.

Axis(fig.grid[4]).xticklabelrotation[] = π/2
fig

But it seems that when I change the axis scale or limits, it changes them for all plots instead of only the one I selected.

Axis(fig.grid[4]).xscale[] = log10
fig

Is it possible to give only one of the subplots a log10 axis, for example, or to change its xlim? If not, that would be a great feature to add, since many other things about selected plots in the grid can already be changed using loops or simple commands like the ones above.

Thanks. I’m only just figuring out environments now!


There is a bit of the API that is underdocumented, about how to actually draw AlgebraOfGraphics.Layers.

There are both draw and plot. plot simply plots the data, whereas draw also completes the figure by adjusting the layout and adding a legend. In particular, draw also links axes in a facet plot, which I imagine is what you are seeing here. The code is here, if you are curious.

If you only do fg = plot(plt), the axes would not be linked, so changing one should not affect the others. Note that anything that does not involve axis limits and scales should work independently across axes even with fg = draw(plt); only the scales and limits are synced.

Out of curiosity: do you have concrete examples where you would want to use a logarithmic scale only on some plots in a facet?

Hi @piever, plot() doesn’t seem to help.

I found a somewhat hacky way to partially do this using Axis(ae).block_limit_linking[] = true, like this:

using AlgebraOfGraphics, CairoMakie, DataFrames
CairoMakie.activate!(type = "svg")
set_aog_theme!()

df = DataFrame(x=rand(100), y=rand(100), i=rand(["a", "b", "c", "d", "e", "f"], 100))

plt = data(df) * histogram(normalization=:probability) * mapping(:x, layout=:i)
fig = draw(plt)

for ae in fig.grid
    Axis(ae).block_limit_linking[] = true
    Axis(ae).title[] == "a" ? Axis(ae).limits[] = ((-20,20),nothing) : Axis(ae).limits[] = ((1e-9,1e0),nothing)
    Axis(ae).title[] == "a" ? Axis(ae).xscale[] = identity : Axis(ae).xscale[] = log10
end
fig

However, there’s still an issue: if the data has any negative values in the panels that are not on log10, it doesn’t work, as in this example:

using AlgebraOfGraphics, CairoMakie, DataFrames
CairoMakie.activate!(type = "svg")
set_aog_theme!()

df = DataFrame(x=rand(100) .- 0.5, y=rand(100), i=rand(["a", "b", "c", "d", "e", "f"], 100))
mask = df.i .== "a"  # rows belonging to panel "a"
df.x[mask] .= 1 .+ df.x[mask] .^ 2

plt = data(df) * histogram(normalization=:probability) * mapping(:x, layout=:i)
fig = draw(plt)

for ae in fig.grid
    Axis(ae).block_limit_linking[] = true
    Axis(ae).title[] == "a" ? Axis(ae).limits[] = ((-20,20),nothing) : Axis(ae).limits[] = ((1e-9,1e0),nothing)
    Axis(ae).title[] == "a" ? Axis(ae).xscale[] = identity : Axis(ae).xscale[] = log10
end
fig

Display Error: ERROR: DomainError with -0.5:
NaN result for non-NaN input.
Stacktrace:
  [1] nan_dom_err
    @ ./math.jl:429 [inlined]
  [2] log10
    @ ./math.jl:576 [inlined]
  [3] apply_transform(f::Tuple{typeof(log10), typeof(identity)}, point::Vec{2, Float32})
    @ Makie ~/.julia/packages/Makie/gyI4W/src/layouting/transformation.jl:265
  [4] project_position(scene::Scene, point::Vec{2, Float32}, model::StaticArrays.SMatrix{4, 4, Float32, 16})
    @ CairoMakie ~/.julia/packages/CairoMakie/D0iNC/src/utils.jl:8
  [5] etc...

It seems like something is still linked somewhere. In my actual code I do have negative values in the plots that are on a linear axis, so this workaround doesn’t work for me.

I work in biology research and use Julia for modeling, so I frequently find myself plotting things on log and linear axes in the same figure. For this particular plot, I’m plotting the distribution of bootstrapped parameter values that I got by fitting an equation describing the rate of an enzyme to data using BlackBoxOptim.jl. Some of the parameters are energies with a physically reasonable range of (-20, 20), while others are binding constants with a range of (1e-9, 1000), so it’s easier to look at all of them if I plot histograms of some on a linear x-axis and others on a log x-axis. I think many people who use Julia for modeling or differential equations have to plot things on log and linear axes all the time, so an easy way to do this in AlgebraOfGraphics should be broadly useful.

I like your package a lot and am looking forward to seeing how it evolves. Thanks for all your hard work on this!


Off Topic: I somehow feel that the “long format”, where all model parameters are in the same column and another column specifies which parameter it is, might be a bad fit for this situation. I just wanted to mention that AlgebraOfGraphics supports a wide format as well, which IMO works better here, esp. if you later want to see interactions among parameters. For example, to see if in a non-parametric bootstrap of your model fits, different parameters are correlated or not (I guess this could help understand whether the number of parameters can be brought down).

For example, you can do things like

julia> df = (a = rand(100), b = rand(100), c = exp.(rand(100)));

julia> vars = [n == :c ? (n => log10 => "log10($n)") : n for n in [:a, :b, :c]];

julia> plt = histogram(normalization=:pdf, bins=30) * data(df) * mapping(vars, col=dims(1));

julia> draw(plt, axis=(width=225, height=225))

About your issue with log scales, I suspect one of the problems is that by default the histogram bins are linked, but here you clearly would like them to be separate. I suspect a reasonable rule could be that bins are the same for a given variable, but selected freely across variables (i.e. in the wide format).


@piever definitely agree that wide format is more “natural” here. Sorry, I actually use wide format in practice but used long format in my MWE by copy pasting some code from your AlgebraOfGraphics tutorial. Using my actual data, I did plot the wide format DF using layout=dims(1) as you suggest which is very convenient. I do calculate correlations among parameters all the time and usually plot heatmap of correlations using heatmap(cor(Matrix(wide_format_DataFrame))) in Makie for example.
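
For reference, a minimal sketch of the correlation step on a wide-format table, using only the Statistics standard library. The table and its column names are made up; it mirrors the heatmap(cor(Matrix(...))) call mentioned above without needing DataFrames:

```julia
using Statistics

# Made-up wide-format table: one column per (hypothetical) fitted parameter,
# one row per bootstrap replicate.
df = (a = rand(100), b = rand(100), c = exp.(rand(100)))

# Stack the columns into a 100×3 matrix; `cor` treats each column as a
# variable and returns the 3×3 matrix of pairwise Pearson correlations.
M = hcat(df.a, df.b, df.c)
C = cor(M)
```

The resulting matrix C is what would then be passed to heatmap in Makie.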

Your comment about log scales and bins sounds right. In any case, you usually need to either select log-spaced bins for a log-scaled histogram to look nice, or log-transform the data as in your example. Is it currently possible to pass an array of different bins, or to change the bins of the histograms after plotting the FigureGrid in AlgebraOfGraphics?

About different sized bins, I don’t think there is currently support for that.

About passing different bins per subplot, it’s an interesting question and I think it’s slightly more general, as I imagine that for several analyses you may want some keyword arguments to differ across traces or subplots. I don’t have a good API in mind for that, so I’ve opened #199 to discuss possible options.

I think you can just pass any edges you want to hist! via the bins keyword, as long as they are sorted.

So you could easily make log spaced edges yourself, although maybe a feature to switch to log spacing when only specifying the number of bins would be nice.

So I think visual(Hist, bins=log_edges) should work.
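
A minimal sketch of building such log-spaced edges in plain Julia. The range 1e-9 to 1e0 is borrowed from the axis limits used earlier in the thread, and the bin count is arbitrary:

```julia
# Assumed range and bin count, for illustration only.
lo, hi = 1e-9, 1e0
nbins = 30

# nbins bins need nbins + 1 edges; space them evenly in log10 and
# map back to the original scale with exp10.
log_edges = exp10.(range(log10(lo), log10(hi); length = nbins + 1))

# Edges have to be sorted to be usable as histogram bins.
@assert issorted(log_edges)
```

The resulting log_edges vector is what would go into the bins keyword; whether it is picked up as expected inside AoG is untested here.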


+2 on magic free… ggplot is … not that … kinda the opposite of that. Makes it near impossible to use outside of simple interactive uses.
