VegaLite.jl: render as raw image

I ran into a problem when using VegaLite (+Jupyter) on larger datasets (80k+ rows, 6 variables). When rendering a scatterplot inline in a notebook, Jupyter hangs when saving the notebook. Here is an MWE:

using DataFrames, VegaLite

# 80,000 rows and 6 integer columns named var1 … var6
df = DataFrame([Symbol("var$i")=> rand(1:100, 80_000) for i in 1:6]...)

df |> @vlplot(
    mark={typ=:point, filled=true},
    x=:var1,
    y=:var2,
    size={value=1}
)

Presumably this is due to the overhead of saving the whole vegalite specification (which for instance includes the whole dataset, I think). My question: since I only require interactivity in specific cases, is it possible to make VegaLite.jl render the output directly as png or pdf to circumvent the vegaspec overhead?

I don’t know how to do that, but I found that the Jupyter notebook often contains three different pieces of data for VegaLite (the raw data, the PNG and the SVG) and wondered whether a keyword argument could be added to select only one of these.

Yes, that is a good idea. Maybe one way to do this would be to pipe it into a PNG type, so something like df |> @vlplot(...) |> PNGImage works… Then what would be displayed is PNGImage, and not the actual plot object itself. Could you open an issue about this on the VegaLite.jl repo, and we can discuss some options there?

In general the story around large datasets is not great right now for VegaLite.jl, because we also use an incredibly inefficient JSON representation of the data to hand it off to the Vega engine. My plan is to transition that over to Arrow, once the rewrite of Arrow.jl is done. That might help a lot with non-Jupyter situations, but it is actually not clear to me how we might be able to solve the large data problem in Jupyter itself…

Thanks, I created the issue: https://github.com/queryverse/VegaLite.jl/issues/187. We can further discuss there. For everyone struggling with this, I’m now using the following workaround:

df = DataFrame([Symbol("var$i")=> rand(1:100, 80_000) for i in 1:6]...)

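# tempname() only generates a path; mktemp() would also create a stray file and leave its handle open
path = tempname() * ".png"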
df |> @vlplot(
    mark={typ=:point, filled=true},
    x=:var1,
    y=:var2,
    size={value=1}
) |> save(path)

open(path) do f
    display("image/png", read(f))
end

Actually, it turns out we already have something in the code base that can be used in this case. Try this:

df |> @vlplot(...) |> VegaLite.MimeWrapper{MIME"image/png"}

Currently it is only used in the test suite, but I think it should achieve what you want in this case as well.

I’m slightly hesitant to make this part of the public API for now, but I have no plans to remove it or break it.
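
For readers wondering how such a wrapper can restrict the output to a single format, here is a minimal sketch in plain Julia. It is not the actual code of VegaLite.MimeWrapper; `OnlyPNG` and `FakePlot` are hypothetical names invented for illustration, and the "PNG" bytes are a placeholder. The idea, as far as I understand it, is that a front end such as Jupyter asks `showable` which MIME types an object supports, so a wrapper that only forwards `image/png` should result in only the PNG being stored.

```julia
# Hypothetical stand-in for a plot object that can render itself in several formats.
struct FakePlot end
Base.show(io::IO, ::MIME"image/png", ::FakePlot) = write(io, "PNGDATA")      # placeholder bytes, not a real PNG
Base.show(io::IO, ::MIME"image/svg+xml", ::FakePlot) = write(io, "<svg/>")

# Hypothetical wrapper that forwards only the image/png representation.
struct OnlyPNG{T}
    content::T
end

# The wrapper is showable as PNG exactly when the wrapped object is.
Base.showable(m::MIME"image/png", w::OnlyPNG) = showable(m, w.content)
Base.show(io::IO, m::MIME"image/png", w::OnlyPNG) = show(io, m, w.content)

wrapped = FakePlot() |> OnlyPNG

showable("image/png", wrapped)      # true: forwarded to the wrapped plot
showable("image/svg+xml", wrapped)  # false: no SVG method is defined for the wrapper
```

Because no `show` method is defined on the wrapper for any other rich MIME type, the SVG and Vega-Lite spec representations are simply never requested from the wrapped object.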

I didn’t know VegaLite.jl had it so I created DisplayAs.jl which can be used like this

df |> @vlplot(...) |> DisplayAs.PNG

It’s not registered at the moment, though.

Ah, that is nice! I think it would be great if you were to register that. It seems a much cleaner solution to have this in its own package, and then I could get rid of MimeWrapper in VegaLite.jl, which always felt strange there.

Here we go https://github.com/JuliaRegistries/General/pull/3474

That’s very cool, works like a charm. Looking forward to DisplayAs.jl as well!