VegaLite.jl: render as raw image

jtackm · September 11, 2019, 11:38am

I ran into a problem when using VegaLite (+Jupyter) on larger datasets (80k+ rows, 6 variables). When rendering a scatterplot inline in a notebook, Jupyter hangs when saving. Here is an MWE:

using DataFrames, VegaLite

df = DataFrame([Symbol("var$i")=> rand(1:100, 80_000) for i in 1:6]...)

df |> @vlplot(
    mark={typ=:point, filled=true},
    x=:var1,
    y=:var2,
    size={value=1}
)

Presumably this is due to the overhead of saving the whole Vega-Lite specification (which, I think, includes the entire dataset). My question: since I only need interactivity in specific cases, is it possible to make VegaLite.jl render the output directly as PNG or PDF to avoid the overhead of the Vega-Lite spec?

leethargo · September 11, 2019, 3:31pm

I don’t know how to do that, but I found that the Jupyter notebook often contains three different pieces of data for VegaLite (the raw data, the PNG and the SVG) and wondered whether a keyword argument could be added to select only one of these.

davidanthoff · September 11, 2019, 5:19pm

Yes, that is a good idea. Maybe one way to do this would be to pipe it into a PNG type, so something like df |> @vlplot(...) |> PNGImage works… Then what would be displayed is PNGImage, and not the actual plot object itself. Could you open an issue about this on the VegaLite.jl repo, and we can discuss some options there?

In general the story around large datasets is not great right now for VegaLite.jl, because we also use this incredibly inefficient JSON representation of the data to hand it off to the Vega engine. My plan is to transition that over to Arrow, once the rewrite of Arrow.jl is done. That might help a lot with non-Jupyter situations, but it is actually not clear to me how we might be able to solve the large data problem in Jupyter itself…
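
To give a rough feel for the JSON overhead mentioned above, here is a minimal sketch using only Base Julia. It assumes the data is embedded as a row-oriented array of objects (the shape of Vega-Lite inline `values`); `rows_as_json` is a hypothetical hand-rolled writer for illustration, not the serializer VegaLite.jl actually uses.

```julia
# Same shape as the MWE: 6 integer columns with 80_000 rows each.
n = 80_000
cols = [Symbol("var$i") => rand(1:100, n) for i in 1:6]

# Hypothetical helper: write the columns as a row-oriented JSON array,
# e.g. [{"var1":3,"var2":57,...},{...},...]. Every row repeats every column name.
function rows_as_json(cols)
    io = IOBuffer()
    nrows = length(last(first(cols)))
    print(io, '[')
    for r in 1:nrows
        r > 1 && print(io, ',')
        print(io, '{')
        for (j, (name, values)) in enumerate(cols)
            j > 1 && print(io, ',')
            print(io, '"', name, "\":", values[r])
        end
        print(io, '}')
    end
    print(io, ']')
    return String(take!(io))
end

json_bytes = sizeof(rows_as_json(cols))          # size of the text form
raw_bytes  = sum(sizeof(last(c)) for c in cols)  # size of the raw column memory

println("row-oriented JSON: ", json_bytes, " bytes")
println("raw columns:       ", raw_bytes, " bytes")
```

The byte count is only part of the picture: the text also has to be generated in Julia and parsed again on the JavaScript side, and in a notebook it is stored in the file alongside any rendered images.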

jtackm · September 11, 2019, 5:44pm

Thanks, I created the issue: https://github.com/queryverse/VegaLite.jl/issues/187. We can further discuss there. For everyone struggling with this, I’m now using the following workaround:

df = DataFrame([Symbol("var$i")=> rand(1:100, 80_000) for i in 1:6]...)

# tempname() only generates a path; mktemp() would also create and open an unused file
path = tempname() * ".png"
df |> @vlplot(
    mark={typ=:point, filled=true},
    x=:var1,
    y=:var2,
    size={value=1}
) |> save(path)

open(path) do f
    display("image/png", read(f))
end

davidanthoff · September 11, 2019, 7:31pm

Actually, it turns out we already have something in the code base that can be used in this case. Try this:

df |> @vlplot(...) |> VegaLite.MimeWrapper{MIME"image/png"}

Currently it is only used in the test suite, but I think it should achieve what you want in this case as well.

I’m slightly hesitant to make this part of the public API, but for now I have no plans to remove it or break it.
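
For readers wondering how a wrapper like this can restrict the output to a single format, here is a minimal sketch in plain Julia. `FakePlot` and `OnlyPNG` are hypothetical names made up for illustration; this is not the actual `MimeWrapper` implementation. It relies on the fact that Base's default `showable` checks whether a matching `show` method exists, so a wrapper that only defines `show` for `image/png` only advertises that rich format.

```julia
# Hypothetical stand-in for a plot object that can render itself in several formats.
struct FakePlot end

# Placeholder output only: these bytes are the start of the PNG signature, not a valid image.
Base.show(io::IO, ::MIME"image/png", ::FakePlot) = write(io, UInt8[0x89, 0x50, 0x4e, 0x47])
Base.show(io::IO, ::MIME"image/svg+xml", ::FakePlot) = print(io, "<svg></svg>")

# Hypothetical wrapper: forwards image/png to the wrapped object and nothing else.
struct OnlyPNG{T}
    content::T
end
Base.show(io::IO, m::MIME"image/png", w::OnlyPNG) = show(io, m, w.content)

p = FakePlot()
w = p |> OnlyPNG   # same as OnlyPNG(p)

println(showable("image/svg+xml", p))  # true: the plot itself offers SVG
println(showable("image/svg+xml", w))  # false: the wrapper hides it
println(showable("image/png", w))      # true: PNG is forwarded
```

A front end such as Jupyter asks the displayed object which formats it supports, so displaying the wrapper instead of the plot means only the PNG (plus the plain-text fallback) ends up in the notebook.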

tkf · September 11, 2019, 7:38pm

I didn’t know VegaLite.jl had that, so I created DisplayAs.jl, which can be used like this:

df |> @vlplot(...) |> DisplayAs.PNG

It’s not registered ATM, though.

davidanthoff · September 11, 2019, 8:13pm

Ah, that is nice! I think it would be great if you were to register that, seems a much cleaner solution to have this in its own package, and then I could get rid of MimeWrapper in VegaLite.jl, which always felt strange there.

tkf · September 12, 2019, 1:40am

Here we go: https://github.com/JuliaRegistries/General/pull/3474

jtackm · September 12, 2019, 10:12am

That’s very cool, works like a charm. Looking forward to DisplayAs.jl as well!
