PythonPlot: use imshow instead of pcolormesh for heatmaps

I have heatmaps to plot with Julia. I sometimes use the plotlyjs backend to zoom into portions of the plot, but for export I use pythonplot because all other backends I’ve tried either have problems (e.g. no logarithmic scaling of the colorbar, missing features I need) or just don’t look good.

The problem is that pythonplot’s heatmap seems to use pcolormesh, which turns every data point into a colored rectangle. With around 100 000 data points, that’s really slow, rendering the plot unusable. My current workaround is to plot using both pythonplot and gr, then delete the colored rectangles in Inkscape and copy over and scale the bitmap from the gr plot. That may take 10-15 minutes because Inkscape freezes for a few minutes every time I try to do something.

Is it possible to use imshow instead, while otherwise keeping the same commands? It should be possible to achieve the same layout and appearance. I really don’t want to use matplotlib directly: its syntax is unwieldy and incompatible with everything else that I also want to be able to use.

Cheers,
Malte

That is only a 317*317 heatmap, which is fast.

It’s fast with GR and PlotlyJS, but it’s very very slow with PythonPlot or PGFPlotsX because these generate a colored rectangle for each pixel, rather than embedding a bitmap. Maybe the plotting itself isn’t that slow, but displaying the heatmap definitely is and editing or embedding the plot in a document even more so.

Example code:

using Plots; pythonplot()

@time begin
    A = rand(300,300)
    heatmap(A)
    savefig("heatmap.svg")
end

Output

9.303465 seconds (4.89 M allocations: 101.184 MiB, 0.96% gc time)

generates a 16 MB (!) file

now change pythonplot() to gr() and you get

0.269523 seconds (726.28 k allocations: 25.478 MiB)

generates a 647 kB file.

for more fun, try opening the 16 MB heatmap.svg in inkscape and try editing it.

edit:
The winner seems to be pgfplotsx() with

222.957521 seconds (3.08 M allocations: 591.857 MiB, 0.04% gc time)

and a filesize of a whopping 37 MB. Saving to .tex instead of .svg is much faster at 0.4 seconds, the file will be 6 MB

plotlyjs() is nothing out of the ordinary, with 0.6 seconds execution time and a file size of 477 kB.

Ok, so the issue is not the display, but saving in *.svg format.
If you save in *.png format it takes 0.2 s.

For SVG output with Plots.jl, your backend options seem to be:

https://docs.juliaplots.org/latest/output/

gr, inspectdr, pgfplotsx, plotlyjs, pythonplot, gaston

You seem to be looking for a workaround for matplotlib, i.e. PythonPlot:
https://docs.juliaplots.org/stable/backends/#At-a-glance

As mentioned, you can save as PNG rather than SVG with it, as with most backends.

I doubt SVG can be made fast: the output inherently scales linearly with the number of points. Even if there’s some workaround, PythonPlot (and PyPlot) are limited to what matplotlib does (I suppose they support all of its features).

Have you thought of using not just a different backend for Plots, but Makie instead? I think it’s going to be the future of Julia plotting, if it isn’t already. It has its own backends.

I understand the Tidier ecosystem is very nice (by now a go-to for people over using DataFrames directly, and not just for people coming from R), and it uses Makie as its plotting package.

What do you even mean? Different plots, or the same plot made with both backends and then edited into one by hand? That seems unworkable and not scalable, at least if you need to regenerate the plots. I thought that with Plots.jl you choose just one backend, but I suppose you can switch dynamically if it suits you. What you are doing just seems like a giant hack to be avoided.

Except .png rasterizes everything. Text, ticks, axes etc. should be vector graphics. Just the heatmap should be an embedded bitmap, because it’s essentially pixels - that’s exactly what bitmaps are for.
GR does that when I save to svg - heatmap as embedded bitmap, everything else as vector graphics. The problem is, GR messes up the plot by overlapping text, it doesn’t seem to be drawing ticks and it can’t do logarithmic colorbars. Which may be fixable, but requires a lot of hacking.
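
To put rough numbers on the rectangles-versus-bitmap difference, here is a minimal sketch that uses only Julia's standard library. It is not what PythonPlot or GR actually write: rect_svg and bitmap_payload are hypothetical helper names, the rectangles are plain SVG rect elements, and the bitmap is just uncompressed 8-bit grey values in base64 (a real backend would typically embed a PNG instead).

```julia
using Base64  # standard library

# Vector-style output: one SVG <rect> element per data point.
function rect_svg(A::AbstractMatrix{<:Real})
    nrows, ncols = size(A)
    io = IOBuffer()
    print(io, "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"", ncols,
          "\" height=\"", nrows, "\">")
    for j in 1:ncols, i in 1:nrows
        g = round(Int, 255 * clamp(A[i, j], 0, 1))  # grey level in 0:255
        print(io, "<rect x=\"", j - 1, "\" y=\"", i - 1,
              "\" width=\"1\" height=\"1\" fill=\"rgb(", g, ",", g, ",", g, ")\"/>")
    end
    print(io, "</svg>")
    return String(take!(io))
end

# Bitmap-style payload: one byte per data point, base64-encoded the way
# raw image data would be carried inside a text-based file.
function bitmap_payload(A::AbstractMatrix{<:Real})
    pixels = UInt8[round(UInt8, 255 * clamp(x, 0, 1)) for x in A]
    return base64encode(vec(pixels))
end

A = rand(300, 300)
svg_bytes = sizeof(rect_svg(A))
bmp_bytes = sizeof(bitmap_payload(A))
println("rectangles: ", svg_bytes, " bytes, bitmap payload: ", bmp_bytes, " bytes")
```

Each rectangle costs several dozen bytes of markup, whereas a bitmap pixel costs roughly one byte before any compression, so the gap grows with every added data point.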

Yes, you understood that exactly right. It’s a terrible solution, but it’s the only way I seem to be able to generate acceptable heatmaps.

I remember trying Makie when fixing my logarithmic colorbar problem, and I think that didn’t work either. PythonPlot is the only one that does that well. It does a really good job with line plots, but it sucks for heatmaps. I don’t want to draw plots with different backends for the same document to work around that.

Makie doesn’t seem to be using bitmaps either:

MakieCore.heatmap (function)
Plots a heatmap as a collection of rectangles

If you rasterize in pythonplot() with dpi=1200, it takes 2.4 s to generate the png file. Is this resolution good enough?

No, I want an SVG file. Only the heatmap itself should be stored as a bitmap. A high resolution bitmap is a terrible solution. It still looks awful (albeit not as awful as a low resolution bitmap) and it takes up far more storage space than necessary.

If it were only execution time, it wouldn’t be half as bad. The problem with SVG rectangles instead of proper pixels is storage space and appearance. My document would be tens to maybe hundreds of MBs and take forever to render. A high resolution PNG is somewhere in between: it still looks terrible and uses a ton of storage, but it probably renders at acceptable speeds.

The thing is, I can get exactly what I want with a few clicks in Inkscape. But the process is manual and takes forever. What I want is the default behavior in the majority of commercial tools, and it is even the default on the GR backend. Even Python tools like imshow can do this, but it’s not integrated in Julia Plots, so it’s tedious to use.

This is a little off topic as you wanted to use Plots.jl, not Makie, but you can rasterize the heatmap (while keeping everything else in vector format) with Makie:

using CairoMakie

fig = Figure()
xs = range(-1, 1; length = 301)
ys = xs'

Zs = cos.(2π.*xs).*sin.(2π.*ys)

ax = Axis(fig[1,1]; xlabel = "x-axis", ylabel = "y-axis")
hm = heatmap!(ax, xs, xs, Zs; rasterize = 5)
Colorbar(fig[1,2], hm)

save("hm.pdf", fig)

Filesize: ~234 kB

Is there really no way to use pythonplot? All my other plots are already using it, I’d have to redo everything.

I really don’t know the Plots.jl ecosystem at all and haven’t used PyPlot/Matplotlib in a long time, but matplotlib does support rasterization: Rasterization for vector graphics — Matplotlib 3.9.2 documentation. Thus, if you can somehow extract the python objects from your Plots.jl plot it should be possible to directly call the Matplotlib functions on them. No clue on how to do this in any more detail, alas.

It’s not obvious to me that set_rasterized or Axes.set_rasterization_zorder is exposed by PythonPlot’s API (I looked for it in the code of PythonPlot, PyPlot and Plots), i.e. that it has complete API coverage without accessing the Python objects directly, but such a hack seems possible. Plots wouldn’t have the full API either, so from there you’d need to reach through both layers to get at the underlying object.

You might first want to check whether you can do what you want with matplotlib directly (with PythonCall or in Python itself). If you can, then it seems you could do the same from PythonCall through PythonPlot.matplotlib:

Exported functions

Only the currently documented matplotlib.pyplot API is exported. To use other functions in the module, you can also call matplotlib.pyplot.foo(...) as pyplot.foo(...). For example, pyplot.plot(x, y) also works. (And the raw Py object for the matplotlib module itself is also accessible as PythonPlot.matplotlib.)

[…]
You must also explicitly qualify some functions built-in Julia functions. In particular, PythonPlot.xcorr, PythonPlot.axes, and PythonPlot.isinteractive must be used to access matplotlib.pyplot.xcorr etcetera.

If you wish to access all of the PyPlot functions exclusively through pyplot.somefunction(...), as is conventional in Python, you can do import PythonPlot as pyplot instead of using PythonPlot.

[PythonPlot is likely a fork of PyPlot.jl, which is why the docs say PyPlot rather than PythonPlot.]

Note that if you use Plots, you seemingly need to use Plots.backend_object.

So possibly like Plots.backend_object.PythonPlot.matplotlib.set_rasterized ?

When looking up zorder in Plots I only found: plotting order in figure · Issue #236 · JuliaPy/PyPlot.jl · GitHub

The following very hacky method appears to work with some caveats. Here’s a simplified example. First run this:

using Plots
import PythonPlot
pythonplot()
xs = range(-1, 1; length = 301)
ys = xs' 
Zs = @. cos(2π*xs)*sin(2π*ys)
ret_val = heatmap(xs, xs, Zs)

Wait for whatever Plots/Python is doing (Julia will run the next command too soon otherwise). Then run this:

quad_mesh = ret_val.series_list[1][:serieshandle][1]
quad_mesh.set_rasterized(true)
ret_val.o.savefig("hm.pdf")

The output file is small (~32 kB) with only the heatmap rasterized. For your actual figure, you’ll have to try to hunt down which series is the one you need to extract. If it’s a heatmap, its serieshandle should be a matplotlib.collections.QuadMesh object
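
For use in a script, the two steps above might be combined as in the following sketch. The key assumption, which I have not verified across Plots versions, is that Plots only builds the matplotlib objects (and fills in :serieshandle) once the plot is displayed or saved, so the sketch calls display explicitly instead of waiting. rasterize_heatmaps! is a hypothetical helper name, not part of Plots or PythonPlot; set_rasterized and set_zorder are regular matplotlib artist methods.

```julia
using Plots
import PythonPlot
pythonplot()

# Hypothetical helper: rasterize every heatmap series of an already-built plot.
# Assumes each entry of :serieshandle is a matplotlib artist
# (for a heatmap, a QuadMesh as described above).
function rasterize_heatmaps!(plt; zorder = nothing)
    for series in plt.series_list
        series[:seriestype] === :heatmap || continue
        for handle in series[:serieshandle]
            handle.set_rasterized(true)
            # Optionally change the drawing order of the heatmap.
            zorder === nothing || handle.set_zorder(zorder)
        end
    end
    return plt
end

xs = range(-1, 1; length = 301)
ys = xs'
Zs = @. cos(2π * xs) * sin(2π * ys)
plt = heatmap(xs, xs, Zs)

display(plt)                  # assumption: this makes Plots build the figure
rasterize_heatmaps!(plt)      # only the heatmap artists become bitmaps
plt.o.savefig("hm_rasterized.pdf")  # save through matplotlib, as in the snippet above
```

If that assumption holds, it would also explain the need to run the two blocks separately: evaluating the first block on its own displays the plot, and that display step is what creates the handles.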

YES!!! This is exactly what I needed. Quick, easy, scalable. Thank you so much!
Now, if you also know how to change the zorder of the different plot elements (the heatmap sits above the ticks and the grid lines), I would not have to use Inkscape anymore.

Never mind, I found it:
quad_mesh.zorder=-1

Maybe you can help me understand what :serieshandle is. I thought “wait for whatever Plots/Python is doing” meant I could just add a sleep(). But it seems that :serieshandle is only available if I run the second block separately.