I’m trying to plot a heatmap of a distance matrix that is roughly 2500x2500.
This takes almost a minute to complete using Plots.jl with GR as the backend, and it breaks Pluto/PlotlyJS when I try it interactively.
In Mathematica I can do a MatrixPlot of a 2000x2000 matrix in less than a second. It seems that in Julia the computation is fast and showing it is slow, while in Mathematica the computation is slow and showing it is fast.
I figure I got something wrong. I read through the viz category but couldn’t find anything about very large matrix plots.
You can reproduce the behavior I’m talking about by simply doing:
> using StatsPlots
> @time heatmap(randn((2500,2500)))
0.139569 seconds (11.03 k allocations: 61.574 MiB)
Except that time is a lie: I counted 48 seconds before the GKSTerm window showed me anything and the Julia prompt came back.
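One likely reason for the mismatch: `@time heatmap(...)` only times building the plot object, while the REPL renders and shows it afterwards, outside the timed region. A minimal sketch that puts the display step inside the measurement follows. `timed_display` is a hypothetical helper, not part of Plots, and it assumes `display` blocks until the backend has drawn the plot, which may not hold for every backend.

```julia
# Hypothetical helper (not part of Plots.jl): wall-clock seconds for
# building a plot AND handing it to `display`.
function timed_display(make_plot)
    t0 = time()
    p = make_plot()   # for a Plots call, this only builds the plot object
    display(p)        # rendering is triggered here
    return time() - t0
end

# Usage with Plots loaded, e.g. after `using StatsPlots`:
# timed_display(() -> heatmap(randn(2500, 2500)))
```

The same effect should be reachable with `@time display(heatmap(...))`; the helper just makes the two steps explicit.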
I’ve been running this heatmap computation over and over in the terminal with different matrices and getting similar run times each time.
My workflow is:
change distance function
compute distance matrix
plot matrix
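The loop above can be sketched in plain Julia as follows. The distance functions, the `distance_matrix` helper, and the random stand-in data are all assumptions for illustration, since the thread does not show the actual code.

```julia
# Hypothetical distance functions; swap one for the other between runs.
euclidean(a, b) = sqrt(sum(abs2(x - y) for (x, y) in zip(a, b)))
manhattan(a, b) = sum(abs(x - y) for (x, y) in zip(a, b))

# Build the symmetric n×n distance matrix between the columns of X.
function distance_matrix(dist, X::AbstractMatrix)
    n = size(X, 2)
    D = zeros(Float64, n, n)
    for j in 1:n, i in 1:j-1
        d = dist(view(X, :, i), view(X, :, j))
        D[i, j] = d   # fill both triangles; the diagonal stays 0
        D[j, i] = d
    end
    return D
end

X = randn(3, 2500)                 # stand-in data: 2500 points in 3 dimensions
D = distance_matrix(euclidean, X)  # step 2: compute the distance matrix
# Step 3 would be the plotting call, e.g. heatmap(D) with Plots loaded.
```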
I just reran the command to make sure before writing this and got a similar time (45 s) before the matrix plot appeared. Again, I’m probably doing something dumb; it’s just not clear what.
Thank you. I think using GR directly is the answer, or at least it’s fast enough for a good iterative workflow.
The great news is that this works in Pluto directly too (I guess because it’s rendering a raster image instead of trying to draw 6,250,000 rectangles in SVG, and there’s no interactivity?), so I can go back to working iteratively.
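For scale, the cell count is easy to check, and a rough size estimate shows why a vector format struggles here. This is a back-of-the-envelope sketch: it assumes one SVG element per cell, and the 60 bytes per element is a guess for illustration, not a measurement of what Plots or GR actually emits.

```julia
# Cell count for an n×n heatmap.
n = 2500
cells = n^2                                   # 6_250_000

# ASSUMPTION: one SVG element per cell at roughly 60 bytes each.
# The 60-byte figure is a guess for illustration, not a measurement.
bytes_per_rect = 60
svg_megabytes = cells * bytes_per_rect / 1e6  # 375.0

println("cells: ", cells, ", estimated SVG size: ", svg_megabytes, " MB")
```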
For the Pluto case, I don’t know if it works the same way as Jupyter, but you could try:
using Plots
default(fmt=:png)
In Jupyter it breaks because it is drawing an extremely heavy SVG; using PNG avoids that problem (I don’t think it will speed up the REPL, though).
I think that is true, but it is still worth trying to fix this in Plots. If matplotlib can do it in 1/20th of the time Plots takes using the CPU, there may be something wrong on our side.
Is there a way to mark multiple solutions?
The default(fmt=:png) setting seems to have sped things up in Pluto as well, although not as much as using GR directly. I didn’t want to install another Python distribution on my laptop, so I’m staying away from matplotlib/Python for now, although I do appreciate the speed.
How do you explain GR allocating 1 GB of memory vs. matplotlib allocating only 2 KB!? That seems like an enormous difference for the same task.
It would be nice if (Stats)Plots could be fast here too, so I can take advantage of the nice documentation and interface. As a new Julian, I find GR’s own documentation hard to understand.
I think the best solution I can see right now for a fast matrix plot with a usable API, without installing conda Python, is GRUtils, in case that helps anyone else in my position.
Regardless of the backend, try to set gr(format=:png) to avoid vector graphics in the notebook (the default is :svg). I do this trick all the time in my tutorials and lectures because some students have old machines.