Here’s my report from tracking memory usage in Pluto:
$ julia
# 90MB or so
> using Pluto
# 175MB or so
> Pluto.run()
# 220MB or so
I opened my Pluto notebook in a text editor and commented everything out, then slowly added cells back in one at a time, reporting memory after running each cell. Firefox’s reported memory differs strongly between its about:performance page and my Activity Monitor. I’m going to report the about:performance numbers, because I am not 100% sure what’s going on with the FirefoxCP Web Content processes:
> opening pluto window
# memory is ~300MB on two julia processes and 30MB in firefox
> import Distances
# nothing changed, same memory usage
> define some distance-related functions
# no change in memory (phew, that would have been strange)
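For reference, those cells amount to something like this sketch (the helper name and the metric are my assumptions, not the exact notebook contents):

```julia
import Distances

# Hypothetical distance helper; `row_distance` and Euclidean are illustrative choices.
row_distance(a::AbstractVector, b::AbstractVector) =
    Distances.evaluate(Distances.Euclidean(), a, b)
```

Defining functions like this allocates essentially nothing until first call, which is consistent with seeing no memory change here.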
> import CSV, DataFrames, Tables, Glob
# one julia process went from 300MB -> 380MB, firefox at 30MB
> import MLJ, Distributions, Clustering
# now have 1 julia process with 500MB, another with 300MB, and firefox still at 30MB (although 120MB in the activity monitor, I think)
Loading libraries accounts for 500MB of my 4GB in RAM that I was seeing. This seems like a lot!
Now to actually do something:
> load data from CSV files, 3000x50 ish (takes about 20s)
# julia process 520MB, 310MB, firefox still at 30MB
> build dataframe, separate features/labels
# julia 530MB, 310MB, firefox 30MB
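The loading and dataframe cells were roughly of this shape (the glob pattern, directory, and `:label` column name are illustrative assumptions):

```julia
import CSV, DataFrames, Glob

# Assumed sketch of the loading step; pattern and directory are placeholders.
files = Glob.glob("*.csv", "data")
df = reduce(vcat, [CSV.read(f, DataFrames.DataFrame) for f in files])  # ~3000x50 total

# Separate features and labels (assumed :label column).
X = DataFrames.select(df, DataFrames.Not(:label))
y = df.label
```

A 3000x50 table of Float64 is only about 1.2MB, which fits the tiny bump seen here.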
Machine Learning related activities test:
> separate train/test data
# no change
> load classifier and make machine
# julia 575MB, 310MB, firefox no change
> fit & predict
# julia 600MB, 310MB, firefox no change (although activity monitor shows the tab process taking 140MB)
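The machine/fit/predict cells look something like this sketch (the particular classifier is an assumption; the MLJ pattern is the same for other models):

```julia
import MLJ

# Assumed model choice for illustration; any MLJ classifier follows this pattern.
Tree = MLJ.@load DecisionTreeClassifier pkg=DecisionTree

# X_train/y_train/X_test come from the train/test split in the previous cell.
mach = MLJ.machine(Tree(), X_train, y_train)
MLJ.fit!(mach)
yhat = MLJ.predict_mode(mach, X_test)
```

The bump at “load classifier” is consistent with `@load` pulling in the model-providing package at that point rather than at `import MLJ`.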
So, comparatively, loading the initial libraries took a lot of memory, and the actual work I’m doing with the matrices comes out to less than 100MB.
Clustering Activities
> compute full 3000x3000 distance matrix
# julia 700MB, 300MB, firefox still 30MB, although Activity Monitor reports Firefox taking 1.5GB overall and 140MB for the process I think Pluto's tab is on
> build clusters with Clustering.hclust
# julia 710MB, 300MB firefox still 30MB but note caveat above
> computing Clustering.vmeasure
# no change, thank goodness
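Sketch of the clustering cells, assuming `Xmat` is the 3000x50 feature matrix as a plain `Matrix{Float64}`; the linkage, `k`, and ground-truth labels are illustrative:

```julia
import Distances, Clustering

# A 3000x3000 Float64 distance matrix is 3000*3000*8 bytes ≈ 72MB, which
# roughly matches the ~100MB jump observed when computing it.
D = Distances.pairwise(Distances.Euclidean(), Xmat, dims=1)

clusts = Clustering.hclust(D, linkage=:average)  # linkage choice is an assumption
labels = Clustering.cutree(clusts, k=5)          # k is illustrative
v = Clustering.vmeasure(labels, true_labels)     # true_labels: assumed ground truth
```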
This didn’t bust anything either. The largest chunk of memory usage, 500MB, still seems to come from loading libraries. The only other thing I’ve done is plotting, so my memory woes are probably from that; let me just make sure:
> import StatsPlots, set default to png (takes almost 3m)
# julia #1 750MB + 350MB of compressed memory now, #2 310MB, firefox says 41MB for the tab, activity monitor suggests 150MB, and Firefox itself is taking 1GB
> StatsPlots.plot(clusts) (takes 50s)
# julia #1 950MB + 200MB compressed, #2 300MB, firefox says 41MB for the tab, although activity monitor is up to 200MB
Oof, I think I need an alternative to StatsPlots if it’s taking 400MB just to plot a dendrogram…
This is the current state:
The other plot was a distance matrix heatmap, let me do that too:
> import GR; GR.heatmap(distance_matrix) # for speed
# julia #1 up to 1.14GB + 431MB compressed, #2 300MB + 115MB compressed, Firefox still reports 33MB, activity monitor shows tab taking 175MB
> import WGLMakie (replacing GR, so that library should be unloaded by Pluto), set format to png (takes 200s)
# julia #1 up to 1.1GB + 800MB compressed, #2 330MB + 300MB compressed, Firefox tab 36.3MB, activity monitor 215MB
> WGLMakie.heatmap(distance_matrix)
# julia #1 up to 1.64GB + 175MB compressed, #2 about the same, Firefox stats same as above
I start to see Firefox say: “A webpage is slowing down your browser, what would you like to do?”
Firefox stubbornly claims the tab is still taking only 40MB, which I think is wrong given what the system is saying.
Then I try replacing WGLMakie with GLMakie, and julia’s memory jumps to 2GB - I can see how just evaluating more plots will continue to pile onto julia’s memory usage.
As a test, I run GC.gc(). After:
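To put numbers on what GC.gc() can reclaim, one way (Base-only; `gc_live_bytes` needs a reasonably recent Julia) is to compare live heap bytes before and after a full collection:

```julia
# Live heap size before and after a full garbage collection, in MB.
before = Base.gc_live_bytes() / 1024^2
GC.gc(true)  # force a full collection
after = Base.gc_live_bytes() / 1024^2
println("live heap: $(round(before, digits=1)) MB -> $(round(after, digits=1)) MB")
```

Note this only measures Julia’s managed heap; memory held by GR/Makie’s native libraries, and the process RSS the Activity Monitor sees, won’t necessarily shrink.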
So the conclusion is that 500MB is coming from dependencies, 200MB from my matrices and dataframes, and the rest (~1GB) is all the plotting infrastructure: loading the plotting libraries and using their routines.