Tooling for line-by-line memory allocation analysis

I am tracking down an allocation problem in a large codebase. I generated the allocation information as described in the manual, and managed to get all those MallocInfo objects, but I am unsure what to do next.

That is, I could just manually look up line numbers and track down the biggest culprits, but I am wondering if there is tooling to help me visualize the information I have obtained coupled to the source. I am on Linux, Julia 1.7.

10 Likes

Such a tool would be great.
FWIW Rstudio provides this for the time profiling (little bars in front of source code lines). I do not remember if it also the case for memory.

2 Likes

This exists in Juno as well:

https://docs.junolab.org/stable/man/juno_frontend/#Juno.@profiler

but as you say it’s for time spent on each line, not memory allocated. Afaik VSCode will get a similar flamegraph viewer soon (:tm:), but I haven’t seen anythong on the allocation side.

I went back to using

Perhaps I am barking up the wrong tree, do people analyze allocations with a different tool? I am not talking about macros for single calls like @allocated, but something systematic, analogous to profiling.

2 Likes

This PR seems related: Allocation profiler by vilterp Β· Pull Request #42768 Β· JuliaLang/julia Β· GitHub

Like you, this is what I use to profile allocations (provided that the analyzed code is not too large).

I have a macro lying somewhere in a file on my hard drive, that automatically annotates each line of code (instead of letting you call @timeit at each line).

A single macro call sets everything up to individually time each source code line:

# File foo.jl

@timeit_all to function foo(n)
    for i = 1:n
        begin
            A = randn(100,100,20)
            m = maximum(A)
        end
        if i < 10
            Am = mapslices(sum, A; dims=2)
            B = A[:,:,5]
            Bsort = mapslices(B; dims=1) do col
                sort(col)
            end
        elseif i < 15
            sleep(0.01)
        else
            sleep(0.02)
        end
        let j
            j = i
            while j<5
                b = rand(100)
                C = B.*b
                j += 1
            end
        end
    end

    sleep(0.5)
end

and the timer output looks like this:

julia> const to = TimerOutput()
julia> foo(20)
julia> println(to)
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────
                                                                  Time                   Allocations      
                                                          ──────────────────────   ───────────────────────
                     Tot / % measured:                         744ms / 99.0%           38.7MiB / 100%     

 Section                                          ncalls     time   %tot     avg     alloc   %tot      avg
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────
 foo.jl:3   function foo(n)                           1    737ms   100%   737ms   38.7MiB  100%   38.7MiB
   foo.jl:30  sleep(0.5)                              1    502ms  68.1%   502ms      144B  0.00%     144B
   foo.jl:4   for i = 1:n                             1    235ms  31.9%   235ms   38.7MiB  100%   38.7MiB
     foo.jl:18  sleep(0.02)                           6    127ms  17.3%  21.2ms   1.14KiB  0.00%     195B
     foo.jl:16  sleep(0.01)                           5   56.0ms  7.60%  11.2ms      720B  0.00%     144B
     foo.jl:5   begin                                20   36.5ms  4.95%  1.82ms   30.5MiB  78.9%  1.53MiB
       foo.jl:6   A = randn(100, 100, 20)            20   33.1ms  4.49%  1.66ms   30.5MiB  78.9%  1.53MiB
       foo.jl:7   m = maximum(A)                     20   3.31ms  0.45%   165ΞΌs     0.00B  0.00%    0.00B
     foo.jl:10  Am = mapslices(sum, A; dims =         9   12.3ms  1.68%  1.37ms   5.10MiB  13.2%   581KiB
     foo.jl:12  Bsort = mapslices(B; dims = 1)        9   2.61ms  0.35%   290ΞΌs   1.60MiB  4.13%   182KiB
     foo.jl:11  B = A[:, :, 5]                        9    123ΞΌs  0.02%  13.7ΞΌs    704KiB  1.78%  78.2KiB
     foo.jl:20  let j                                20    116ΞΌs  0.02%  5.82ΞΌs    794KiB  2.00%  39.7KiB
       foo.jl:22  while j < 5                        20   85.3ΞΌs  0.01%  4.26ΞΌs    793KiB  2.00%  39.6KiB
         foo.jl:24  C = B .* b                       10   65.1ΞΌs  0.01%  6.51ΞΌs    782KiB  1.97%  78.2KiB
         foo.jl:23  b = rand(100)                    10   5.34ΞΌs  0.00%   534ns   8.75KiB  0.02%     896B
         foo.jl:25  j += 1                           10   1.69ΞΌs  0.00%   169ns     0.00B  0.00%    0.00B
       foo.jl:21  j = i                              20   8.65ΞΌs  0.00%   433ns     0.00B  0.00%    0.00B
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────

If you think it is of some general interest, please let me know and we can try pushing this as a new feature of TimerOutputs

10 Likes

Conceptually… it looks pretty easy to display this in front of the source code like this:

(from https://support.rstudio.com/hc/en-us/articles/218221837-Profiling-R-code-with-the-RStudio-IDE)

… conceptually :wink:

4 Likes

yes, this would be a great addition, please make a PR if you can.

4 Likes

It would be a useful feature