How to track total memory usage of Julia process over time

Based on @vchuravy's suggestions above, I gave it a shot and came up with two implementations in Julia.

The first one, meminfo_julia(), only uses built-in functions and thus should be relatively portable, while the second one, meminfo_procfs(), queries the procfs filesystem directly (and therefore needs an OS that provides it, such as Linux):

using Printf: @printf

function meminfo_julia()
  # @printf "GC total:  %9.3f MiB\n" Base.gc_total_bytes(Base.gc_num())/2^20
  # Total bytes (above) usually underreports, thus I suggest using live bytes (below)
  @printf "GC live:   %9.3f MiB\n" Base.gc_live_bytes()/2^20
  @printf "JIT:       %9.3f MiB\n" Base.jit_total_bytes()/2^20
  @printf "Max. RSS:  %9.3f MiB\n" Sys.maxrss()/2^20
end

function meminfo_procfs(pid=getpid())
  smaps = "/proc/$pid/smaps_rollup"
  if !isfile(smaps)
    error("`$smaps` not found. Maybe you are using an OS without procfs support or with an old kernel.")
  end

  # smaps_rollup reports all sizes in kB, so these sums are in KiB
  rss = pss = shared = private = 0
  for line in eachline(smaps)
    s = split(line)
    if s[1] == "Rss:"
      rss += parse(Int64, s[2])
    elseif s[1] == "Pss:"
      pss += parse(Int64, s[2])
    elseif s[1] == "Shared_Clean:" || s[1] == "Shared_Dirty:"
      shared += parse(Int64, s[2])
    elseif s[1] == "Private_Clean:" || s[1] == "Private_Dirty:"
      private += parse(Int64, s[2])
    end
  end

  @printf "RSS:       %9.3f MiB\n" rss/2^10
  @printf "┝ shared:  %9.3f MiB\n" shared/2^10
  @printf "┕ private: %9.3f MiB\n" private/2^10
  @printf "PSS:       %9.3f MiB\n" pss/2^10
end
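
If you want to log the numbers rather than print them, one option is to split the parsing from the printing. The following is only a minimal sketch and not part of the implementation above: `parse_smaps` is a hypothetical helper name, and it accepts any `IO` so that it can be tried on a string as well as on `/proc/<pid>/smaps_rollup`. It returns the same four totals in KiB as a NamedTuple.

```julia
# Hypothetical helper: parse smaps_rollup-style text from any IO.
# Returns the totals in KiB (the unit used by the kernel in this file).
function parse_smaps(io::IO)
    rss = pss = shared = private = 0
    for line in eachline(io)
        s = split(line)
        length(s) >= 2 || continue        # skip blank or malformed lines
        val = tryparse(Int, s[2])
        val === nothing && continue       # e.g. the address-range header line
        key = s[1]
        if key == "Rss:"
            rss += val
        elseif key == "Pss:"
            pss += val
        elseif key in ("Shared_Clean:", "Shared_Dirty:")
            shared += val
        elseif key in ("Private_Clean:", "Private_Dirty:")
            private += val
        end
    end
    return (rss = rss, pss = pss, shared = shared, private = private)
end
```

On a system with procfs, something like `open(parse_smaps, "/proc/self/smaps_rollup")` should then give the values for the current process.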

The output from meminfo_julia() and meminfo_procfs() is as follows:

julia> meminfo_julia()
GC total:     29.837 MiB
GC live:      34.361 MiB
JIT:           0.017 MiB
Max. RSS:    183.168 MiB

julia> meminfo_procfs()
RSS:         190.602 MiB
┝ shared:      3.215 MiB
┕ private:   187.387 MiB
PSS:         187.696 MiB

It seems that the numbers obtained from Julia directly miss quite a bit of memory that Julia does not track, likely because they do not take into account the size of the Julia code itself plus the shared libraries loaded by Julia. This can be verified by, e.g., running a second Julia process on the same node and then querying meminfo_procfs() again:

julia> meminfo_procfs()
RSS:         190.605 MiB
┝ shared:     53.977 MiB
┕ private:   136.629 MiB
PSS:         162.266 MiB

In this case, ~50 MiB get shifted from private to shared RSS, i.e., this is approx. the memory required for shared libraries loaded by Julia itself (and not used by any other program running). During the first invocation, this is counted as private (since there is only one program using them), in the second instance it is shared (since two Julia processes are using them now). However, that still leaves ~100 MiB not accounted for (surely this is not the Julia executable alone?).
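
For reference, here is the arithmetic behind these two estimates, using only the numbers printed above. Subtracting the GC live bytes and JIT bytes from the RSS is my assumption about how to arrive at the "unaccounted" figure; it presumes both are fully resident and counted in the RSS.

```julia
# All values in MiB, copied from the outputs above
rss_total      = 190.605
shared_before  = 3.215
shared_after   = 53.977
private_before = 187.387
private_after  = 136.629
gc_live        = 34.361
jit            = 0.017

# Memory that moved from private to shared once a second Julia process ran
shifted_to_shared = shared_after - shared_before      # about 50.8 MiB
left_private      = private_before - private_after    # about 50.8 MiB

# Remainder after subtracting the shared libraries and what Julia reports itself
# (assumption: GC live and JIT bytes are fully included in the RSS)
unaccounted = rss_total - shifted_to_shared - gc_live - jit   # about 105 MiB
```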

Thus, when looking at these two particular solutions, it seems like both approaches can give you valuable information from within Julia itself. One advantage of meminfo_julia is that it breaks down memory usage by category, with the downside that total memory use (RSS) is only reported as a maximum over time (i.e., it never decreases). On the other hand, meminfo_procfs can get you real-time information on total (= externally visible) memory use, even for processes other than the currently running one. At the same time, it has a much higher performance impact itself (~3 ms for meminfo_procfs vs. ~2.8 μs for meminfo_julia; numbers with I/O disabled).
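
To actually track usage over time, as asked in the original question, the built-in counters can be sampled at a fixed interval. This is just a sketch under my own assumptions: `sample_memory` is a hypothetical helper, the sample count and interval are arbitrary, and it only uses `Base.gc_live_bytes()` and `Sys.maxrss()` from above.

```julia
# Hypothetical helper: take `n` samples, `interval` seconds apart.
# Each sample stores the elapsed time in seconds and two counters in MiB.
function sample_memory(n::Integer; interval::Real = 1.0)
    t0 = time()
    samples = NamedTuple[]
    for i in 1:n
        push!(samples, (t = time() - t0,
                        gc_live_mib = Base.gc_live_bytes() / 2^20,
                        maxrss_mib  = Sys.maxrss() / 2^20))
        i < n && sleep(interval)   # no need to wait after the last sample
    end
    return samples
end
```

Since the loop blocks while it sleeps, you would probably want to run it in a background task, e.g. `task = @async sample_memory(60)`, and collect the result later with `fetch(task)`. Keep in mind that the max. RSS column can only grow, as noted above.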

EDIT: Fix implementation of meminfo_julia() based on remark by @mkoculak below.
