I could use a package that could look at a variable and compute the true memory used by the pointed-to data object.
Is there anything like this?
Sounds just like Base.summarysize, but are there reasons that’s not enough? PS: I have no idea what the chargeall argument does from its description.
That looks promising. I will check it out. Thanks.
Just beware of adding up the numbers reported by summarysize if you can’t rule out overlap. Take the example below: you shouldn’t simply add the memory usage of each variable, because they share 8040 bytes.
julia> x = collect(1:1000); y = Ref(x); z = Ref(x); a = 1+1im;
julia> Base.summarysize(x), Base.summarysize(y), Base.summarysize(z), Base.summarysize(a)
(8040, 8048, 8048, 16)
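(The sharing is easy to confirm: y and z are just thin wrappers around the same array, so their reported sizes overlap almost entirely with x.)
julia> y[] === x === z[]
true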
One thing that might do what you’re looking for is the function varinfo():
julia> x = randn(100,100);
julia> y = randn(1000,1000);
julia> varinfo()
name size summary
–––––––––––––––– ––––––––––– –––––––––––––––––––––––––
Base Module
CGFuns 30.709 KiB Module
Core Module
InteractiveUtils 254.409 KiB Module
Main Module
ans 7.629 MiB 1000×1000 Matrix{Float64}
x 78.164 KiB 100×100 Matrix{Float64}
y 7.629 MiB 1000×1000 Matrix{Float64}
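If you only care about a few specific variables, varinfo also accepts a module and a regex pattern to filter the listing (if I remember the signature correctly), something like:
julia> varinfo(Main, r"^[xy]$")
name size summary
–––– ––––––––––– –––––––––––––––––––––––––
x 78.164 KiB 100×100 Matrix{Float64}
y 7.629 MiB 1000×1000 Matrix{Float64}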
I also have this function in my ~/.julia/config/startup.jl:
readablesize(x) = Base.format_bytes(Base.summarysize(x))
which then prints the size of objects in a nicer format than just summarysize.
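For example, reusing the 100×100 matrix size from the varinfo listing above, it should print something like this (format_bytes returns a String, hence the quotes):
julia> readablesize(randn(100,100))
"78.164 KiB"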
So I just figured out that Base.summarysize((x, y, z, a)) counts shared memory once. There is a bit of overhead from the Tuple pointers: 8096 (total) = 8040 (x) + 8 (y) + 8 (z) + 16 (a) + 24 (pointers to x, y, z). sizeof((x, y, z, a)) is 40 because the Tuple stores 3 pointers and a copy of the immutable a, so simply subtracting sizeof(_) isn’t right. We’d need some way to figure out whether an element is stored as a pointer or as an inline copy; just checking semantic mutability does not always work, e.g. Strings are immutable yet not stored inline. (But for some reason, ismutabletype(String) and ismutabletype(DataType) return true.)
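In REPL form, with the same x, y, z, a as before (the numbers are just the breakdown above):
julia> Base.summarysize((x, y, z, a))  # shared 8040 bytes counted once
8096

julia> sizeof((x, y, z, a))  # 3 pointers + an inline copy of a
40

julia> ismutabletype(String), ismutabletype(DataType)
(true, true)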
I’m really not sure if there’s a way to do this for all live variables or references without explicitly writing them out. There is a Base.gc_live_bytes() to report all live memory, but that also includes deallocated memory that has yet to be garbage collected, as well as allocations for hidden implementation mechanisms.
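As a rough sketch of what that session-wide view can give you (the helper name is just illustrative, and the delta is only approximate for the reasons above):
# approximate the change in live memory caused by running f
# (the delta also picks up compiled code and other hidden allocations)
function live_delta(f)
    GC.gc()                          # collect pending garbage first
    before = Base.gc_live_bytes()
    result = f()
    GC.gc()                          # collect temporaries created by f
    Base.gc_live_bytes() - before, result
end

# e.g. live_delta(() -> zeros(10^6)) should report roughly 8 MB, plus noise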
It seems to me that the varinfo function cannot be used inside an arbitrary function to inspect the local variables. That is what I really need: I would like to inspect the sizes of the local variables in this easily grokkable way.
You could perhaps use Base.@locals + Base.summarysize.
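A minimal sketch of that combination (the function name and body are just illustrative):
function peek_locals(n)
    x = randn(n, n)
    locals = Base.@locals            # Dict{Symbol, Any} of the current local variables
    println(Base.format_bytes(Base.summarysize(locals)))
    sum(x)
end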
Wonderful!
If you’re digging into local variables, just be aware that you could be interfering with optimizations, including those concerning memory usage.
julia> function f()
           x = Ref(Int16(1))
           y = 3.5
           x[]+y
       end
f (generic function with 1 method)
julia> function f2()
           x = Ref(Int16(1))
           xsize = Base.summarysize(x)
           y = 3.5
           ysize = Base.summarysize(y)
           x[]+y
       end
f2 (generic function with 1 method)
julia> @code_llvm f()
; @ REPL[1]:1 within `f`
define double @julia_f_172() #0 {
top:
; @ REPL[1]:4 within `f`
ret double 4.500000e+00
}
julia> @code_llvm f2()
; @ REPL[2]:1 within `f2`
define double @julia_f2_184() #0 {
top:
%gcframe7 = alloca [3 x {}*], align 16
### I'll omit the Ref and summarysize parts
; ┌ @ refvalue.jl:56 within `getindex`
; │┌ @ Base.jl:42 within `getproperty`
%19 = load i16, i16* %12, align 2
; └└
; ┌ @ promotion.jl:379 within `+`
; │┌ @ promotion.jl:350 within `promote`
; ││┌ @ promotion.jl:327 within `_promote`
; │││┌ @ number.jl:7 within `convert`
; ││││┌ @ float.jl:146 within `Float64`
%20 = sitofp i16 %19 to double
; │└└└└
; │ @ promotion.jl:379 within `+` @ float.jl:399
%21 = fadd double %20, 3.500000e+00
%22 = load {}*, {}** %5, align 8
%23 = bitcast {}*** %2 to {}**
store {}* %22, {}** %23, align 8
; └
ret double %21
}
PS: I’m really not sure why the %22 and %23 lines are in the addition part of @code_llvm f2() when it just ends up returning %21.
Super neat, I didn’t know about that macro! I’m totally going to add a macro like this to my startup.jl file:
macro show_locals()
    quote
        locals = Base.@locals
        println("\nIndividual sizes (does not account for overlap):")
        for (name, refval) in locals
            println("\t$name: $(Base.format_bytes(Base.summarysize(refval)))")
        end
        print("Joint size: ")
        println("$(Base.format_bytes(Base.summarysize(values(locals))))\n")
    end
end

# example use:
function tester(n)
    x = randn(n,n)
    @show_locals
    sum(x)
end
With example output:
julia> tester(100)
Individual sizes (does not account for overlap):
n: 8 bytes
x: 78.164 KiB
Joint size: 78.625 KiB
94.23373017998107
I love threads like this where I learn some nifty trick. I was just wishing a month or two ago to do something like this, but I didn’t really take the initiative to do something about it and try to figure out a solution.
One thing to consider: if it isn’t easy to eyeball how big the locals are, you probably should split your function up more.
I wonder if @locals makes copies of the data? Not easy to tell by looking at the code:
macro locals()
    return Expr(:locals)
end
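One quick check, at least for heap-allocated values like arrays, is identity; small immutables may still end up boxed or copied, as discussed below (the function name is just illustrative):
julia> function check_copy()
           x = [1, 2, 3]
           d = Base.@locals
           d[:x] === x   # true means the Dict holds the same array, not a copy
       end
check_copy (generic function with 1 method)

julia> check_copy()
true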
This also has overhead, more than the Tuple example it seems:
julia> function f()
           x = 1
           @show_locals
       end
f (generic function with 1 method)
julia> f()
Individual sizes (does not account for overlap):
x: 8 bytes
Joint size: 472 bytes
It’s negligible when your instances are large enough to take up most of the reported memory, like your example with a 100×100 matrix, but that’s not always the case. I’m not sure if there is a way to separate the memory of the values iterator and the underlying @locals dictionary from the memory of the contained instances, though. sizeof(locals) is a constant for any size, so we’d probably have to dig into some internals.
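(A quick check of that last point: the Dict struct itself has a fixed field layout, so sizeof never sees the contents.)
julia> sizeof(Dict{Symbol, Any}()) == sizeof(Dict{Symbol, Any}(:a => zeros(10^6)))
true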
@locals makes a Dict{Symbol, Any}, so it would come down to how those instances are boxed. Hypothetically the boxes on the heap could just point to the existing instances, but I don’t actually know how boxes are implemented; as far as I know, copying could happen for many immutables. In any case, @locals or any other container would only contain one of the copies, so summarysize wouldn’t count it twice.
I couldn’t find out how to measure dictionary memory, and I wasn’t comfortable with large heterogeneous tuples that can store their elements in various ways, so I went with a simpler Vector{Any} as the container.
It appears that Base.summarysize doesn’t actually report all the memory used by a Vector{Any}. I know there should be boxes containing element type information, but it only seems to count the elements and the vector’s pointers to them. Also, if you allocate a v = collect(1:100000) and then empty!(v), the underlying buffer does not shrink, if I recall correctly, but sizeof and summarysize report a reduction to minimal memory. So summarysize might actually be unsuitable for measuring allocated heap memory; maybe we could say it measures the portion of allocated memory that represents accessible data?
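For instance (following the 40-byte header + 8 bytes per Int arithmetic from earlier; exact numbers may differ across Julia versions):
julia> v = collect(1:100000); Base.summarysize(v)
800040

julia> empty!(v); sizeof(v), Base.summarysize(v)
(0, 40)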
Still, this might make it easier to remove the overhead of the Vector{Any} containing the @locals instances. Bear in mind the following only worked on a few small examples; I have not rigorously tested this and do not know how.
julia> function valuessize(v::Vector{Any})
           (Base.summarysize(v)         # doesn't seem to count boxes
            - Base.summarysize(Any[])   # Vector overhead
            - sizeof(Int)*length(v))    # element pointers
       end
valuessize (generic function with 1 method)

julia> function valuessize(d::Dict{Symbol, Any})
           valuessize(collect(values(d)))
       end
valuessize (generic function with 2 methods)
julia> x = Dict{Symbol, Any}(:x => 1, :y => 2.3, :z => [3])
Dict{Symbol, Any} with 3 entries:
:y => 2.3
:z => [3]
:x => 1
julia> valuessize(x)
64
julia> sum(Base.summarysize.(values(x))) # elements don't share data
64
julia> Base.summarysize(collect(values(x))) # plus Vector{Any} overhead
128
julia> Base.summarysize(values(x)) # plus values/Dict{Symbol,Any} overhead
528
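Hypothetically (untested beyond toy cases, same caveats as above), that could be dropped into the earlier @show_locals macro so the joint figure excludes the container overhead:
macro show_locals2()
    quote
        locals = Base.@locals
        println("\nIndividual sizes (does not account for overlap):")
        for (name, val) in locals
            println("\t$name: $(Base.format_bytes(Base.summarysize(val)))")
        end
        # subtract the Vector{Any} container overhead from the joint figure
        println("Joint size (contents only): ",
                Base.format_bytes(valuessize(collect(values(locals)))))
    end
end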