Hello,
is it possible to manually increase pressure on the garbage collector? I have several workers whose job is to download large files from S3, load them, and preprocess them so that the master worker does not have to deal with this. But Julia crashes because all the memory is consumed, although this takes quite some time. I suspect that the garbage collector is not executed in time due to the distributed environment. Can I somehow verify this idea?
Thanks for any answers,
Tomas
You can use gc() to force a garbage collection; however, I don’t think that this is the root of your problem. Have you benchmarked your scripts to check the memory usage and peaks?
help?> gc
search: gc gcd gcdx gc_enable eigvecs eigfact eigfact! logspace getsockname
gc()
Perform garbage collection. This should not generally be used.
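If you want to test the hypothesis that collections simply happen too late on the workers, a minimal sketch would be to force one after each file is handled (download_and_load and preprocess below are hypothetical stand-ins for your own functions):

@everywhere function fetch_and_preprocess(url)
    data = download_and_load(url) # hypothetical: fetch the file from S3 and load it
    result = preprocess(data)     # hypothetical: your preprocessing step
    gc()                          # force a collection while the large buffers are dead (GC.gc() on Julia >= 0.7)
    return result
end

If the crashes disappear with the explicit gc() calls, that would support your idea.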
I do not really know how to construct such a benchmark.
@time would be a good start; it reports allocations alongside the run time. Since the job is time-consuming, I would not recommend something more systematic like @benchmark from BenchmarkTools.jl, which runs the code many times.
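For example (the matrix multiplication is just a stand-in for your workload):

julia> A = randn(2000, 2000);

julia> @time A * A; # the report lists total bytes allocated and, if a collection ran, the % gc time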
Thanks, I will resort to memory hunting.
Is it possible to see the size of an object?
sizeof should do the job for you. However, if you are using a custom datatype, you might need to overload Base.sizeof for your types, since the default sizeof will report misleading values: for fields such as A::Matrix{Float64} it counts only the pointer size for your architecture. See the example below:
julia> A = randn(5,5);

julia> sizeof(A) # 5*5 Float64 entries at 8 bytes each
200

julia> struct MyType
           A::Matrix{Float64}
       end

julia> obj = MyType(A);

julia> sizeof(obj) # 8 bytes on 64-bit architectures (i.e., `A` is stored as a pointer)
8

julia> import Base.sizeof

julia> function sizeof(obj::MyType)
           res = 0
           # walk the fields and sum their sizes; use fieldnames(typeof(obj)) on Julia >= 0.7
           for field in fieldnames(obj)
               res += sizeof(getfield(obj, field))
           end
           return res
       end
sizeof (generic function with 8 methods)

julia> sizeof(obj) # now we get the correct value
200
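As a side note, Base.summarysize walks an object’s fields recursively, so you can also get a total byte count without defining a custom method (the exact number may differ slightly from sizeof because it includes object headers):

julia> Base.summarysize(obj) # counts the wrapped 5×5 matrix as well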
Thanks a lot for the help. This is all clear to me now.
@Tomas_Pevny, would you kindly update your findings about the memory leak? I am running into a similar issue related to reading files many times.
In the end, I found two bugs in my code:
- I was repeatedly using eval to introduce functions, which gradually bloated Julia’s internal table of methods.
- I was not closing streams from TranscodingStreams (see the sketch below).
Finding these two bugs was quite tedious, as I was bisecting my code, cutting it in halves, to identify the source of the problem.
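For the record, the fix for the second bug was essentially to guarantee that every codec stream gets closed. A minimal sketch (GzipDecompressorStream from CodecZlib and the process function are stand-ins for my actual pipeline):

using CodecZlib # codec package built on top of TranscodingStreams

open("data.bin.gz") do io
    stream = GzipDecompressorStream(io)
    try
        process(stream) # stand-in for the actual preprocessing
    finally
        close(stream)   # without this, the codec's native buffers are never released
    end
end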
Thanks very much for sharing.