More pressure on garbage collector



Is it possible to manually increase pressure on the garbage collector? I have several workers whose job is to download large files from S3, load them, and preprocess them so that the master worker does not need to deal with this. Julia eventually crashes because all the memory is consumed, although this takes quite some time. I suspect that the garbage collector is not executed in time due to the distributed environment. Can I somehow verify this idea?

Thanks for any answers,


You can use gc() to force a garbage collection; however, I don’t think that this is the root of your problem. Have you benchmarked your scripts to check the memory usage and peaks?
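If the distributed setup is the suspect, one thing to try is forcing a collection on every worker, not just the master. A minimal sketch, assuming the workers are local processes added with `addprocs` (on Julia 0.7+ the function moved to `GC.gc`; on 0.6 it is plain `gc()`):

```julia
using Distributed

addprocs(2)  # hypothetical: two local worker processes

# run a full collection on the master process and on every worker
@everywhere GC.gc()
```

If memory use stops growing after forcing collections like this, that supports the "GC runs too late" theory; if it keeps growing, something is still holding references to the downloaded data.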

help?> gc
search: gc gcd gcdx gc_enable eigvecs eigfact eigfact! logspace getsockname


  Perform garbage collection. This should not generally be used.


I do not really know how to construct such a benchmark.


@time would be a good start; it reports allocations as well as elapsed time. Since the job is time-consuming, I would not recommend something more systematic like BenchmarkTools.jl’s @benchmark, which runs the code many times.
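For a quick check, wrapping the body of the worker job in @time already tells you how much a single pass allocates (the numbers in the comment are illustrative, not fixed):

```julia
# allocate and reduce a large vector; @time reports both the elapsed
# time and the total allocations of the wrapped expression
x = @time sum(randn(10^6))
# prints something like:
#   0.005 seconds (2 allocations: 7.629 MiB)
```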


Thanks, I will resort to memory hunting.
Is it possible to see the size of an object?


sizeof should do the job for you. However, if you are using a custom datatype, you may need to overload Base.sizeof for your types: the default sizeof only counts the immediate fields, so for fields such as A::Matrix{Float64} it reports the pointer size for your architecture rather than the size of the array contents. See the example below:

julia> A = randn(5, 5);

julia> sizeof(A) # 5*5*8 bytes for Float64
200

julia> struct MyType
         A::Matrix{Float64}
       end

julia> obj = MyType(A);

julia> sizeof(obj) # 8 bytes on 64-bit architectures (i.e., `A` is simply a pointer)
8

julia> import Base.sizeof

julia> function sizeof(obj::MyType)
         res = 0
         for field in fieldnames(typeof(obj))
           res += sizeof(getfield(obj, field))
         end
         return res
       end
sizeof (generic function with 8 methods)

julia> sizeof(obj) # now we get the correct value
200


See Base.summarysize.
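Base.summarysize walks the object graph recursively, so unlike sizeof it accounts for the memory behind pointer fields without any custom overloads. A quick comparison (re-creating the array here so the snippet is self-contained):

```julia
A = randn(5, 5)

sizeof(A)            # 200: just the Float64 data (5*5*8 bytes)
Base.summarysize(A)  # somewhat more than 200: data plus the Array header
```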


Thanks a lot for the help. This makes it nicely clear to me.