Don't understand why code runs out of memory and crashes

There is a function task_local_storage() which gives you an IdDict that is local to the current task. It can be used to store a vector under some name, e.g. with

v = get!(() -> Vector{Float64}(), task_local_storage(), :myvec)::Vector{Float64}
resize!(v, N)

Here’s a macro which automates away the ugliness, and creates a unique key for the vector:

macro tlscache(type)
    sym = Expr(:quote, gensym("tls"))  # unique key, fixed at macro-expansion time
    quote
        # keyed on (unique symbol, type) so different call sites and types don't collide
        get!(() -> $(esc(type))(), Base.task_local_storage(), ($sym, $(esc(type))))::$(esc(type))
    end
end

Use as:

using Random: randn!  # randn! lives in the Random stdlib

x = @tlscache Vector{Float64}
resize!(x, N)
randn!(x)
sort!(x)

By using resize!, the vector is only reallocated if the size N is larger than its current capacity. If N is always the same, you allocate only the first time, and you get one vector per task.
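To see the reuse in isolation (independent of the macro), here is a minimal sketch of how resize! typically keeps the backing memory around when a vector shrinks and grows again:

```julia
v = Float64[]
resize!(v, 10^6)   # first call: allocates backing memory for 10^6 elements
fill!(v, 0.0)
resize!(v, 10)     # shrinks the length; the capacity is typically kept
resize!(v, 10^6)   # grows back, usually without a fresh allocation
length(v)          # 10^6 again; contents beyond the shrink point are undefined
```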

2 Likes

Should really use 10^7 instead of messing with floats.

2 Likes

Ran this reproducer and the reproducer from Memory leak with Julia 1.11's GC (discovered in SymbolicRegression.jl) · Issue #56759 · JuliaLang/julia · GitHub with GC.enable_logging. In both of them the heap size increases, but most of it is malloc'd memory (not pool-allocated memory), which doesn't shrink even if you run a GC.

Didn’t investigate further yet to know whether it’s related, but seems suspicious…
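For anyone who wants to reproduce this kind of observation, a minimal sketch of turning on GC logging (GC.enable_logging has been available since Julia 1.8); the large-array allocation below is just an illustrative stand-in, not the original reproducer:

```julia
GC.enable_logging(true)   # print a line for every collection

# large arrays go through malloc'd (non-pool) memory
xs = [Vector{Float64}(undef, 10^6) for _ in 1:100]
xs = nothing              # drop the reference

GC.gc()                   # force a full collection; the log shows how much
                          # of the heap is actually given back to the OS
GC.enable_logging(false)
```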


3 Likes

That does seem to be it - I can also see in top that it's the resident memory that increases. Whether this is a true "memory leak" in the sense that the memory is inaccessible to the GC is debatable (presumably it could be freed and is technically reachable by the GC?), but the point stands that this increase is unexpected.

3 Likes

Not a solution to the GC / memory leak issue, but OhMyThreads.jl has a lot of nice functionality for efficient allocations in multithreaded code.
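As a sketch of what that looks like (assuming OhMyThreads re-exports TaskLocalValue from TaskLocalValues.jl; the fill!/sum work below is a hypothetical placeholder for real per-iteration computation):

```julia
using OhMyThreads: TaskLocalValue, tmap

# one lazily created scratch buffer per task, instead of one per iteration
scratch = TaskLocalValue{Vector{Float64}}(() -> zeros(10^3))

results = tmap(1:100) do i
    buf = scratch[]   # fetch (or create) this task's private buffer
    fill!(buf, i)     # hypothetical per-iteration work reusing the buffer
    sum(buf)
end
```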

3 Likes

Wow, thanks @gdalle , I had not seen that; I've already put it to good use. There are so many fantastic Julia packages (that I sadly miss out on).

d

1 Like

I am wondering whether this might be related to this Julia issue:

but I can’t reproduce the OOM with your example. According to the issue thread, adding return nothing to the threaded loop might help reduce the memory problems.

2 Likes

Hello

Thanks for all the replies. We eventually solved this, imperfectly, by writing a shell script that loops over the parameters and calls Julia, which runs the simulation for one parameter set and exits. It works, but it is far from elegant. There seems to be no alternative (I can make it take longer to run out of memory, but eventually it will). We asked around, and a few researchers told us this was the reason they did not use Julia for simulations.

We now have to put our code on a public GitHub page. It will not be nice to have the shell-script workaround for the memory leak, and I am contemplating rewriting the code in Python or R, which don't have this memory-leak problem and are about equally fast for this code.

But before I do that, I wanted to ask one last time whether this memory-leak problem is about to be addressed, so we can wait, or whether it is here to stay, in which case we have to use Python or R.

all the best, d

It seems that the memory leak was resolved thanks to the reproducer you made: Memory leak with Julia 1.11's GC (discovered in SymbolicRegression.jl) · Issue #56759 · JuliaLang/julia · GitHub
I consider it a success, given how quickly such a nasty bug was found and resolved. You can try it out now by installing the Julia nightly with juliaup, or with the next patch release when it becomes available.

3 Likes

Thanks @Janis_Erdmanis

Excellent news. I did not know this was being addressed, and so quickly.

I’ll be sure to give the nightly a spin, and delay putting our code on GitHub until the next release is out.

Our every interaction with Julia and its community validates using it. Both are truly fantastic.

best, d

3 Likes