Why is the memory blowing up in this multi-threaded code?

I have some multi-threaded code in which each thread calls a function f(df::DataFrame) which reads a column of that DataFrame and finds the indices where the column is greater than 0:

function f(df::DataFrame)
    X = df[!, :time]             # grab the column without copying
    indices = findall(X .> 0)   # indices where the column is positive
    return indices
end

Inside the main thread I read in an R *.rds file, which Julia converts to a DataFrame, and I pass it to f() as follows:

using RData  # load() for R data files

rds = "blabla.rds"
objs = load(rds);

params = collect(0.5:0.005:0.7)

for i in 1:length(objs)
    cols = [string(name) for name in names(objs.data[i]) if occursin("bla", string(name))]
    hypers = [(a, b) for a in cols, b in params] # length ~2000
    df = objs.data[i]
    Threads.@threads for hi in 1:length(hypers) # MEMORY BLOWS UP HERE
        indices = f(df)
    end
end

Each df passed to f() is roughly 0.7 GB. Watching memory usage while the multi-threaded loop runs, it climbs to ~30 GB. There are 25 threads and ~2000 calls to f(). Any idea why the memory is exploding?

Cross-reference: Julia: Why is the memory blowing up inside this loop? - Stack Overflow

(Just so you know, it’s considered good etiquette to mention it if you have posted the question other places.)


Try rewriting it like this:

function f(df::DataFrame)
    X = df[!, :time]
    return findall(x -> x > 0, X)  # predicate form avoids allocating the temporary X .> 0
end

function foo(objs)
    params = 0.5:0.005:0.7 # don't collect; keep it a lazy range
    for i in 1:length(objs)
        cols = [string(name) for name in names(objs.data[i]) if occursin("bla", string(name))]
        hypers = [(a, b) for a in cols, b in params] # length ~2000
        df = objs.data[i]
        Threads.@threads for hi in 1:length(hypers) # MEMORY BLOWS UP HERE
            indices = f(df)
        end
    end
end

using BenchmarkTools

rds = "blabla.rds"
objs = load(rds);
@benchmark foo($objs)

Instead of one thread allocating, you have 20 threads allocating. FWIW, I have almost never seen a speedup from multithreading when the part that runs in parallel allocates. The GC runs on a single thread, so if you have 20 threads allocating, it will bottleneck things.
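To see the effect in isolation, here's a small sketch (mine, not code from the thread, with a toy vector standing in for the real column):

X = randn(1_000_000)               # toy stand-in for the real column

g(X) = findall(x -> x > 0, X)      # allocates a fresh index vector on every call

function serial(X, n)
    for _ in 1:n
        g(X)
    end
end

function threaded(X, n)
    Threads.@threads for _ in 1:n
        g(X)
    end
end

serial(X, 1); threaded(X, 1)       # warm up / compile first
@time serial(X, 2000)              # total allocation is the same...
@time threaded(X, 2000)            # ...but all threads feed the same single-threaded GC

The cumulative allocation is identical either way; the threaded version just produces it from many threads at once, and the GC has to clean it all up serially.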

BenchmarkTools.Trial: 
  memory estimate:  35.09 GiB
  allocs estimate:  59882
  --------------
  minimum time:     2.778 s (0.00% GC)
  median time:      2.801 s (0.00% GC)
  mean time:        2.801 s (0.00% GC)
  maximum time:     2.825 s (0.00% GC)
  --------------
  samples:          2
  evals/sample:     1

So is this threaded or not? Any chance you can provide a MWE? Right now we’re mainly guessing.

rds = "bla.rds"
objs = load(rds);

function f(df::DataFrame)
    X = df[!, :time]
    return findall(x -> x > 0, X)
end

function foo(objs)
    for i in 1:length(objs)
        df = objs.data[i]
        Threads.@threads for hi in 1:2000
            f(df)
        end
    end
end

@benchmark(foo($objs))

Thanks, but we don’t have any data to run this on. A toy dataset would work.

What sort of memory use are you expecting? How big is the data set?

objs is around 1GB.

So since you iterate 2000 times, if each iteration copied the data you would conceivably expect up to 2000 × 1 GB = 2 TB of memory use?

OK. I guess I'm asking why the DataFrame is not being shared across threads. I thought it was passed by reference?

The same idea in Numba does not blow up the memory.

I guess it is, but you have a bunch of index arrays being allocated in your loop.
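For what it's worth, both halves of that can be checked directly. A small sketch with a toy DataFrame (the :time column just mirrors the code above):

using DataFrames

df = DataFrame(time = randn(10^6))  # toy stand-in for the real data
X = df[!, :time]                    # `!` returns the column itself, no copy
@assert X === df[!, :time]          # same underlying vector: shared, not copied

pred(x) = x > 0
findall(pred, X)                    # warm up so compilation is excluded
@show @allocated findall(pred, X)   # each call still allocates a fresh index vector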

Some toy data with Numba and Julia code would be a big help.

(Gotta run)


Interesting. Do you know whether it is possible, or planned, for each thread to handle its own garbage?

Is there really anything unexpected going on here at all? To me this looks like code that allocates one vector of indices per iteration, so the memory use is the size of the index vector times the number of iterations. If you want lower memory consumption you need to reuse memory between iterations; see the sketch below.
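For example (a sketch of mine, not tested against the real data): give each thread its own preallocated index buffer and refill it in place, so the hot loop stops allocating after the first pass.

function f!(buf::Vector{Int}, X::AbstractVector)
    empty!(buf)                    # resets length but keeps capacity
    @inbounds for i in eachindex(X)
        X[i] > 0 && push!(buf, i)
    end
    return buf
end

function foo_reuse(X, niter)
    bufs = [Int[] for _ in 1:Threads.nthreads()]  # one buffer per thread
    Threads.@threads :static for hi in 1:niter    # :static pins each task to a thread,
        f!(bufs[Threads.threadid()], X)           # so indexing by threadid() is safe
    end
end

After the first iteration each buffer has grown to full size, so later calls to push! reuse the existing capacity and the GC has nothing left to do.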

What do the Numba benchmarks look like? Are you sure they are reporting the same thing as here?

Cutting out the DataFrame and just working with vectors gives me the same memory use.


What’s a good way to get equivalent benchmarks in Numba?

Also, I ran foo() with just a single array, and the memory is still blowing up:

function f(time::Array{Float64,1})
    return findall(x->x>0, time)
end

function foo(time::Array{Float64,1})
    Threads.@threads for hi in 1:2000
        f(time)
    end
end

@benchmark foo($(objs.data[1][!, :time]))

The array has 1,631,339 rows.

I’m running the equivalent code in Numba as follows:

import numpy as np
from numba import jit, prange

@jit(nopython=True)
def f(time):
    return np.where(time > 0)[0]

@jit(nopython=True, parallel=True)
def foo(time):
    for h in prange(2000):
        f(time)

foo(df['time'].values)

I’m not sure how to get the equivalent of @benchmark in Numba, but looking at htop, the memory doesn’t blow up anywhere near what it does with Julia.

This is not a Jupyter notebook thing either; running a pure Julia script gives the same result.

What do you mean by this? The number reported is the sum of all allocations done during the execution. So if you disabled the GC completely you would have 32 GB of memory allocated. Fortunately, we do have a GC.
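And the arithmetic roughly checks out (a back-of-the-envelope of mine, assuming most of the 1,631,339 entries are positive):

n = 1_631_339               # rows in the column, from above
per_call = n * sizeof(Int)  # each findall result is a Vector{Int}: ~12.4 MiB
total = 2000 * per_call     # ≈ 24 GiB allocated cumulatively over the loop,
                            # the same ballpark as the 35 GiB estimate above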

The RES memory in htop increases and doesn’t go back to what it was before I ran the function. I’m not seeing this in Numba.
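One thing that might be worth ruling out (my suggestion, not from the thread): force a full collection and check RES again, since the GC can hold on to freed pages rather than returning them to the OS immediately.

GC.gc()   # force a full collection, then compare RES in htop again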

Could you post your code as text instead of images so that people that want to try it out don’t have to type the whole thing in manually?
