Threads memory allocations

squirrel · January 22, 2020, 8:07pm

Hi. I’m really trying to understand parallel programming in Julia. Can someone please explain to me why using threads requires so many memory allocations?

using BenchmarkTools
using .Threads

function test_serial(u)
    for i in eachindex(u)
        u[i] = threadid()
    end
end

function test_threads(u)
    @threads for i in eachindex(u)
        u[i] = threadid()
    end
end

u = zeros(Int64, 1000)

julia> @benchmark test_serial($u)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     336.647 ns (0.00% GC)
  median time:      347.059 ns (0.00% GC)
  mean time:        373.924 ns (0.00% GC)
  maximum time:     922.167 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     221

julia> @benchmark test_threads($u)
BenchmarkTools.Trial:
  memory estimate:  2.72 KiB
  allocs estimate:  29
  --------------
  minimum time:     8.099 μs (0.00% GC)
  median time:      9.700 μs (0.00% GC)
  mean time:        10.360 μs (0.00% GC)
  maximum time:     136.400 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1

affans · January 22, 2020, 8:55pm

I am not sure if that’s “so many allocations”, but as far as I know, this is just the overhead of using the parallel threading library.
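One way to check that this is a fixed per-call overhead is to compare the allocations for a small and a large array. The sketch below reuses `test_threads` from the question and uses Base's `@allocated` instead of BenchmarkTools; the array sizes are arbitrary choices for illustration. If the overhead interpretation is right, the two numbers should be of similar size rather than growing with the array length.

```julia
using Base.Threads

# Same function as in the original question.
function test_threads(u)
    @threads for i in eachindex(u)
        u[i] = threadid()
    end
end

test_threads(zeros(Int64, 10))   # warm up so compilation is not measured

# Sizes chosen only for illustration.
for n in (1_000, 1_000_000)
    u = zeros(Int64, n)
    bytes = @allocated test_threads(u)
    println("length ", n, ": ", bytes, " bytes allocated")
end
```

The exact byte counts depend on the Julia version and on how many threads Julia was started with, so no expected output is given here.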

pbayer · January 22, 2020, 9:03pm

The `@threads` for loop allocates tasks and waits for them to finish, so the comparison essentially reduces to:

julia> @benchmark threadid()
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     4.150 ns (0.00% GC)
  median time:      4.282 ns (0.00% GC)
  mean time:        4.453 ns (0.00% GC)
  maximum time:     37.888 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

vs

julia> @benchmark ((t = @async threadid()); fetch(t))
BenchmarkTools.Trial: 
  memory estimate:  704 bytes
  allocs estimate:  7
  --------------
  minimum time:     14.258 μs (0.00% GC)
  median time:      16.535 μs (0.00% GC)
  mean time:        18.000 μs (0.00% GC)
  maximum time:     97.452 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1

Multiply that per-task cost by the number of threads you are running to get roughly the memory estimate you are seeing.
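To see the per-task cost directly, here is a minimal sketch that splits the loop into a chosen number of tasks by hand and reports the allocations for each count. `fill_chunked!` is a hypothetical helper written for this illustration; it only approximates what `@threads` does internally and is not the actual implementation. It uses `Threads.@spawn` (Julia 1.3 or later) and Base's `@allocated`.

```julia
using Base.Threads

# Hypothetical helper: fill `u` using `ntasks` tasks, one per chunk of indices.
function fill_chunked!(u, ntasks)
    chunks = Iterators.partition(eachindex(u), cld(length(u), ntasks))
    tasks = map(chunks) do idxs
        @spawn for i in idxs
            u[i] = threadid()
        end
    end
    foreach(wait, tasks)   # block until every task has finished
    return u
end

u = zeros(Int64, 1000)
fill_chunked!(u, 1)        # warm up so compilation is not measured

for n in (1, 2, 4)
    bytes = @allocated fill_chunked!(u, n)
    println(n, " task(s): ", bytes, " bytes allocated")
end
```

If the explanation above holds, the reported bytes should grow roughly in proportion to the number of tasks, while the loop body itself contributes nothing.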
