Running task on background thread slows performance and increases jitter

I’ve been testing to see if Julia is appropriate for recording high volumes of market data arriving over the network with precise timestamps.

I ran the simple loop below to simulate parsing strings and to create a little garbage-collection overhead. The results are okay when run on the main thread, but if I use @spawn, performance decreases and (more importantly for my use case) the maximum execution time for each iteration increases significantly.

Is there any GC tuning I can do to decrease the jitter? Should I file a bug regarding the performance, or is this expected?

using Dates

function f()
    x = 0.0
    t0 = now()
    t1 = now()
    t2 = now()
    mdt = t0 - t0   # zero-length Millisecond; tracks the maximum per-iteration time
    for _ in 1:10^8
        t1 = now()
        # round-trip through a String to simulate parsing and generate some garbage
        x += parse(Float64, string(rand()))
        t2 = now()
        mdt = max(mdt, t2 - t1)
    end
    println(x)
    println("Total time: ", t2 - t0)
    println("Maximum step time: ", mdt)
end

f()
f()
Threads.@spawn f()

julia> include("/tmp/t.jl")
4.9995243654819675e7
Total time: 40740 milliseconds
Maximum step time: 1 millisecond

5.000607598882865e7
Total time: 40722 milliseconds
Maximum step time: 1 millisecond

Task (runnable) @0x00007fc22c0ec4f0
5.0000995346553124e7
Total time: 53772 milliseconds
Maximum step time: 8 milliseconds
julia> versioninfo()
Julia Version 1.5.0-DEV.609
Commit 8a55a27ea7 (2020-04-10 01:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 8

EDIT:
I ran the same test with my current software solution (kdb+/q) on the same server: its overall performance is 2.5x slower than Julia's, but its maximum jitter is only 0.2 milliseconds.


parse(Float64, string(rand())) is what's creating your garbage. I don't think the issue is @spawn; try reversing the order of your runs:

task = Threads.@spawn f()
wait(task)
f()
f()
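
As a quick sanity check on where the garbage comes from, @allocated should show that the string round-trip allocates on every call while rand() alone does not (run each line twice so compilation allocations are excluded); a minimal sketch:

@allocated parse(Float64, string(rand()))   # allocates the temporary String each call
@allocated rand()                           # should report 0 bytes once compiled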

I suspect the garbage collector just happens to kick in during the Threads.@spawn execution. Alternatively, you can try something like:

GC.gc()
f()
GC.gc()
f()
GC.gc()
Threads.@spawn f()

Just to ensure that the previous run isn’t causing issues with the next run…
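
If GC pauses do turn out to be the source of the jitter, another knob worth experimenting with is GC.enable, i.e. disabling the collector around a latency-sensitive section and collecting explicitly outside it. This is only a rough sketch under the assumption that the section's allocations fit in memory; for the full 10^8-iteration loop above it would almost certainly use too much RAM:

GC.gc()            # start from a clean slate
GC.enable(false)   # no collections inside the timed section (memory will grow)
f()
GC.enable(true)
GC.gc()            # pay the collection cost outside the timed section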


You’re right. I had assumed that because all 3 runs triggered GC the comparison would be fair, but it seems that after several runs a larger, and much slower, GC pass is triggered.
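
For anyone trying to reproduce this, wrapping each run in @time should make the effect visible, since @time also reports the percentage of time spent in GC; a minimal sketch (fetch just waits for the spawned task to finish):

@time f()
@time f()
@time fetch(Threads.@spawn f())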