How difficult is it to write allocation-free code to avoid GC pauses?

I’m doing almost all of my analysis in Julia, so it would be nice if I could avoid rewriting some of my algorithms in another language for production. The problem is that in production I have latency requirements that basically mean I can’t have pauses longer than ~1 millisecond, which is below the time it takes for Julia’s GC to run.

It’s fairly easy for me to write my quantitative algorithms in a way that avoids allocations (I’ve already done this to maximize throughput), but I’m not sure how easy it is for the other bits: reading from/writing to a websocket and writing to a binary log file on disk. Is it possible to do this with standard Julia libraries, or will I need to write custom Julia or C/C++ code?

Can I put the GC in a debug mode to print notifications that it’s been triggered?

EDIT: I should clarify: it’s ok to allocate or GC at startup or when opening new network connections, which stay open for hours. The main requirement is to consistently respond quickly to websocket messages after everything is open.

I feel like this is too open-ended. It depends on which parts of the standard library and which third-party libraries you want to use. If you only call methods you defined yourself, then it’s “easy” to write code that doesn’t trigger the garbage collector.

If you do want to use a method you didn’t write, then you will have to look at that function’s source to see what kind of memory allocations it performs. The @benchmark macro from the BenchmarkTools package will probably be your friend here, since it reports whether the method allocated anything.
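For example, a quick sketch (the function here is made up; your numbers will differ):

using BenchmarkTools

f(n) = sum(abs2, rand(n))   # allocates a temporary vector on each call

@benchmark f(1000)          # the trial output includes "memory estimate" and "allocs estimate"

@allocated f(1000)          # or just get the bytes allocated by a single call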

The other option is to start Julia with the --track-allocation flag. Once the program exits, it writes out per-line information about which lines allocated memory and how much.
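Roughly like this (a sketch; the script name is just a placeholder):

$ julia --track-allocation=user my_script.jl
# after the process exits, a my_script.jl.mem file appears next to the source,
# with the bytes allocated by each line shown in the left margin

And to answer the debug-mode question: on recent Julia versions (1.8+, if I remember correctly) you can also do

julia> GC.enable_logging(true)

which prints a line to stderr every time a collection runs.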

Yeah, after a bit more thought, I’m pretty sure I’m going to need to write this in C++ or Rust. I’ll be parsing nested JSON arrays, which seems impossible to do without allocating unless I write my own JSON library.

It definitely sounds like Julia doesn’t provide the guarantees you need. Ensuring that your responses are never delayed by more than 1 ms is just not something the garbage collector can promise. I haven’t looked closely at what guarantees other garbage collectors provide, but I doubt any of them can give you a hard bound like that.

Go’s GC would be plenty fast. They’ve gotten pauses down to around 100-200 microseconds.

Some interesting numbers:

julia> @benchmark begin a=zeros(1000000); a=0; end
BenchmarkTools.Trial: 
  memory estimate:  7.63 MiB
  allocs estimate:  2
  --------------
  minimum time:     800.316 μs (0.00% GC)
  median time:      1.018 ms (0.00% GC)
  mean time:        1.128 ms (7.15% GC)
  maximum time:     2.726 ms (34.31% GC)
  --------------
  samples:          4402
  evals/sample:     1

julia> @benchmark begin a=zeros(1000000); a=0;GC.gc() end
BenchmarkTools.Trial: 
  memory estimate:  7.63 MiB
  allocs estimate:  2
  --------------
  minimum time:     62.242 ms (98.74% GC)
  median time:      65.778 ms (98.66% GC)
  mean time:        66.492 ms (98.70% GC)
  maximum time:     77.300 ms (98.66% GC)
  --------------
  samples:          76
  evals/sample:     1

julia> @benchmark begin a=zeros(1000000); a=0;GC.gc(false) end
BenchmarkTools.Trial: 
  memory estimate:  7.63 MiB
  allocs estimate:  2
  --------------
  minimum time:     914.574 μs (17.75% GC)
  median time:      1.079 ms (18.79% GC)
  mean time:        1.111 ms (19.31% GC)
  maximum time:     2.636 ms (13.62% GC)
  --------------
  samples:          4472
  evals/sample:     1

So, at least on my desktop machine (which isn’t super slow), just allocating 1 million floats takes around 1 ms. A full GC on that obviously takes a long time, but an incremental GC only adds around 0.2 ms.

I don’t know how you’re planning to parse large nested JSON objects, calculate answers, and spit them out on a socket faster than my machine can allocate 1M floats, but if you can figure that out, it seems like you could just call an incremental GC after serving each request and still stay within your soft-real-time budget.
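Something along these lines (just a sketch; receive_message/handle/send_message are placeholder names, not a real websocket API):

while isopen(ws)                  # ws = your long-lived websocket connection
    msg = receive_message(ws)     # placeholder read
    resp = handle(msg)            # your ~10 µs allocation-free calculation
    send_message(ws, resp)        # placeholder write
    GC.gc(false)                  # incremental (non-full) collection while idle between messages
end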

You might try ccall to call an existing JSON parser that does its own memory management, and handle only the quantitative bits in Julia.
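Roughly this pattern (the library name and C signature below are entirely made up; you’d substitute whatever parser you pick):

const LIBPARSER = "libmyjsonparser"          # hypothetical shared library

# Hypothetical C function:
#   int parse_prices(const uint8_t* buf, size_t len, double* out, size_t outlen);
# It writes into caller-provided storage, so nothing is allocated on the Julia side.
function parse_prices!(out::Vector{Float64}, buf::Vector{UInt8})
    n = ccall((:parse_prices, LIBPARSER), Cint,
              (Ptr{UInt8}, Csize_t, Ptr{Float64}, Csize_t),
              buf, length(buf), out, length(out))
    n < 0 && error("parse failed")
    return n                                  # caller reads out[1:n]
end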

Your application sounds like high frequency trading or something?

The tradition for HFT (as I’ve heard it) is to not worry about memory leaks (or GC) and just restart everything every night. You just have to keep your leaks/GC usage low enough to stay within RAM for 8 hours… with an “easy” solution of buying more RAM as your usage grows.

I know the robotics folks have had lots of success getting Julia to hit hard real-time guarantees without such cheats, but parsing variable-length JSON arrays does make this a bit trickier. That said, I think you could (ab)use JSON3’s internals with pre-allocated “tape” vectors to get a long way there (for a rigorously defined and limited JSON structure).
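As a baseline before touching the internals, it’s worth measuring what a stock JSON3.read costs on a representative message, e.g. (a sketch; the payload below is just an example):

using JSON3, BenchmarkTools

msg = """{"bids":[[100.1,2.0],[100.0,5.5]],"asks":[[100.2,1.0]]}"""   # stand-in for a real message

@benchmark JSON3.read($msg)   # the allocs estimate tells you what a plain parse costs per message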


The JSON messages aren’t that large, typically 1-2 KB. The calculations I’ll be running are highly optimized and can be run in around 10 microseconds.

Thanks, I’ll take a look at JSON3’s internals.

I’d really need to get the allocations down to run with the GC totally disabled. Just looking at the current bandwidth usage on this server, I’m seeing almost 100 GB of incoming traffic per day from these messages. I could add more RAM on top of the 32 GB it has, but it’s probably not worth it.
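For completeness, turning the collector off around the hot path is just (a sketch; run_message_loop is a placeholder for the latency-critical websocket loop):

GC.enable(false)        # allocations still happen, but nothing is ever collected
try
    run_message_loop()  # placeholder for the hot loop
finally
    GC.enable(true)     # re-enable off the critical path
    GC.gc()             # and let it catch up
end

but at ~100 GB/day of incoming messages that only works if the per-message allocations really are near zero.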