The threading primitives themselves have to allocate as well, e.g. for the tasks that are spawned to run the work on multiple threads.
I’m also not sure whether @time and TimerOutputs are suitable for stable allocation tracking. Do you see the same behavior with a custom threaded loop and BenchmarkTools?
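
Something like the following is what I have in mind, as a rough sketch rather than a definitive test. The two `fill_squares_*!` functions and the array size are made up for illustration; the only assumption is that BenchmarkTools is installed. Both versions write into a preallocated array, so any bytes reported for the threaded one should come from the threading machinery rather than from the loop body.

```julia
using BenchmarkTools  # external package, assumed to be installed

# Hypothetical kernel: writes in place, so the loop body itself should not allocate.
function fill_squares_serial!(out)
    for i in eachindex(out)
        @inbounds out[i] = i^2
    end
    return out
end

# Same kernel, but the iterations are split across tasks by Threads.@threads.
function fill_squares_threaded!(out)
    Threads.@threads for i in eachindex(out)
        @inbounds out[i] = i^2
    end
    return out
end

out = zeros(Int, 10_000)

# `$out` interpolates the global so that accessing it is not part of the measurement.
# `@ballocated` returns the minimum number of bytes allocated over all samples.
serial_bytes = @ballocated fill_squares_serial!($out)
threaded_bytes = @ballocated fill_squares_threaded!($out)

println("serial:   ", serial_bytes, " bytes")
println("threaded: ", threaded_bytes, " bytes")
```

If the threaded number is small and roughly independent of the array length, that would point to the task creation overhead and not to your actual computation.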