Increased allocations when using threads

Thanks for the suggestion. I have seen this section of the docs about BLAS, but I’m not using any LinearAlgebra arithmetic, so unfortunately it doesn’t apply to my case.