Just a guess, but maybe this is the "performance of captured variables in closures" problem (JuliaLang/julia issue #15276) once again, since the @threads
macro creates a closure. See also "Parallelizing for loop in the computation of a gradient" (post #7 by tkoolen). Check the output of
@code_warntype.
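To make the guess concrete, here is a minimal sketch of the pattern I have in mind. The functions `scale_boxed!` and `scale_let!` are made up for illustration and are not from your code. It assumes the usual lowering behaviour in which a captured variable that is assigned more than once gets wrapped in a `Core.Box`. The `let` rebinding is the common workaround and may or may not be what your case needs.

```julia
using Base.Threads

# Hypothetical example: `a` is assigned twice and then used inside the
# @threads loop body, which is a closure. This is the pattern that
# typically leads to `a` being boxed (shown as Core.Box in @code_warntype).
function scale_boxed!(out, xs)
    a = 1.0
    a = 2.0 * a
    @threads for i in eachindex(xs)
        out[i] = a * xs[i]
    end
    return out
end

# Same computation, but `let a = a` creates a fresh binding that is
# assigned exactly once, so the closure can capture it without a box.
function scale_let!(out, xs)
    a = 1.0
    a = 2.0 * a
    let a = a
        @threads for i in eachindex(xs)
            out[i] = a * xs[i]
        end
    end
    return out
end

xs = collect(1.0:4.0)
scale_boxed!(similar(xs), xs)
scale_let!(similar(xs), xs)

# To inspect (outside the REPL this needs `using InteractiveUtils`):
#   @code_warntype scale_boxed!(similar(xs), xs)
#   @code_warntype scale_let!(similar(xs), xs)
```

Both versions compute the same result; the difference, if any, should show up as a `Core.Box` in the first `@code_warntype` output and in the timings.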