In
Chris Elrod had suggested that the differences arise from non-temporal stores, and that LoopVectorization provides a Julia equivalent.
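
For readers unfamiliar with the term, here is a minimal C sketch of what a non-temporal (streaming) store looks like. It is not the code from the discussion referenced above: the function name `copy_nontemporal` is hypothetical, and the SSE2 intrinsics are just one assumed way of issuing such stores on x86. On targets without SSE2 it falls back to ordinary stores.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_loadu_pd, _mm_stream_pd, _mm_sfence */
#endif

/* Hypothetical helper: copy n doubles from src to dst, writing dst with
 * non-temporal stores where possible. Such stores hint to the CPU that
 * the written data will not be reused soon, so it need not be kept in cache. */
void copy_nontemporal(double *dst, const double *src, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* _mm_stream_pd needs a 16-byte aligned destination, so peel scalar
     * iterations until dst + i is aligned. */
    while (i < n && ((uintptr_t)(dst + i) % 16) != 0) {
        dst[i] = src[i];
        i++;
    }
    /* Main loop: unaligned load of two doubles, streaming store. */
    for (; i + 2 <= n; i += 2) {
        __m128d v = _mm_loadu_pd(src + i);
        _mm_stream_pd(dst + i, v);
    }
    /* Order the streaming stores before any later stores. */
    _mm_sfence();
#endif
    /* Remainder (or everything, when SSE2 is unavailable). */
    for (; i < n; i++) {
        dst[i] = src[i];
    }
}
```

The result is identical to a plain copy; only the cache behaviour differs, which is why this kind of store can show up as a difference in memory-bound benchmarks. On the Julia side, LoopVectorization's `vmapnt!` is, as far as I know, the function that uses this kind of store.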