Autovectorization in Julia 101

I’m trying to read up on autovectorization, but I might be looking in the wrong place:
Auto-Vectorization in LLVM — LLVM 20.0.0git documentation
For one, the whole thing talks about Clang and C code, so I have doubts it’s relevant to Julia. I don’t know the LLVM compiler pipeline all that well to begin with.
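
For context, this is roughly how I would check whether any of it applies. Julia compiles through LLVM, so the optimized IR of a function can be searched for vector types such as `<4 x double>`. This is only a minimal sketch: `add_inplace!` is a made-up example function, and the regex is a rough heuristic of mine, not an official way to detect vectorization. The result will depend on the Julia version and the CPU.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Hypothetical example function: elementwise a[i] += b[i].
# eachindex(a, b) throws if the two arrays do not share indices,
# which is what makes @inbounds safe here.
function add_inplace!(a::Vector{Float64}, b::Vector{Float64})
    @inbounds for i in eachindex(a, b)
        a[i] += b[i]
    end
    return a
end

# Capture the optimized LLVM IR as a string instead of printing it.
ir = sprint(io -> code_llvm(io, add_inplace!,
                            Tuple{Vector{Float64}, Vector{Float64}};
                            debuginfo = :none))

# Heuristic (an assumption): vectorized code mentions types like `<4 x double>`.
found = occursin(r"<\d+ x double>", ir)
println("vector types found in IR: ", found)
```

Interactively, `@code_llvm add_inplace!(a, b)` prints the same IR for reading by eye.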

If I am indeed looking in the right place and this is relevant to Julia, then I have a few questions on top of the big “what even is autovectorization doing in Julia 101”:

> Many loops cannot be vectorized including loops with complicated control flow, unvectorizable types, and unvectorizable calls.

This seems consistent with @simd, but…

> The Loop Vectorizer is able to “flatten” the IF statement in the code and generate a single stream of instructions. The Loop Vectorizer supports any control flow in the innermost loop. The innermost loop may contain complex nesting of IFs, ELSEs and even GOTOs.

…it sounds like somewhat complex branching can be flattened. Why is @simd stricter?
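
To make the question concrete, here is a rough sketch of the kind of comparison I have in mind: the same loop written once with an explicit `if` and once with `ifelse`, which as far as I can tell is the form the `@simd` docstring suggests for straight-line loop bodies. The function names and the `has_vector_ir` helper are made up for illustration, the regex is only a heuristic, and I make no claim about what either line prints on a given machine.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Hypothetical example: zero out negative entries using an explicit branch.
function zero_negatives_branch!(x::Vector{Float64})
    @inbounds for i in eachindex(x)
        if x[i] < 0.0
            x[i] = 0.0
        end
    end
    return x
end

# Same result without a branch: ifelse evaluates both arguments and picks one,
# so the loop body is straight-line code.
function zero_negatives_select!(x::Vector{Float64})
    @inbounds for i in eachindex(x)
        x[i] = ifelse(x[i] < 0.0, 0.0, x[i])
    end
    return x
end

# Heuristic helper (an assumption): look for types like `<4 x double>` in the IR.
has_vector_ir(f) = occursin(r"<\d+ x double>",
    sprint(io -> code_llvm(io, f, Tuple{Vector{Float64}}; debuginfo = :none)))

println("branch version: ", has_vector_ir(zero_negatives_branch!))
println("select version: ", has_vector_ir(zero_negatives_select!))
```

If both lines print `true`, LLVM flattened the `if` on its own, which is what makes me wonder why `@simd` is documented as stricter.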

> The loop vectorizer uses a cost model to decide on the optimal vectorization factor and unroll factor.

Isn’t this what LoopVectorization.jl does? I assumed LLVM wasn’t doing this for us, hence our need for that package.

If I remember correctly, the author of LoopVectorization.jl once said on this forum that LLVM’s cost model is simplistic and that LoopVectorization.jl does a lot more to assess the cost accurately.

cc @Elrod