A macro to unroll by hand but not by hand?

Elrod · January 15, 2022, 9:37pm

I actually already get segfaults for M=10.
If I increase ThreadingUtilities.THREADBUFFERSIZE to 1024, it works fine.

Basically, @tturbo is trying to store more variables in the buffer than actually fit, causing the crashes.
For each variable, it requires a number of bytes equal to the SIMD width. That is 64 for AVX512, or 32 for AVX2.
It also needs some of the storage for other things, so I’d expect only 7 variables for AVX512 and 15 for AVX2. I’d have to double check, but if it’s already crashing for M=15, I guess it’s using a bit more than that.
You could increase the buffer.
Or maybe, because it’s possible to check at compile time whether everything fits, @tturbo could perform that check itself.
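
As a rough back-of-the-envelope sketch of that estimate: the numbers above are consistent with a 512-byte buffer in which about one vector-width slot goes to bookkeeping. Both of those are assumptions inferred from the counts in this post, not checked against the ThreadingUtilities source, and `max_buffered_vars` and `reserved_bytes` are hypothetical names, not part of either package.

```julia
# Hypothetical helper for illustration only; not an API of
# ThreadingUtilities.jl or LoopVectorization.jl.
# buffer_bytes   : size of the per-thread buffer in bytes (512 is assumed here)
# simd_bytes     : bytes per stored variable (64 for AVX512, 32 for AVX2)
# reserved_bytes : assumed bookkeeping overhead, taken to be one SIMD slot
function max_buffered_vars(buffer_bytes, simd_bytes; reserved_bytes = simd_bytes)
    return (buffer_bytes - reserved_bytes) ÷ simd_bytes
end

avx512_default = max_buffered_vars(512, 64)   # 7
avx2_default   = max_buffered_vars(512, 32)   # 15
avx512_bigger  = max_buffered_vars(1024, 64)  # 15
```

Under these assumptions, doubling the buffer to 1024 bytes roughly doubles the number of variables that fit, which would match M=10 working after the change. The real overhead may be larger, as noted above.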

The fundamental problem with LoopVectorization.jl here is that it currently only handles “loop independent” dependencies and not “loop carried” dependencies aside from reductions.
Loop independent dependencies are those that exist only within a single loop iteration, and thus don’t stop you from reordering the iterations arbitrarily.
Loop carried are those between iterations.
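
To make the distinction concrete, here is a minimal plain-Julia sketch (no LoopVectorization involved). The function names are made up for illustration and are not taken from the original question.

```julia
# Loop independent dependency: `t` is produced and consumed inside the same
# iteration, so iterations can run in any order (or as SIMD lanes).
function scale_shift!(y, x)
    for i in eachindex(x, y)
        t = 2 * x[i]
        y[i] = t + 1
    end
    return y
end

# Loop carried dependency: iteration i reads y[i-1], which iteration i-1
# wrote, so the iterations cannot be reordered freely.
function prefix_sum!(y, x)
    isempty(x) && return y
    y[1] = x[1]
    for i in 2:length(x)
        y[i] = y[i-1] + x[i]
    end
    return y
end

# Reduction: also carried between iterations (through `s`), but it is the
# special case mentioned above as already being handled.
function total(x)
    s = zero(eltype(x))
    for i in eachindex(x)
        s += x[i]
    end
    return s
end
```

Only the first and third patterns fall into what is described above as currently supported; a loop shaped like `prefix_sum!` is the kind that would need loop carried dependency support.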

If it handled loop carried dependencies, you could write the loading/storing to rez as part of the loop and still get correct answers. It should then also be able to optimize the situation better when M is large (by not unrolling it entirely).
I’m working on support for this, but I’d expect it to take a long time, as it is part of a ground up rewrite of the library.
