Keep in mind that I measured this tiny function to be the largest bottleneck in my 12,000-line codebase (how I wish this were purely for entertainment). But otherwise I completely agree!
Thanks for the info about fma not being available on some systems; I will look into it. That could be why the code did not seem as fast on GitHub Actions as on my laptop.