C implementation of function being ~4 times faster even in the absence of allocs

Palli · March 5, 2025, 10:19am

So now it’s way faster than that C code (why?), but I’m not even sure that’s the limit of the speedup.

I was thinking a naive_findmax could go in Base, or live under findmax behind a keyword argument (and every time Base isn’t the fastest for something, the docs could explain the faster option available, without taking edge cases into account). But this seems to be a deep rabbit hole, and it may be even faster here:

At least document in their docstrings that findmax/findmin aren’t the fastest, and point to alternatives (if not simply adding the fastest one), likely FindFirstFunctions.jl?
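
To make the "without taking edge cases into account" part concrete, here is a minimal sketch of what a naive_findmax might look like. The body is my assumption, not the code benchmarked earlier in the thread and not anything that exists in Base. It uses a plain `>` comparison, so unlike Base's findmax it does not treat NaN as the maximum, and its result depends on where a NaN sits in the input.

```julia
# Hypothetical sketch: a plain-loop findmax that skips the edge-case
# handling Base.findmax does (NaN ordering, missing, signed zeros).
function naive_findmax(v::AbstractVector)
    isempty(v) && throw(ArgumentError("collection must be non-empty"))
    best_i = firstindex(v)
    best = v[best_i]
    # eachindex(v) only yields valid indices, so @inbounds is safe here.
    @inbounds for i in eachindex(v)
        x = v[i]
        # Strict `>` keeps the first maximal element on ties,
        # and is always false when either side is NaN.
        if x > best
            best = x
            best_i = i
        end
    end
    return best, best_i
end

naive_findmax([1.0, 3.0, 2.0])   # (3.0, 2), same as findmax
naive_findmax([1.0, NaN, 3.0])   # (3.0, 3), whereas findmax gives (NaN, 2)
```

The second call shows the kind of caveat a docstring note would need to spell out before pointing people to a faster alternative.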
