Is it possible to know how far a function is from its theoretical performance limit? The limit could be set by the hardware, by a fast reference language like C, or by Julia itself.

The reason I ask is that sometimes I wish I could have a target to aim at and know where I am. For example, if my target is to get 80% of the theoretical performance limit, then I know exactly when to stop and can better allocate my time.

Performance is a lot more complicated than you think. Writing something in C does not automatically guarantee “optimal” performance, or anything even close to it.

Different C implementations of the “same” algorithm (the same number of arithmetic operations) for the same function can differ in performance by orders of magnitude. See, for example, these notes on matrix-multiplication performance or these FFT benchmark results (all of which are in C or Fortran, and all of which have the same number of arithmetic operations to within ≈25%).
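To make that concrete, here is a minimal sketch (not taken from the linked notes) of two C matrix multiplications that perform exactly the same n³ multiplications and n³ additions and differ only in loop order. The names `matmul_ijk`, `matmul_ikj`, and `time_matmul` are made up for illustration, and the timer uses `clock()`, which reports CPU time and is only a rough measure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* C = A * B for n x n row-major matrices, loops ordered i, j, k.
   The inner loop reads B with stride n (down a column). */
void matmul_ijk(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += A[i*n + k] * B[k*n + j];
            C[i*n + j] = sum;
        }
}

/* Same product and same operation count, loops ordered i, k, j.
   The inner loop touches B and C with stride 1 (along a row). */
void matmul_ikj(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n*n; i++)
        C[i] = 0.0;
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++) {
            double a = A[i*n + k];
            for (size_t j = 0; j < n; j++)
                C[i*n + j] += a * B[k*n + j];
        }
}

typedef void (*matmul_fn)(size_t, const double *, const double *, double *);

/* Hypothetical helper: CPU seconds for one call of f on n x n inputs,
   or -1.0 if allocation fails. A single run is noisy; repeat and take
   the minimum for anything you intend to report. */
double time_matmul(matmul_fn f, size_t n)
{
    assert(n > 0);
    double *A = malloc(n*n * sizeof *A);
    double *B = malloc(n*n * sizeof *B);
    double *C = malloc(n*n * sizeof *C);
    if (!A || !B || !C) {
        free(A); free(B); free(C);
        return -1.0;
    }
    for (size_t i = 0; i < n*n; i++) {
        A[i] = (double)(i % 7);
        B[i] = (double)(i % 5);
    }
    clock_t t0 = clock();
    f(n, A, B, C);
    clock_t t1 = clock();
    volatile double sink = C[n*n - 1];  /* keep the result observable */
    (void)sink;
    free(A); free(B); free(C);
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling `time_matmul(matmul_ijk, n)` and `time_matmul(matmul_ikj, n)` for a large `n` (and with optimization enabled) lets you compare the two orderings on your own machine. Any gap you see comes from memory access patterns rather than arithmetic, and neither version should be mistaken for the limit: an optimized BLAS is typically much faster than both.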

The best thing you can do, in my experience, is to scour the web for other highly optimized implementations of the same (or similar) functions and benchmark them against your own code (as fairly as you can, which can take some effort).

Of course, you should also read the Julia performance tips and avoid the obvious pitfalls, like excessive allocations.