Disable fast-math for a specific function


I’ve written a large numeric simulation that benefits substantially from starting Julia with --math-mode=fast (79% speed-up). I know this flag can be dangerous, but up until now, everything has worked well.

I’d now like to add Bayesian Optimization to tune a few parameters. Unfortunately, the Gaussian Process MLE breaks when I use explicit parameter bounds with fast-math enabled. Performance is not important for the Gaussian Process – I just need it to work.

Is there a way to disable fast-math for only the call to the Bayesian Optimization function?

I’ve tried switching to explicit @fastmath in the obvious hot spots of my code, but the global --math-mode=fast flag still gives an additional 43% speed-up on top of that.
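For reference, this is the sort of local annotation I mean — a minimal sketch, where kernel! and the loop body are made up:

```julia
# @fastmath only rewrites the expressions lexically inside the block;
# it does not propagate into functions called from within it.
function kernel!(out, x)
    @fastmath @inbounds for i in eachindex(x, out)
        out[i] = x[i] * x[i] + 0.5 * x[i]
    end
    return out
end
```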


I don’t know of an easy way to do this. Looking at codegen (intrinsics.cpp), it seems that codegen just uses the global flags that are set when Julia starts, and the @fastmath macro is a strictly local transformation.

Perhaps the best you can do here is to start two julia instances; one to do the outer loop with Bayesian optimization (fastmath off) and another (or several) worker processes to run the simulation with fastmath on? This might be a good setup in any case, as with multiple cores/machines you’ll then be able to run several simulations in parallel to feed back into the Bayesian optimization.
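As a rough sketch of that setup using Distributed.jl — the worker count and run_simulation are placeholders:

```julia
using Distributed

# Workers are launched with fast-math enabled, while the master process
# (which runs the Bayesian optimization) keeps default IEEE semantics.
addprocs(2; exeflags="--math-mode=fast")

# Define the simulation on all workers.
@everywhere function run_simulation(params)
    # ... the expensive simulation goes here; this body is a stand-in.
    return sum(abs2, params)
end

# The master farms out simulation runs and collects the results
# to feed back into the optimizer.
results = pmap(run_simulation, [rand(3) for _ in 1:8])
```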


Explicit @fastmath seems like by far the best option to me, if you could achieve about the same performance. I’m curious why you can’t… is it because --math-mode=fast applies to code within Base and other packages that you can’t reach with @fastmath?


I’ll need to profile more carefully to see where the differences are. I haven’t tried @fastmath with every function yet, so it’s possible I’ve missed some, or like you said, there’s some code in another package that is affected.

I will be using remote workers, so I may just do that: run a master without fast-math and have all the slaves use fast-math.


The problem with @fastmath is that it doesn’t compose, so it could be a pretty big burden to add it in enough places, especially if you’re using external libraries.
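For example (helper and caller are made-up names), the annotation stops at function-call boundaries:

```julia
# helper is compiled separately, with ordinary IEEE semantics,
# even when called from inside a @fastmath block.
helper(x) = x / x

function caller(x)
    @fastmath begin
        a = x / x        # this division IS rewritten to a fast-math intrinsic
        b = helper(x)    # the division inside helper is NOT
        return a + b
    end
end
```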


It would be interesting to profile with and without fast-math and see if something specific sticks out.


Almost all optimizations that fastmath allows can be done manually by some combination of:

  1. explicit algebraic simplification of expressions,
  2. @simd annotations to allow floating-point re-association across loop iterations,
  3. use of muladd to compute a*b + c in a single fused operation.

It’s definitely some work, but if you can identify which functions are sped up by --math-mode=fast, you can speed them up manually. It’s probably just a handful of functions making most of the difference. My guess is that there are a few loops that need @simd annotations in order to vectorize.
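As a sketch of points 2 and 3 combined, assuming the hot spot is a reduction loop (dot_manual is a made-up example, not code from the simulation):

```julia
function dot_manual(x, y)
    s = 0.0
    # @simd permits re-associating the accumulation across iterations,
    # and muladd lets LLVM emit a fused multiply-add — the two main
    # transformations fast-math would otherwise have to legalize.
    @inbounds @simd for i in eachindex(x, y)
        s = muladd(x[i], y[i], s)
    end
    return s
end
```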