I’ve written a large numeric simulation that benefits substantially from starting Julia with --math-mode=fast (a 79% speedup). I know this flag can be dangerous, but up until now everything has worked well.
I’d now like to add Bayesian Optimization to tune a few parameters. Unfortunately, the Gaussian Process MLE breaks when I use explicit parameter bounds with fastmath enabled. Performance is not important for the Gaussian Process – I just need it to work.
Is there a way to disable fastmath for only the call to the Bayesian Optimization function?
I’ve tried switching to explicit @fastmath in the obvious parts of my code, but the global --math-mode=fast flag still gives me a 43% speedup on top of that.
I don’t know of a way to do this easily. Looking at codegen (intrinsics.cpp), it seems that codegen just uses the global flags that are set when Julia starts, and the @fastmath macro is a strictly local transformation.
Perhaps the best you can do here is to start two Julia instances: one to run the outer loop with Bayesian optimization (fastmath off) and another (or several) worker processes to run the simulation with fastmath on. This might be a good setup in any case: with multiple cores/machines you’ll then be able to run several simulations in parallel to feed back into the Bayesian optimization.
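A rough sketch of what that could look like with Distributed (the file and function names here are just placeholders, not from your code):

```julia
using Distributed

# The master is started *without* --math-mode=fast, so the Gaussian Process /
# Bayesian optimization code runs with strict IEEE semantics; the workers are
# launched with the flag, so the simulation keeps its speedup.
addprocs(4; exeflags = "--math-mode=fast")

# Hypothetical file defining run_simulation(params), loaded on every process.
@everywhere include("simulation.jl")

# Evaluate one parameter set on a fast-math worker; only the result travels
# back to the (IEEE) master running the Bayesian optimization.
objective(params) = remotecall_fetch(run_simulation, first(workers()), params)
```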
Explicit @fastmath seems like by far the best option to me, if you could achieve about the same performance. I’m curious why you can’t… is it because --math-mode=fast applies to code within Base and other packages that you can’t reach with @fastmath?
I’ll need to profile more carefully to see where the differences are. I haven’t tried @fastmath with every function yet, so it’s possible I’ve missed some, or, as you said, there is some code in another package that is affected.
I will be using remote workers anyway, so I may just do that: run the master without fastmath and have all the workers use fastmath.
The problem with @fastmath is that it doesn’t compose, so it could be a pretty big burden to add it in enough places, especially if you’re using external libraries.
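For example, the macro only rewrites the operations written literally inside the annotated expression; nothing in the functions it calls is affected (a small sketch to illustrate, not from your code):

```julia
inner(x) = x / 3.0 + x / 3.0 + x / 3.0   # still compiled with strict IEEE semantics

function outer(x)
    # Only the operations spelled out here (the call and the multiplication)
    # are rewritten to fast-math intrinsics; the body of `inner` is untouched.
    @fastmath inner(x) * 2.0
end
```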
Would be interesting to profile with and without fastmath and see if there is something specific that sticks out.
Almost all optimizations that fastmath allows can be done manually by some combination of:

- explicit algebraic simplification of expressions,
- @simd annotations to allow floating-point reassociation across loop iterations,
- use of muladd to compute a*b + c in a single operation.
It’s definitely some work, but if you can identify which functions are sped up by --math-mode=fast, then you can speed them up manually. It’s probably just a handful of functions that are making most of the difference. My guess is that there are a few loops that need @simd annotations in order to vectorize; a sketch of that kind of rewrite is below.
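A minimal sketch of what such a manual rewrite could look like (a made-up inner loop, not code from your simulation):

```julia
# Dot-product-style loop: @simd allows the additions to be reassociated across
# iterations so the loop can vectorize, and muladd lets the compiler emit a
# fused multiply-add where the hardware supports it, without global fast-math.
function dot_simd(a::Vector{Float64}, b::Vector{Float64})
    s = 0.0
    @simd for i in eachindex(a, b)
        @inbounds s = muladd(a[i], b[i], s)
    end
    return s
end
```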