I was surprised to see results that are significantly different (by much more than I would expect from normal -ffast-math-style optimizations) when running julia with --math-mode=fast.
It seems like a very rough approximation is being used for exp() (about 0.1% relative error for exp(1.0)). Is this documented anywhere? And why don't I see the same behavior with @fastmath or when calling Base.Math.exp_fast?
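One way to see the size of the error is to compare against a higher-precision reference (run this both with the default mode and with `julia --math-mode=fast`; this snippet is just for illustration):

```julia
# Relative error of Float64 exp(1.0) against a BigFloat reference.
ref = Float64(exp(big(1.0)))
println(abs(exp(1.0) - ref) / ref)  # roughly 1e-3 under --math-mode=fast, per the above
```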
The beginning of the code for exp looks something like:
```julia
const a = 369.3299304675746
const b = 6.755399441055744e15

function f(x::Float64)
    z = muladd(x, a, b)
    z -= b
end
```
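(For context on why that snippet rounds, as I understand it: b is exactly 1.5 * 2^52, so for the values of interest the ulp of x*a + b is 1.0; the muladd therefore rounds x*a to the nearest integer, and subtracting b recovers that integer.)

```julia
julia> 6.755399441055744e15 == 1.5 * 2^52   # the "magic" rounding constant
true
```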
Without --math-mode=fast this gives the result 369.0; with it, it gives 369.3299304675746. Looking at the generated code, in fast mode, it has simplified this code to just x*a:
```
julia> @code_llvm f(1.0)
...
  %1 = fmul fast double %0, 0x40771547652B82FE
  ret double %1
}
```
That seems unfortunate. Is there any way to add something like @nofastmath to julia’s implementation of exp()?
I see a similar constant and muladd in the glibc implementation, but that generally won't be compiled with -ffast-math. Also, for some reason, julia's implementation is 24% slower than glibc's, even with the muladd elided.
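For the timing comparison, something along these lines should work (a sketch; it assumes BenchmarkTools is installed and that `:exp` in a ccall resolves to the system libm):

```julia
using BenchmarkTools

# Call the C library's exp directly, to compare against Julia's pure-Julia exp.
libm_exp(x::Float64) = ccall(:exp, Float64, (Float64,), x)

x = 1.2345
@btime exp($x)       # Julia's implementation
@btime libm_exp($x)  # glibc / system libm
```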
I mean, the whole point of fast-math is trading correctness for speed. If fast-math always gave the correct results, it wouldn't be fast-math, it would be the standard way of doing math.
Right, but generally when I use fast-math, I can reason about my own code and conclude that doing something like swapping the order of some operations will have a minimal impact on my results.
But here we have a standard library function that’s severely affected to the point that it’s unusable for many purposes. And many people who aren’t familiar with julia’s internals may assume that it’s calling glibc math functions, as most languages do.
These errors are orders of magnitude worse than what one would get from other common operations on numbers in a reasonable range:
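For example, re-associating a sum usually only changes the result in the last bit or two:

```julia
julia> (0.1 + 0.2) + 0.3
0.6000000000000001

julia> 0.1 + (0.2 + 0.3)
0.6
```

That is a relative difference of about 2e-16, versus the roughly 1e-3 seen for exp(1.0) above.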
Oh, that actually should be fixable. Rewriting that to unsafe_trunc should retain almost all the performance and remove the ability of fastmath to mess it up.
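Roughly, the idea would be to make the rounding explicit instead of relying on the add/subtract trick, something like this (a sketch of the idea, not the actual patch):

```julia
# Sketch only: round x*a explicitly, so fast-math re-association has nothing to cancel.
function f2(x::Float64)
    r = round(x * 369.3299304675746)  # what the muladd/subtract trick computes
    return r, unsafe_trunc(Int64, r)  # integer part that an exp kernel would use downstream
end
```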
More generally, however, I don’t think @fastmath should be applied to inlined code from other functions. There are just way too many pitfalls there.
Basically, @fastmath is used by people who think "this code is insensitive to rearrangements that are exact in exact arithmetic (e.g. re-association)", but it is not reasonable to ask anyone to make that judgement for code they can't even see (because it is inlined from somewhere else).
Update: ~~filed Issues · JuliaLang/julia · GitHub~~ Sorry, my misunderstanding: it seems that @fastmath already doesn't apply to inlined code.
Yes, fast math doesn’t apply to inlined code (which also makes it more annoying to use).
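The mechanism, as I understand it, is purely syntactic: @fastmath rewrites the operators it can see in the expression into their Base.FastMath variants, but a call to another function is left alone, so that function's body keeps ordinary IEEE semantics even if it later gets inlined. A small illustration (names are just examples):

```julia
g(a, b, c) = (a + b) - b + c

# Operators written directly under the macro become add_fast / sub_fast:
h(a, b, c) = @fastmath (a + b) - b + c

# The call to g is not rewritten; g's body is compiled without fast-math flags:
k(a, b, c) = @fastmath g(a, b, c)
```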
I think the general agreement is that math-mode=fast is currently massively broken, as this thread shows, and should never be used. I would imagine the view of core devs is "why is this horrid thing still here, let's kill it", which is a pity because it could actually be useful with some care. For that to happen, we would need a @nofastmath macro and to apply it judiciously to functions that do floating point trickery (e.g. special summations, implementations of special functions, etc.). If that were done, it should be relatively OK, in the sense that all bets are still off and arbitrary loss of precision is always possible, but it would have a pretty good chance of not losing too many digits in practice. One thing I'm not sure about is how that interacts with precompilation, though.
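To make the "floating point trickery" concrete: a textbook compensated (Kahan) summation relies on `(t - s) - y` not being simplified to zero, which is exactly the kind of re-association fast-math permits, so a function like this is the sort of thing that would want @nofastmath protection:

```julia
# Kahan (compensated) summation: fast-math re-association is allowed to turn
# c = (t - s) - y into 0.0, silently removing the compensation.
function kahan_sum(xs)
    s = 0.0
    c = 0.0              # running compensation for lost low-order bits
    for x in xs
        y = x - c
        t = s + y
        c = (t - s) - y  # the term a fast-math compiler may cancel
        s = t
    end
    return s
end
```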
The whole point of @fastmath is to use floating point trickery for more speed. It doesn't even apply to non-floating-point operations. One consequence of that trickery is arbitrary loss of precision, which you can't avoid once you assume floating point ops are associative, no matter how sparingly you apply that assumption. Really, modes like -Ofast in Fortran or C were a well-meant feature for floating point code that has horrid consequences in a language like julia, where code almost always gets compiled wholesale through all calls, i.e. without calling already-compiled code from a dynamically linked library.
Yes, but I'm talking about gcc flags like -Ofast, which only affect the current compilation unit and don't affect linked library code (which is already compiled, unlike the vast majority of linked julia code; here it's far more common to compile the code of dependencies as well), as far as I'm aware.
So I'd argue that the library not resetting the flag was faulty behavior, and that a library compiled with -Ofast shouldn't affect the caller either way (especially not influence how julia compiles its own code, which is the topic here, as opposed to getting incorrect results from correctly compiled code, as in that issue).