What's going on with exp() and --math-mode=fast?

I was surprised to see results that differ significantly (by much more than I would expect from normal -ffast-math optimizations) when running julia with --math-mode=fast.

It seems like a very rough approximation is being used for exp() (~0.1% error for exp(1.0)). Is this documented anywhere? And why don’t I see the same behavior with @fastmath or when calling Base.Math.exp_fast?

$ ~/julia/bin/julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.8.0-DEV.178 (2021-07-14)
 _/ |\__'_|_|_|\__'_|  |  Commit 696cb1ba7b (0 days old master)
|__/                   |

julia> exp(1.0)
2.718281828459045

julia> @fastmath exp(1.0)
2.718281828459045

julia> Base.Math.exp_fast(1.0)
2.718281828459045

$ ~/julia/bin/julia --math-mode=fast
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.8.0-DEV.178 (2021-07-14)
 _/ |\__'_|_|_|\__'_|  |  Commit 696cb1ba7b (0 days old master)
|__/                   |

julia> exp(1.0)
2.7158546124258023
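The relative error between the two results above works out to roughly 0.1% (a quick check using the exact values printed by the two sessions):

```julia
# Values copied from the two REPL sessions above
accurate = 2.718281828459045    # exp(1.0), default mode
fast     = 2.7158546124258023   # exp(1.0) under --math-mode=fast

rel_err = abs(fast - accurate) / accurate
@assert 8e-4 < rel_err < 1e-3   # ≈ 8.9e-4, i.e. about 0.1%
```

For comparison, a correctly rounded exp has relative error on the order of 1e-16.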

I don’t know, but try doing the @fastmath call first? Maybe the compiler knows the result should be the same and just reuses it?

This is old (before Julia 1.0), but I don’t know if the situation has changed. If not, that would explain the difference you observed:

The beginning of the code for exp looks something like:

const a = 369.3299304675746    # ≈ 256 / log(2)
const b = 6.755399441055744e15 # 1.5 * 2^52, a "magic" rounding constant

function f(x::Float64)
    z = muladd(x, a, b)  # forces x*a to round to the nearest integer
    z -= b               # recover round(x*a); this is the return value
end

Without --math-mode=fast this gives the result 369.0; with it, it gives 369.3299304675746. Looking at the generated code, in fast mode it has simplified this code to x*a:

julia> @code_llvm f(1.0)
...
  %1 = fmul fast double %0, 0x40771547652B82FE
  ret double %1
}

while it otherwise computes things as given:

julia> @code_llvm f(1.0)
...
  %1 = fmul contract double %0, 0x40771547652B82FE
  %2 = fadd contract double %1, 0x4338000000000000
  %3 = fadd double %2, 0xC338000000000000
  ret double %3
}

There can be an arbitrary error in the final result from such a change.
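For context, the muladd with those constants is a standard round-to-nearest trick: b = 1.5·2^52 sits in the range where the spacing between adjacent Float64 values is exactly 1.0, so adding it rounds x*a to the nearest integer, and subtracting it back recovers that integer. A minimal sketch of the trick in isolation:

```julia
const MAGIC = 6.755399441055744e15  # 1.5 * 2^52

# Adding MAGIC pushes the value into a range where the Float64 ulp
# is exactly 1.0, so the addition rounds to the nearest integer;
# subtracting MAGIC recovers that integer.
round_via_magic(x) = (x + MAGIC) - MAGIC

@assert round_via_magic(369.3299304675746) == 369.0
@assert round_via_magic(-2.7) == -3.0
```

Fast math is free to reassociate (x*a + b) - b into x*a, which deletes exactly this rounding step.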

That seems unfortunate. Is there any way to add something like @nofastmath to julia’s implementation of exp()?

I see a similar constant and muladd in the glibc implementation, but that generally won’t be compiled with -ffast-math. Also, for some reason, julia’s implementation is 24% slower than glibc’s, even with the muladd elision.

Why are we allowing changing a muladd to a mul? That just seems totally broken.

I mean, the whole point of fast-math is trading correctness for speed. If fast-math always gave the correct results, it wouldn’t be fast-math, it would be the standard way of doing math.

Right, but generally when I use fast-math, I can reason about my own code and conclude that doing something like swapping the order of some operations will have a minimal impact on my results.

But here we have a standard library function that’s severely affected to the point that it’s unusable for many purposes. And many people who aren’t familiar with julia’s internals may assume that it’s calling glibc math functions, as most languages do.

These errors are orders of magnitude worse than what one would get when doing other common operations with numbers in a reasonable range:

julia> log(exp(1.0))
0.9991066782289837

julia> log(exp(1e-3))
0.0
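For comparison, without --math-mode=fast both round-trips are accurate to machine precision:

```julia
# Default math mode: log ∘ exp round-trips to within a few ulps
@assert isapprox(log(exp(1.0)), 1.0; rtol = 1e-15)
@assert isapprox(log(exp(1e-3)), 1e-3; rtol = 1e-12)
```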

It simplifies

z = x * a + b # z = muladd(x, a, b)
z -= b

to

z = x * a

Oh, that actually should be fixable. Rewriting that to unsafe_trunc should retain almost all the performance and remove the ability of fastmath to mess it up.
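One hedged sketch of what that could look like (illustrative only, not Julia’s actual implementation; `unsafe_trunc` assumes its argument fits in the target integer type):

```julia
# Round-to-nearest built from truncation: shift by ±0.5 toward the
# nearest integer, then truncate toward zero. Plain arithmetic like
# this offers no (x*a + b) - b pattern for fastmath to reassociate.
# (Ties round away from zero, unlike the default RoundNearest.)
round_nearest_int(x::Float64) = unsafe_trunc(Int, x + copysign(0.5, x))

@assert round_nearest_int(369.3299304675746) == 369
@assert round_nearest_int(-2.7) == -3
```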

More generally, however, I don’t think @fastmath should be applied to inlined code from other functions. There are just way too many pitfalls there.

Basically, @fastmath is used by people who think “this code is insensitive to rearrangements that are exact in exact arithmetic (e.g. re-association)”, but it is not reasonable to ask anyone to make that judgement for code they can’t even see (because it is inlined from somewhere else).

Update: filed don't apply `@fastmath` to inlined function bodies · Issue #41582 · JuliaLang/julia · GitHub. Sorry, my misunderstanding: it seems that @fastmath already doesn’t apply to inlined code.

That’s why we should rename it to @unsafemath.

Yes, fast math doesn’t apply to inlined code (which also makes it more annoying to use).

I think the general agreement is that math-mode=fast is currently massively broken, as this thread shows, and should never be used. I would imagine the view of core devs is “why is this horrid thing still here, let’s kill it”, which is a pity because it could actually be useful with some care. For that to happen, we need a @nofastmath macro and to apply it judiciously to functions that do floating point trickery (e.g. special summations, implementations of special functions, etc.). If this is done, it should be relatively OK, in the sense that all bets are still off and arbitrary loss of precision is always possible, but it would have a pretty good chance of not losing too many digits in practice. One thing I’m not sure about is how that interacts with precompilation, though.

unsafe_trunc uses RoundToZero, which causes more numerical errors.
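To illustrate the rounding-mode difference:

```julia
# unsafe_trunc always rounds toward zero
@assert unsafe_trunc(Int, 1.9) == 1
@assert unsafe_trunc(Int, -1.9) == -1

# round-to-nearest, the behavior exp's range reduction relies on
@assert round(Int, 1.9) == 2
@assert round(Int, -1.9) == -2
```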

BTW, even if you are not using fastmath, LLVM may replace a muladd with a mul.

The whole point of @fastmath is to use floating point trickery for more speed. It doesn’t even apply to non-floating point operations. One consequence of that trickery is arbitrary loss of precision, which you can’t avoid once you assume floating point ops are associative, no matter how sparingly you apply it. Really, modes like -Ofast in Fortran or C were a well-meant feature with horrid consequences in a language like julia, where code almost always gets compiled wholesale through all calls, i.e. without calling already-compiled code from a dynamically linked library.

Options like -Ofast can affect global states. That can even happen upon loading a shared library and without actually calling any functions in it: benchmarking ("exponent", "subnorm", "Float32") produced the DomainError when ApproxFun package is also used · Issue #253 · JuliaCI/BaseBenchmarks.jl · GitHub

Also note that setting the math-mode=fast as a startup flag to Julia is much more invasive than locally using fastmath. julia/intrinsics.cpp at 23dbf169f0e4b831adfd20d666bedd1f669527ca · JuliaLang/julia · GitHub

Yes, but I’m talking about gcc flags like -Ofast, which only affect the current compilation unit and don’t affect linked library code (which is already compiled, unlike the vast majority of linked julia code - here it’s far more common to compile the code of dependencies as well), as far as I’m aware.

Me too :slightly_smiling_face:

Did you enjoy reading the whole debugging trip in the bug report I linked? :slightly_smiling_face:

Yes, I’ve read that thread before - I have the (sometimes useless) ability to read way too many interesting issues :slight_smile:

Let me quote you right back at yourself though:

From Denormal number - Wikipedia

well-behaved software should save and restore the denormal mode before returning to the caller or calling out to unsuspecting library/OS code

From here.

So I’d argue that the library not resetting the flag was faulty behavior and a library compiled with -Ofast shouldn’t affect the caller either way :stuck_out_tongue: (especially not influence how julia compiles its own code, which is the topic here, instead of getting incorrect results from correctly compiled code, as in that issue)