How to tell VS Code to use the fast math option?

But it’s not just stdlib code—any usage of floating point that depends on IEEE standard behavior (intentionally or not), in any package or project, would then need to defensively use @nofastmath just to guarantee that some caller using this global flag doesn’t break their code. That seems entirely perverse. Worse still, since there’s no standard for fast math mode, LLVM could change what it does at any point, and there are no guarantees that code that worked with one version of LLVM will continue to work in the future.

In the other direction, things are much saner: code can by default rely on IEEE standard behavior, which, since it is standardized, cannot change willy-nilly when you upgrade something—the standard is respected by both compilers and hardware. In cases where some high-performance code needs to indicate that it wants to bend a particular rule, we have ways to let the author indicate that in a well-scoped way. You get both speed and sanity. This is how we’ve been doing things for years now and it works well.

Bottom line: --math-mode=fast should probably be deprecated or at least print a dire warning when you start Julia. The @fastmath macro is fine because it is limited in scope to the code where you explicitly use it.
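
As a concrete illustration of that scoping, here is a minimal sketch (the function is made up for the example): the opt-in covers exactly one loop, and everything outside it keeps full IEEE semantics.

function sum_reassoc(xs)
    s = 0.0
    # @fastmath rewrites only the float ops inside this loop to their
    # fast variants, allowing the reduction to be reassociated/vectorized
    @fastmath for x in xs
        s += x
    end
    s
end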

2 Likes

Of course what you’re saying is correct, and it’s infinitely simpler from the perspective of Base to just remove --math-mode=fast. However, this means that fastmath improvements have to be decided when writing the code, not when using it. If I’m writing a library for optimization (say), should I use @fastmath to get faster but non-IEEE-compliant code? I shouldn’t have to decide that; the user should. --math-mode=fast allows the user to make that choice globally, like people do in typical compiled languages. Is it pretty? No. Does it work? In theory, no. In practice, mostly, except for a few things that look like they could be taken care of by adding two or three annotations.

Here’s a motivating example:

using BenchmarkTools

# Plain loop: IEEE semantics, so the compiler may not reassociate the sum
function mydot(x, y)
    s = 0.0
    @inbounds for i = 1:length(x)
        s += x[i] * y[i]
    end
    s
end

# Same loop, but @fastmath on the definition rewrites the body's float ops
# to their fast (reassociable) variants, so the reduction can vectorize
@fastmath function mydot_fastmath(x, y)
    s = 0.0
    @inbounds for i = 1:length(x)
        s += x[i] * y[i]
    end
    s
end

# @fastmath here only affects this body; it does not propagate into mydot
@fastmath function user_dot(x, y)
    mydot(x, y)
end

x = randn(100000)
y = randn(100000)
@btime mydot($x, $y)
@btime mydot_fastmath($x, $y)
@btime user_dot($x, $y)

user_dot is only fast under --math-mode=fast. To make it fast without the flag, I would have to modify mydot (which might be library code I don’t want to touch).

Bottom line: not that much code needs to be IEEE-correct (although @kristoffer.carlsson does have a point about NaN handling). A lot of code should be fast. Making that code fast without requiring library authors to decide whether to use fastmath is definitely a hack that has caused and will cause bugs for people using --math-mode=fast, but I’m not sure I see the point in forbidding it. Why not just add a @nofastmath macro, document clearly that people who use --math-mode=fast are on their own and should not expect Julia devs to fix the bugs that will inevitably appear, and let them submit PRs adding @nofastmath to Base and libraries when they find code that behaves strangely under fastmath?

No, 100% disagree. Only the library author is in a position to know whether their algorithm can safely allow reassociation and other non-IEEE simplifications. If it can, they can either do the transformations manually or use the appropriate annotation to give the compiler permission. The user is in no position to know whether the code they are calling is safe to apply non-IEEE transformations to.

Your example only shows that the author of mydot probably wants a @simd annotation. The fact that the user can’t override that is not a problem; it’s a good thing. If the user notices this and wants mydot to be faster, they can dev the package, edit it, and make a PR. If the change is considered correct by someone who understands the intention of the code, it will be merged, and then everyone benefits instead of just the people willing to risk getting total garbage answers for the sake of code running a few percent faster.
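
To make that concrete, here is roughly what the annotated version would look like, reusing x and y from the example above (a sketch, not necessarily the optimal formulation):

function mydot_simd(x, y)
    s = 0.0
    # @simd explicitly licenses reassociating this reduction, which is
    # essentially the only non-IEEE freedom the @fastmath version exploited
    @inbounds @simd for i in eachindex(x, y)
        s += x[i] * y[i]
    end
    s
end

@btime mydot_simd($x, $y)  # should roughly match mydot_fastmath, with no global flag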

2 Likes
 ________________________________________
/ You have enabled fast math             \
| optimizations globally. If you know    |
| what you are doing, you probably would |
\ not have done it.                      /
 ----------------------------------------
        \   ^__^
         \  (@@)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
15 Likes

I’m not sure I agree, because users typically use a debug/release workflow. For numerical code, I would prototype the algorithm, in which case I want to minimize sources of confusion such as non-IEEE arithmetic and NaN handling, then when I’m convinced it works correctly run it on a larger case, where I don’t care about correct NaN handling (because I’ve made sure that I don’t produce NaNs), and where I would like to run a few percent faster. People have been doing that for decades in compiled languages (you debug with -O0, and run with -O3 -ffast-math). Sane library authors, forced to choose between nothing and @fastmath module MyLib (because, let’s be honest, nobody is going to bother writing and maintaining more fine-grained annotations), potentially risking tricky bugs and bad error reporting if a user function somewhere returns a NaN or something, will err on the side of caution and never enable fastmath, and as a result speed improvements will be left on the table. I’m not saying that a global toggle is the best way to achieve this (possibly a debug/release mechanism would be appropriate?), but at least it’s there.

Here’s the thing though: compiled languages with separate compilation allow compiling one library without fast math and others with it. That lets you use fast math for just your code while calling libm, FFTW, or whatever, compiled with IEEE math. That’s not how things work in Julia: we don’t do separate compilation, and generally can’t, since we regularly inline code from one place into another. It might be possible to make it work, but it would be a lot of work for a pretty questionable feature. Also note that it would not give you what you want anyway, which is being able to force the compilation of library functions with fast math. You probably aren’t getting that in C or Fortran either.

1 Like

Brilliant!

This strikes me as a wildly bad idea if you care about your code working correctly. Do you at least have a comprehensive test suite and run it in -O3 -ffast-math mode to see if anything breaks?

Sane library authors, forced to choose between nothing and @fastmath module MyLib (because, let’s be honest, nobody is going to bother writing and maintaining more fine-grained annotations), potentially risking tricky bugs and bad error reporting if a user function somewhere returns a NaN or something, will err on the side of caution and never enable fastmath, and as a result speed improvements will be left on the table.

Except that’s not the choice. Julia gives you all the targeted compiler hints you need to manually accomplish what fast math does without giving up on predictability and correctness. Package authors can and do use explicit algebraic simplification, @simd, code generation macros, and the occasional local @fastmath to make sure libraries run at optimal speed. All you accomplish by turning on a global fast math flag is pulling the rug out from under those carefully tuned libraries, which often already live on the fine edge of correctness guaranteed by IEEE. See all the examples on the Julia repo of global fast math mode breaking already-well-optimized algorithms.
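
To make “targeted compiler hints” concrete, here is a hedged sketch of the kind of thing meant (the polynomial is an arbitrary example, not from any particular package):

# muladd grants permission to fuse each multiply-add into a single fma
# where the hardware supports it, without licensing any other non-IEEE
# rewrite anywhere else in the program
horner3(x, c0, c1, c2, c3) = muladd(x, muladd(x, muladd(x, c3, c2), c1), c0)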

3 Likes

It seems to me (but I could be wrong) that there’s not that much code that breaks under fastmath. As an example, I just tried running Julia under --math-mode=fast and running the tests for Optim.jl (including the runtests.jl file directly); the failures seemed relatively minor (results differed by small errors and were counted as failures).
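
For anyone wanting to reproduce this, the experiment was presumably something like the following (the path is illustrative; if I recall correctly, Pkg.test spawns a fresh Julia process with its own flags, which is likely why runtests.jl was included directly):

# from a checkout of Optim.jl
julia --math-mode=fast test/runtests.jl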

I think we are working with a very different mental model of users. You seem to be thinking about large, established codes developed by people with intimate knowledge of IEEE standards. I’m talking about small to mid-sized codes (from month-long exploratory projects to that code that survives a grad student and gets passed around in a lab for a few years), that are designed by application scientists with hazy (at best) knowledge of the effects of floating point errors. Talking about “at least having a comprehensive test suite” in this context is a non-starter. That has never stopped anybody from including -ffast-math in their makefiles, usually without any problems.

Again, whether a package needs @fastmath or not may depend not on the package intrinsically, but on how it is used. When I use an optimization package to develop an algorithm, I want it to behave as nicely as possible (e.g. because I may mess up my objective function and return a NaN, which I want caught as early as possible). Then, when I do a “production” run, I’ve already made sure that no NaNs will be produced, and I just want it to be fast.
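
A hedged sketch of the debug-mode guard I mean (the names are made up):

# During prototyping (no fast math), NaNs propagate per IEEE and this check
# fires reliably; under --math-mode=fast the compiler may assume no NaNs
# exist and fold isnan away, silently skipping the check
function checked_objective(f, x)
    fx = f(x)
    isnan(fx) && error("objective returned NaN at x = $x")
    return fx
end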

I probably won’t be able to convince you of the validity of this use case. I do think the option should be left there (with a severe warning that pretty much nothing is guaranteed) and that people interested in using it should be allowed to try to fix problems arising from it by submitting PRs that add @nofastmath to the Base code that needs it. If in two years nobody has bothered to, then that’s the answer: nobody is using the feature and it can be removed. But removing it at this stage of the language’s development seems premature.

Sorry for the somewhat epistemological question, but how did they know without a test suite?

2 Likes

We could just leave it, but it’s a hard sell to implement the @nofastmath feature just to support a global flag that we probably shouldn’t have in the first place. I do think that users deserve some warning when they use an option that could quite possibly render their computations total nonsense in very unexpected ways. So maybe @Tamas_Papp’s banner should be added.

2 Likes

I must insist on the cow’s eyes being red.

4 Likes

The option --math-mode=fast is “dangerous”, but what about -O3? Is it also dangerous to use?

No. (Also, since the SLP vectorizer is included at -O2, I haven’t seen any benchmark where -O3 is better than the default.)

1 Like

How can I know what option is being used from within my Julia session?
What optimization option is used if I launch Julia from VS Code?

Base.JLOptions().opt_level
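
For example, from the REPL (the values shown are the defaults; fast_math is an internal field whose encoding could change between versions):

julia> Base.JLOptions().opt_level
2

julia> Base.JLOptions().fast_math  # 1 under --math-mode=fast, per Base internals
0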

2 Likes

OK, thanks, VS Code says it’s using -O 2.

That’s the default.

Based on this thread and others, I have a feeling you may be fixating a bit too much on compiler flags. Julia already defaults to good performance—you should not need to mess around with optimization flags unless you are doing something pretty unusual. Frankly, the --math-mode=fast flag is a bad idea and you shouldn’t use it at all (sorry, @antoine-levitt, but that’s my honest opinion), and -O3 is basically the same as -O2, which is already the default. These options are not a magic wand to make your code faster—even when they help, it’s only by a very small margin. If your code is not fast enough, instead of messing with compiler flags you should profile it, look at allocation stats, make sure it’s type stable, make sure you don’t use non-constant globals, and put @inline and @simd annotations where appropriate. Or better still, use better algorithms—that’s where the most dramatic speed improvements come from. In all my time doing Julia consulting work helping people speed up their code, I have never once changed any compiler flags.
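
For reference, the standard toolkit for that checklist looks something like this, where f and x stand in for your own function and input:

using Profile

@time f(x)            # wall time plus allocation count
@allocated f(x)       # allocations alone
@code_warntype f(x)   # Any/Union entries flag type instability

Profile.clear()
@profile f(x)
Profile.print()       # shows where the time actually goes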

4 Likes

FWIW, I almost completely agree: --math-mode=fast is the last resort of the last resort, after thinking about the problem, the algorithm, the structure of the code, the code itself, and then the micro-optimizations, all with copious profiling. It’s just that sometimes you’ve done all that (or decided you’ve done enough and further improvements would be too much of a bother), you play around with flags for 5 minutes (either of the safe variety like -O3 or of the frankly dangerous variety like --math-mode=fast), you’ve made sure they don’t significantly change the result, and they give you an extra 10% with no modification to your code. It seems a shame to let that go.