Division by zero runs without warning -> complicates finding bugs

Heyy :slight_smile:

Not sure if this is the right place to post this.

Also not sure if this is a valid complaint, that’s why I’m posting here to get other peoples opinion.

This post is coming from a place of frustation, as I just spent 4 hours trying to find a bug in the calculation of errors for a large dataset of random variables. Turns out I have a weird edge case which causes the sample size for multiple of these variables to be just one, resulting in the calculation of the sample variance to include a sneaky division by zero, which results in the errors being infinitely large, which gets plotted as ‘no error at all’.

I am currently using Julia for my thesis, with a background in C/C++ and Python.

With my previous programming experiences, ‘Division by Zero’ would usually be throwing some kind of error or warning.

In Julia, this seems to not be the case.

Compare the output of this code in Python:

a = 100
b = 0
c= a/b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero

to the output in Julia:

a = 100
b = 0
c = a/b
Inf

For me, personally, this is really confusing and unexpected behaviour. I would expect at least a warning to be thrown.

How do you feel about this? Am I missing a very good reason for this to be the default behaviour?

3 Likes

/ in Julia is floating-point division (unlike C/C++), and the IEEE 754 standard says that floating-point division of a nonzero (and non-nan) ±value by +0 gives ±Inf. (Though it would be nice to have a way to trap fp exceptions (julia issue #27705).)

In C/C++, you will also get Inf with floating-point division like 100/0.0. It’s just that 100/0 in C/C++ is truncated integer division (3/2 == 1 in C/C++), not floating-point division, analogous to Julia’s div:

julia> div(100,0)
ERROR: DivideError: integer division error

which also throws an error.

Even in Python, this produces inf with NumPy:

>>> np.int64(100)/0
inf

I’m guessing that the reason 100/0 doesn’t produce inf in Python is a legacy of Python2, where integer / integer was truncated integer division like C/C++, not floating-point division.

In other languages where / is floating-point division, the examples I can find also typically seem to follow the IEEE 754 behavior of producing Inf from division by 0, similar to Julia. For example, R, Javascript, Matlab …

23 Likes

Seems like the really objectionable thing here is plotting infinite error as if it was zero error…

10 Likes

Your Python example is a bit suspect, because if you’re indeed dealing with “a large dataset of random variables” then unlikely that integer a = 100 would be a realistic numerator. Most people do computations with numpy. Expanding on @stevengj, Python can be inconsistent:

>>> np.Float64(100) / np.Float64(0) # no error
inf
>>> np.Float64(100) / 0 # no error
inf
>>> 100.0 / 0.0 # error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: float division by zero
>>> 100.0 / 0 # error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero

How were you plotting this? Even with matplotlib.pyplot.errorbar, an infinite standard deviation gets plotted as ‘no error at all’. You wouldn’t have been saved with numpy and matplotlib.

Nevertheless, @StefanKarpinski makes a good point, perhaps plotting packages should show infinite error bars as something explicitly weird, since there is already NaN for no error bar.

1 Like

Note that in Julia we also have div or ÷ for Euclidean integer division.

julia> 100÷3
33

julia> 100÷0
ERROR: DivideError: integer division error
Stacktrace:
 [1] div(x::Int64, y::Int64)
   @ Base ./int.jl:295
 [2] top-level scope
   @ REPL[33]:1

julia> Base.checked_div(5,0)
ERROR: DivideError: integer division error
Stacktrace:
 [1] div
   @ Base ./int.jl:295 [inlined]
 [2] checked_div(x::Int64, y::Int64)
   @ Base.Checked ./checked.jl:323
 [3] top-level scope
   @ REPL[34]:1

3 Likes

At least we can track them

4 Likes

I find it quite disturbing that this is inconsistent within Python. Numpy is the de-facto numerical tool for Python, so this looks like a gigantic footgun…

1 Like

The behavior is configurable with numpy.seterr.

In mine, without configuring anything that I remember, I get a warning for np.float64(1)/0.

2 Likes

Thanks, but it doesn’t really help much, it’s a footgun you don’t know about until well after it goes off. And anyway it’s the default numpy behaviour that is correct/standard, while the python default behaviour isn’t.

3 Likes

I checked 754 and it does indeed say

A function that has a simple pole for some finite floating-point operand shall signal the divideByZero exception and return an infinity by default.

If the behavior you’re concerned about is normal Python division throwing ZeroDivisionError, it seems like the user would know about that immediately, no?

2 Likes

They wouldn’t know about it before it happens. I’m just saying it’s strange that this is not consistent between regular python and numpy. Just like the OP, users will get used to getting an error, then be mystified by Inf values when using numpy/Julia/Matlab/C/R/etc…

4 Likes

I can’t decide whether Python default math was a disaster or genius move. Infinite integer precision seems like a nice convenience, but certainly led to wrong expectations for floats, numpy, Julia, Matlab, etc. And convenient for what, aside from a few Project Euler problems, where big numbers may be helpful but not when they’re slow? Maybe convenient when someone wants to casually type 2**100 into the terminal and not be surprised?

Lack of a proper numeric array made Python pretty bad for math until numpy sorted out numeric vs numarray, although still somewhat annoying today to have to convert between arrays. But nowadays people are accustomed to a bunch of different array types, and one can expect to transparently do math with pytorch or jax or polars and it kind of works the same, and few complain anymore when it doesn’t. It’s really extraordinary how far it’s come since numpy, and mostly open source.

(BTW I don’t mean to complain about OP’s post, which is a legitimate question, and helped some of us point the finger at how plotting packages handle inf.)

4 Likes

It’s a valid concern, still allowed by the IEEE standard, by default, that Julia follows.

I think it makes sense to allow:

julia> Base.checked_div(100.0, 0.0)  # i.e. define for floating point too, any combination of such types; and integers
ERROR: MethodError: no method matching checked_div(::Float64, ::Float64)

It wouldn’t help with the foot-gun, but there’s an old draft PR available to allow regular division to check for exceptions and it would if enabled, i.e. seemingly in effect would use the above implicitly (meaning do a divide, return the finite result, OR throw an exception):

[In effect we would have that, and we actually still have that option since we can call NumPy to do calculations, with e.g. PythonCall.jl.]

I think Julia debug mode should set such (also possibly with an ENV var), plus the option of explicitly set in Julia code, overriding the other options.

Speaking of, for integers (div is way slower than it could be with; types such as Rust’s, but that shifts the burden to other operators… fully or only partially?):

@code_native Base.checked_div(100, 0)  # gets be exact same code as for div or 100

Rust has some intriguing types such as simply disallowing zero (or also positive) to prevent division by zero:

NonZeroU128 is guaranteed to have the same layout and bit validity as u128 with the exception that 0 is not a valid instance. Option<NonZeroU128> is guaranteed to be compatible with u128, including in FFI.

[It’s wasteful to have a sign bit for floating point, and also to allow zero, especially for Float8 (now standardized), and FP4 and FP6, and while was useful for neural networks (those types now outdated already), I think Float8 types are like a happy medium for measurements (in my system only for inputs, Float64 basically for calculations), in my type I’m designing, effectively similar to otherwise 10-bit, since you’ve never measured anything negative, nor zero.]

The types for NonZero (in Rust, and my type) are opt-in, so not preventing the foot-gun.

No?! It’s such a division, unless you divide by zero, then undefined behavior (or at least implementation defined), just as it’s undefined in math, just worse in C/C++:

The Python behavior is costly (without the mentioned types or compiler support), but it is better for debugging, and this is just one more reason I can think of that Julia should have a debug mode, where division by zero is impossible, i.e. to get an Inf (or NaN actually for 0.0 / 0.0) back. That seems same enough to prevents errors, but actually those are wrong;

julia> (0 / 0) ^ 0.0  # also wrong: 1.0 ^ (1 / 0)
1.0
  divrem(x, y, r::RoundingMode=RoundToZero)

  The quotient and remainder from Euclidean division. Equivalent to (div(x, y, r), rem(x, y, r)). Equivalently, with the default value of r, this call is equivalent to (x ÷ y, x % y).

julia> Base.divrem(100, 0)
ERROR: DivideError: integer division error

does it rather make sense to return (maybe with optional keyword argument?):
julia> Base.divrem(100, 1)
(0, 100)

Its bignum default has I think nothing to do with it checking for division by zero per se, that comes from Python 2, when bignum was I think not used. It does yes, set an unrealistic standard for Julia and NumPy and others, because bignums are slow, and we want to be fast by default.

[For ^ I think Julia made a mistake, another foot-gun, for returned datatype, since then, and only then bignums or at least overflow detection or simply returning Float64 would have helped. In other cases bignums are an overkill, be default, costly in Julia, though less costly in Python…, but more costly than Julia’s (machine) integers.]

3 Likes

Hey everyone!

Thank you so much for all the super helpful and interesting answers. Turns out I know way less about div operations than I thought I did :smiley:

I have a lot of reading up to do and I appreciate the discourse this has spawned.

I will let my thread stand so maybe other people that have the same question can find it in the future.

Thank you so much <3

11 Likes

We have to clear up some misinformation here. For me, base python warns every time, and numpy warns only the first time in a session:

❯ python
Python 3.12.1 | packaged by conda-forge | (main, Dec 23 2023, 08:03:24) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 1/0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
>>> 1.0/0.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: float division by zero
>>> import numpy as np
>>> np.int64(1)/np.int64(0)
<stdin>:1: RuntimeWarning: divide by zero encountered in scalar divide
inf
>>> np.int64(1)/np.int64(0)
inf
>>> np.float64(1)/np.float64(0)
inf
>>> 
❯ python
Python 3.12.1 | packaged by conda-forge | (main, Dec 23 2023, 08:03:24) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.float64(1)/np.float64(0)
<stdin>:1: RuntimeWarning: divide by zero encountered in scalar divide
inf
>>> np.float64(1)/np.float64(0)
inf

❯ python
Python 3.12.1 | packaged by conda-forge | (main, Dec 23 2023, 08:03:24) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.float64(1)/0
<stdin>:1: RuntimeWarning: divide by zero encountered in scalar divide
inf
>>> np.float64(1)/0
inf

So both the examples above:

>>> np.Float64(100) / np.Float64(0) # no error
inf
>>> np.Float64(100) / 0 # no error
inf

do first of all not run because np.Float64 is not defined:

>>> np.Float64(100) / np.Float64(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dennis/miniconda3/envs/MasterProject/lib/python3.12/site-packages/numpy/__init__.py", line 333, in __getattr__
    raise AttributeError("module {!r} has no attribute "
AttributeError: module 'numpy' has no attribute 'Float64'. Did you mean: 'float64'?

and secondly do print a warning (the first time in a session):

>>> np.float64(100) / np.float64(0)
<stdin>:1: RuntimeWarning: divide by zero encountered in scalar divide
inf

All in all I find this behaviour much preferrable to Julia’s silent return of Inf. Furthermore, if it is indeed true that the 754 standard states that

, I would say that Julia in fact violates the standard!

The clause you curiously omitted from your quote, “and return an infinity by default”, makes it clear that IEEE-754’s signalling of exceptions is not the same as throwing exceptions on the level of Julia or other programming languages. Neither meanings of “exception” are the same as NumPy’s printed warnings, as delineated by numpy.seterr. Julia lacks an equivalent, and the relevant issue #27705 was linked earlier.

The real reason that Python “unusually” throws an error on floating point division by zero is that it was created when the IEEE 754 standard wasn’t close to universal, and nobody attempted the massive breaking change to fall in line. Until v3.11, IEEE-754 floating points weren’t even required to build Python, and floatobject.c still checks if the hardware is IEEE-754 compliant. The various standards weren’t necessarily far from each other, and Python is still close. v3.11 did change math.pow to be compliant with IEEE-754 in the case of ±0 ^ -inf where an output of +inf is demanded with no signalled exceptions.

Aside: base Python does silently return inf if you divide by an almost zero. On v3.12.0:

>>> 1.0/1e-308
1e+308
>>> 1.0/1e-309
inf

In practice, any value very close to 0 or infinity needs to be checked, it doesn’t really matter if it’s exactly 0.0 or inf.

4 Likes

For those who are curious, the entire line from the spec is:

divideByZero: A function that has a simple pole for some finite floating-point operand shall signal the divideByZero exception and return an infinity by default.

IEEE 754 has half-baked support for hardware exceptions like signaling NaNs and divide by zero. When the spec was initially published the interested parties couldn’t agree on whether to have hardware exceptions. IIRC, the numerical analysis people wanted exceptions because it’s nice to know when you get an infinity or a NaN, whereas the hardware manufacturers were against exceptions because expensive to implement and slows things down. So we got a worst of both worlds compromise where there’s two kinds of NaN—quiet and signaling—with support for signaling NaNs being optional. As a result, the support for exceptions is half-assed at best, and no compiler toolchains actually support them well, because the hardware designs are so bad (global CPU flags :grimacing:); since the hardware and tool support is bad, languages don’t generally even try to use hardware exceptions, which means there’s no demand for the tool chain support to get better.

Which leaves us in the current state of affairs: hardware support for floating-point exceptions is so bad as to be effectively unusable from a high-level language without slowing code down. But as it turns out, hardware exceptions are pretty incompatible with SIMD vectorization anyway as well as being bad for other forms of concurrency like multithreading—having to constantly guard against code throwing an exception and throwing control elsewhere is bad but it’s even worse when there’s multiple concurrent threads of control. CPUs have bad support for hardware exceptions, but GPUs don’t support them at all. And that’s generally the direction things are moving: away from hardware floating-point exceptions, not towards them. Yes, getting a “divide by zero” error is nice from a user perspective, but it’s not something the IEEE 754 spec mandates (although it does allow it), it’s also not something the compiler tooling supports well, and it’s increasingly not something that hardware supports.

14 Likes

Oh, and in case it’s unclear why hardware exceptions are relevant, the alternative is emitting actual instructions after every single floating-point operation to check if it produced an infinity or a NaN and then explicitly throwing an exception in software if they did. This would have absolutely horrible performance implications… lots of extra instructions, potential exceptions thrown everywhere, blocking vectorization everywhere. Hardware exceptions avoid the explicit instructions, but still have the other drawbacks, namely that every floating-point operation becomes a potential exception point and no functions that do any floating-point operations can be truly pure. Basically, the anti-exception camp was right—having floating-point operations throw is a bad idea.

7 Likes

Thanks for both replies, that is insightful.

But when you do have a problem, you go back and look for it, at which point performance is not so important. Fortunately, with Julia you can usually plug in a different type that acts just like Float64 or whatever, except that it checks for bad conditions. @brianguenter suggested a NaNCheck type here, which I’ve used to track down NaNs in my code.(*) I keep meaning to put that together as a little package, because there have been a few times when this issue has crept up.

Along the same lines is SaferIntegers.jl, which may be more specifically relevant to the OP.


(*) I’ve also found it useful to initialize arrays with NaNCheck{Float64}(NaN), so that I can find when I used uninitialized data.

2 Likes