Julia messes up integer exponents?

DNF · February 14, 2019, 8:49pm

It sounds like a reasonable solution. However, this means that integer overflow is now inconsistent. In some operations it might overflow, in others not. So this won’t be correct any longer:

BLI · February 14, 2019, 8:57pm

There are strange things like this in most languages until you learn how to use the language. I tended to use tf as the symbol for final time in MATLAB simulations/ODE solvers, and got the weirdest errors. I wasted some time before I figured out that, because I had the Control Toolbox installed, MATLAB interpreted my intended floating-point quantity tf as a transfer function object.

As you have discovered, 6.023*10^23 in Julia means the floating-point number 6.023 multiplied by the integer 10^23 (because, as in all languages, power takes precedence over multiplication). But 10^23 is too big to be represented correctly as a normal 64-bit integer, so it wraps around to another integer, in a way similar to taking the modulus of numbers:

julia> mod(2,3)
2
julia> mod(3,3)
0
julia> mod(4,3)
1
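
The same wraparound can be checked directly for 10^23. A minimal sketch, assuming 64-bit integers (Int64 is written out explicitly so the result does not depend on the machine's default Int):

```julia
# Int64 arithmetic wraps modulo 2^64 instead of raising an error.
wrapped = Int64(10)^23

# The exact value, computed with arbitrary-precision integers.
exact = big(10)^23

# Reducing the exact value modulo 2^64 reproduces the wrapped result
# (it happens to fall below 2^63, so it stays positive).
reduced = mod(exact, big(2)^64)

println(wrapped)             # 200376420520689664
println(reduced == wrapped)  # true
println(6.023 * wrapped)     # roughly 1.2e18, not 6.023e23
```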

isaacsas · February 14, 2019, 10:12pm

It was perhaps mentioned earlier, but if using 6.023e23 is hard to remember, you could simply be consistent in using floats for everything but the exponent.

julia> 6.023 * 10.0^23
6.022999999999999e23

foobar_lv2 · February 14, 2019, 11:31pm

The very mild guardrail would only concern expressions of literals. 10^(23) and 10^(23+0) are already handled in widely divergent ways (look at Meta.lower(Main, :(10^23))), and would not cause any different runtime behavior. It would only cause some existing code to produce compile errors.
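
For readers who want to look at the lowering themselves, here is a small sketch. It relies on the documented rule that `x^p` with a literal integer `p` is rewritten to `Base.literal_pow(^, x, Val(p))`. The printed form of the lowered code varies between Julia versions, so no output is shown for it.

```julia
# Literal exponent: lowered to a call of Base.literal_pow with Val(23).
lowered_literal = Meta.lower(Main, :(10^23))

# Non-literal exponent: lowered to an ordinary call of ^ on the value of 23 + 0.
lowered_computed = Meta.lower(Main, :(10^(23 + 0)))

println(lowered_literal)
println(lowered_computed)

# Both paths still wrap around for Int64, so the runtime result is the same.
via_literal_pow = Base.literal_pow(^, Int64(10), Val(23))
via_plain_pow = Int64(10)^(23 + 0)
println(via_literal_pow == via_plain_pow)  # true
```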

Keno’s much more radical idea is a very nice compromise: People who want to think about math integers get their nice ^ that protects them from themselves, and people who want to think in modular arithmetic need to use some Base.modular_pow.

StefanKarpinski · February 15, 2019, 12:14am

Did you read the entire post? It concludes that overflow checking should only cause a slowdown of a few percent, and in isolation it does. In realistic usage, however, overflow checking severely interferes with compiler optimizations, and without those optimizations code becomes many times slower, not just a few percent.

StefanKarpinski · February 15, 2019, 12:16am

Note, however, that this would be a breaking change. It’s perfectly possible that someone is currently relying on integer exponentiation operating in ℤ mod 2^64, and making that an error would break their code.

Jean_Michel · February 15, 2019, 12:24am

Why then does one find code such as

for (op,chop) in ((:+,:checked_add), (:-,:checked_sub),
                  (:rem,:rem), (:mod,:mod))
    @eval begin
        function ($op)(x::Rational, y::Rational)
            xd, yd = divgcd(x.den, y.den)
            Rational(($chop)(checked_mul(x.num,yd), checked_mul(y.num,xd)), checked_mul(x.den,yd))
        end
    end
end

function *(x::Rational, y::Rational)
    xn,yd = divgcd(x.num,y.den)
    xd,yn = divgcd(x.den,y.num)
    checked_mul(xn,yn) // checked_mul(xd,yd)
end

in Base? Does not the use of checked_mul

jkbest2 · February 15, 2019, 12:34am

Because those operate on Rationals, not Ints?

Jean_Michel · February 15, 2019, 12:35am

And so what?

jkbest2 · February 15, 2019, 12:39am

Rationals have a fairly limited use case, whereas (as I understand it) a lot of Base depends on Ints for things like array indexing, typically in cases where speed is a priority.

spaceLem · February 15, 2019, 12:46am

I suppose one could argue that in bounds checking, the default is safety, and you add @inbounds to say “I promise I’m not going to go out of bounds, compiler do your thing”, and so you could also have overflow checking as standard, and say @inlimits to turn off checking.

That said, given that I’m already expected to know what values my types support before I use them, and the sheer rarity of overflow errors that I’ve encountered, I’d really rather not have to litter my code with assurances that I’m being safe just to achieve decent performance.

Sorry, here I think performance clearly wins over safety. It’s not like the Julia documentation isn’t extremely up-front about how integers work, and if you really need that safety, it’s there if you want it.
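
As a sketch of what opting in to that safety can look like: `Base.Checked` provides checked counterparts of the basic integer operations, and wider or floating-point types avoid the overflow altogether. The helper name `safe_square` is made up for illustration.

```julia
using Base.Checked: checked_mul, mul_with_overflow

# Hypothetical helper: square an Int64, returning `nothing` on overflow.
function safe_square(x::Int64)
    try
        return checked_mul(x, x)   # throws OverflowError instead of wrapping
    catch err
        err isa OverflowError || rethrow()
        return nothing
    end
end

println(safe_square(Int64(3)))       # 9
println(safe_square(Int64(10)^10))   # nothing, since 10^20 does not fit in Int64

# mul_with_overflow reports overflow as a flag rather than an exception.
result, overflowed = mul_with_overflow(Int64(10)^10, Int64(10)^10)
println(overflowed)                  # true

# BigInt does not overflow here, and Float64 has enough range for 10^23.
println(big(10)^23)                  # 100000000000000000000000
println(10.0^23)
```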

Jean_Michel · February 15, 2019, 12:53am

Well, if Rationals are guaranteed against overflow, this is worth documenting in the manual, certainly.
If not, I do not really understand the design.

StefanKarpinski · February 15, 2019, 1:20am

Because rational arithmetic is

1. Already excruciatingly slow, and
2. Prone to overflow even with small values.

So there’s little downside and significant upside to checking for overflow. Using integers directly doesn’t overflow except in situations like this where someone should have been using floats or some other high range type. Also, rational arithmetic is not fundamental to anything. If rational arithmetic got ten times slower most people wouldn’t notice. Integer arithmetic is fundamental to everything. If integer arithmetic got even 10% slower, the villagers would be out with torches and pitchforks in no time. Let alone the disaster that would happen if we made all integer arithmetic checked and LLVM stopped being able to optimize anything.
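
The contrast can be seen in a short sketch, assuming 64-bit integers: plain `Int64` multiplication wraps silently, while the same product on `Rational{Int64}` goes through `checked_mul`, as in the Base code quoted earlier in the thread, and raises an `OverflowError`. The helper `overflows` is hypothetical.

```julia
# Hypothetical helper: run `f` and report whether it threw an OverflowError.
function overflows(f)
    try
        f()
        return false
    catch err
        err isa OverflowError || rethrow()
        return true
    end
end

big_value = typemax(Int64)

# Plain integers wrap around: (2^63 - 1) * 2 becomes -2.
println(big_value * 2)                                  # -2

# Rationals check their numerator and denominator arithmetic.
println(overflows(() -> (big_value // 1) * (2 // 1)))   # true
println(overflows(() -> (big_value // 1) + (1 // 1)))   # true

# Overflow shows up even for small values once coprime denominators multiply.
println(overflows(() -> (1 // 3^30) + (1 // 5^20)))     # true
```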

Jean_Michel · February 15, 2019, 1:24am

I, for one, would notice. If Julia users remain applied physicists and statisticians, they will not care. But I would like to encourage mathematicians to use Julia, provided that Julia welcomes them. Making Rationals and BigInts fast is a necessary step on the way …

And by the way, can you confirm whether Rationals are guaranteed to be checked against overflow?

StefanKarpinski · February 15, 2019, 1:53am

You’re missing my point. Only people who explicitly depend on rational arithmetic depend on the rational type. Everything involves integer arithmetic. It’s almost impossible to write a non-trivial function that doesn’t use it. If it got slower, the entire language would be much slower. It would become impossible to write fast code in Julia, which is unacceptable.

foobar_lv2 · February 15, 2019, 2:30am

Re Rational arithmetic:

I think heavy users of symbolic math should be using Nemo.jl anyway. Having unoptimized Rational, BigInt and BigFloat types in Base is more of a nice and very convenient gimmick than a core language feature; Nemo does it better anyway.

Regarding overflow checking: The ring Z mod 2^64 is a really nice and natural object. It can often be used to approximate small integers, but is easy to reason about in its own right.
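
One way to see why the wraparound ring is easy to reason about: addition and multiplication stay exactly associative even when intermediate results overflow, which is not true of floating-point arithmetic. A small sketch with arbitrarily chosen values, assuming 64-bit integers:

```julia
a, b, c = typemax(Int64), Int64(3), Int64(5)

# Intermediate results overflow, yet both groupings agree modulo 2^64.
println((a * b) * c == a * (b * c))    # true
println((a + b) + c == a + (b + c))    # true

# The wrapped result matches the exact product reduced modulo 2^64
# (`% Int64` truncates a BigInt to its low 64 bits).
exact = big(a) * b * c
println((exact % Int64) == a * b * c)  # true

# Floating-point addition, by comparison, is not associative.
println((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # false
```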

The construction of Rational{Int64} without any overflow checking… could you write down an elegant mathematical description? I don’t think addition and multiplication are even associative in that thing. Who would care for that or be able to reason about that object? But you are right that overflow-checking in Rational should be documented.

Is there any reason to use Base Rational and BigInt instead of the very cool Nemo/Flint constructions? But apart from that, I agree: BigInt is really annoying, especially because proper syntactic sugar for in-place operations is missing. Modelling BigInt as an immutable value type is deadly for performance. To make it fast, people must mentally treat each BigInt like an Array and carefully reuse temporaries instead of allocating in loops, using the Base.GMP.MPZ interface. The solution for Array was dot-syntax, broadcasting and views, which make mutating code almost as short and simple as non-mutating code.
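
To illustrate the kind of temporary reuse described here, a rough sketch using `Base.GMP.MPZ.add!`. This module is internal and unexported, so its functions may change between Julia versions; the function names `fib_allocating` and `fib_inplace` are made up for this example.

```julia
using Base.GMP: MPZ   # internal module, not part of the public API

# Straightforward version: every `+` allocates a fresh BigInt.
function fib_allocating(n::Integer)
    a, b = BigInt(0), BigInt(1)
    for _ in 1:n
        a, b = b, a + b
    end
    return a
end

# In-place version: three BigInts are allocated once and then rotated.
function fib_inplace(n::Integer)
    a, b, t = BigInt(0), BigInt(1), BigInt(0)
    for _ in 1:n
        MPZ.add!(t, a, b)   # t = a + b, written into existing storage
        a, b, t = b, t, a   # rotate roles; the old `a` becomes scratch space
    end
    return a
end

println(fib_inplace(100) == fib_allocating(100))  # true
println(fib_inplace(10))                          # 55
```

The in-place version is only safe because no other code holds a reference to the three BigInts being mutated, which is exactly the Array-like discipline the post describes.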

Syx_Pek · February 15, 2019, 2:38am

I believe one of the reasons we don’t use Nemo/Flint is the large size of the libraries. Adding them as dependencies could prove annoying. I believe this was the justification for ArbFloats.jl.

JeffreySarnoff · February 15, 2019, 3:37am

ArbFloats and ArbNumerics rely on the Arb C library (and that library is also incorporated in Nemo). The Arb C library requires Flint and GMP and MPFR.

ArbFloats was written so the community would have access to the Arb library numerics in a more Julia ready manner and with care taken to allow interoperability without forcing a style of coding and function utilization distant from that of Julia.

ArbNumerics is a more recent take on that, without some of the extreme care for always showing extended precision numerical values in their most informative precision that is also the least misleading value-specific digit sequence. ArbNumerics includes support for Arb’s complex type and provides more math functions than ArbFloats.

Tamas_Papp · February 15, 2019, 7:37am

If all calculations are wildly wrong, I suppose this should be caught by your unit tests then.

MrRobot · February 15, 2019, 7:43am

In an ideal world maybe. Testing algorithms is necessary no matter what. I don’t need a hand meeting my quota for mistakes per algorithm @.@ But I am so far removed from the last 30 replies, quoting me won’t add much