Julia messes up integer exponents?


#1

I was using Avogadro’s number in a calculation and couldn’t figure out why the results were not matching my Matlab or Python code…

Turns out using integers as exponents gets you in trouble in Julia? They start off innocent, but go wrong at higher powers that are really not that big for science (Avogadro’s number is very popular, as are many other constants…)

julia> 10^1
10
julia> 10^2
100
julia> 10^5
100000
julia> 10^15
1000000000000000
julia> 10^20
7766279631452241920
julia> 10^16
10000000000000000
julia> 16^17
0
julia> 10^17
100000000000000000
julia> 10^18
1000000000000000000
julia> 10^19
-8446744073709551616
julia> 10^19.0
1.0e19
julia> 16^17  # discovered this by pure typo... who knows what other dragons lurk out here
0
julia> 16^17.0
2.9514790517935283e20
julia>

When I started coding in Fortran, I was told that integer exponents were faster, hence why I use them. Perhaps it isn’t a concern these days, but I guarantee a lot of unsuspecting scientists/students/people will get hit by this. I think it would be madness to expect everyone to use decimal exponents…


#2

Julia uses actual machine integer arithmetic, which means that numbers can and do under- and overflow if you try to perform a computation outside of the range of values that can be represented by such an integer. See https://en.wikipedia.org/wiki/Integer_overflow for a more general explanation.
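To make the wraparound concrete, here is a minimal sketch using only functions from Base (`typemax`, `typemin`, and `Base.Checked.checked_mul`). The variable names are just for illustration.

```julia
# Int64 covers the range typemin(Int64):typemax(Int64), i.e. -2^63 : 2^63 - 1.
largest = typemax(Int64)
smallest = typemin(Int64)

# Adding 1 to the largest value silently wraps around to the smallest one.
wrapped = largest + 1
println(wrapped == smallest)  # true

# Base.Checked offers opt-in arithmetic that throws instead of wrapping.
overflow_detected = try
    Base.Checked.checked_mul(10^18, 10)
    false
catch err
    err isa OverflowError
end
println(overflow_detected)  # true
```

The ordinary `+` and `*` stay fast because they map directly to machine instructions; the checked variants are there when you would rather get an error.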

If you actually need to deal with integers outside the range of a 64-bit Int, then you might want to use Julia’s built-in support for BigInt (which is much closer to the way integers behave in Python):

julia> big(10)^19
10000000000000000000

julia> big(10)^100
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

Note that BigInts are substantially more expensive to work with than regular Ints (this is part of the reason why basic arithmetic in Python is slower than in Julia). If you want something that will be more efficient than a BigInt but without unexpected overflow or underflow, check out SaferIntegers.jl:

julia> using SaferIntegers

julia> SafeInt(10)^19
ERROR: OverflowError: 10^19
Stacktrace:
 [1] ^(::SafeInt64, ::SafeInt64) at /home/rdeits/.julia/packages/SaferIntegers/eMjmj/src/pow.jl:45
 [2] ^(::SafeInt64, ::Int64) at /home/rdeits/.julia/packages/SaferIntegers/eMjmj/src/pow.jl:71
 [3] macro expansion at ./none:0 [inlined]
 [4] literal_pow(::typeof(^), ::SafeInt64, ::Val{19}) at ./none:0
 [5] top-level scope at none:0
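
If you would rather not add a package, something similar can be sketched with `Base.Checked.checked_mul`. The helper name `checked_int_pow` below is hypothetical, and this is a plain repeated-multiplication loop for illustration, not how SaferIntegers implements `^`.

```julia
# Hypothetical helper: integer power that throws OverflowError instead of wrapping.
function checked_int_pow(base::Int, exponent::Int)
    exponent >= 0 || throw(DomainError(exponent, "exponent must be non-negative"))
    result = one(base)
    for _ in 1:exponent
        # checked_mul throws OverflowError if the product does not fit in an Int
        result = Base.Checked.checked_mul(result, base)
    end
    return result
end

println(checked_int_pow(10, 18))  # still fits in an Int64

pow_overflowed = try
    checked_int_pow(10, 19)
    false
catch err
    err isa OverflowError
end
println(pow_overflowed)  # true
```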

#3

Well shucks… I foresee myself forgetting the big() in the future… I scale things in my simulations so there aren’t big numbers, or little numbers, but in post-processing I need to use “real” numbers. Fortunately these aren’t long scripts…

Thanks for the answer, I do understand the issue now. Speed over safety.


#4

You can still use integer exponents with Float bases, and this is probably what you want. For example, Avogadro’s number would be

6.02e23

This way you won’t run into integer overflow issues, though you still have to worry about floating point imprecision. This is probably what you were doing in Fortran.
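
A small sketch of both spellings, assuming a rounded value of 6.022e23 for illustration. The comparison uses `≈` rather than `==` because the two results may differ in the last bit.

```julia
# Scientific-notation literal: a Float64 from the start.
avogadro_literal = 6.022e23

# Float base with an integer exponent: the power is computed in Float64,
# so there is no Int64 overflow.
avogadro_power = 6.022 * 10.0^23

println(avogadro_power ≈ avogadro_literal)  # true
```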


#5

As @jkbest2 says, you should just use floating point numbers for this by using the constant 6.022140857e23 and you’ll never have this kind of overflow problem. The Avogadro constant is only known to ~8 significant figures anyway according to CODATA.

Alternatively, use the Unitful library, from which you have a handy value Unitful.Na already defined and in the right units.


#6

Avogadro’s constant is going to be redefined as exactly 6.02214076e23 mol⁻¹ this year, together with the redefinition of the SI units, which will get rid of the need for the international prototype kilogram (stored in a basement in Paris):
https://iupac.org/new-definition-mole-arrived/


#7

Obviously too few computer scientists in IUPAC. A round number like 602214075999999987023872 would have been much better.
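
For anyone wondering where a number like that comes from: the defined value is not exactly representable as a Float64, so the literal is rounded to the nearest value that is. A quick sketch to check this yourself (variable names are just for illustration):

```julia
# The exact integer from the new SI definition, built with BigInt arithmetic.
defined_value = 602214076 * big(10)^15

# The exact integer actually stored when you write the Float64 literal.
stored_value = BigInt(6.02214076e23)

println(stored_value)
println(stored_value == defined_value)  # false
println(stored_value - defined_value)   # rounding error of the literal
```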


#8

In Matlab, all values are floating-point by default. Julia distinguishes between integer and floating-point types. To use a floating-point literal value do 10.0^100 or 1e100, for example — this is the right kind of arithmetic for scientific calculations on real numbers.

(Python distinguishes between floating-point and integer types, but uses bignum integer types by default as others have noted. Basically, working with bignum integers is only necessary on a 64-bit machine if you are doing number theory.)
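
A quick way to see which kind of number a literal gives you is `typeof`. This sketch assumes a 64-bit machine, where `Int` is `Int64`.

```julia
int_literal = 10
float_literal = 10.0
sci_literal = 1e100

println(typeof(int_literal))    # Int64 on a 64-bit machine
println(typeof(float_literal))  # Float64
println(typeof(sci_literal))    # Float64

# An integer base with an integer exponent stays an integer (and can overflow);
# a float base gives a float result.
println(typeof(10^2))
println(typeof(10.0^2))         # Float64
println(10.0^100 ≈ 1e100)       # true
```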


#9

Actually, Matlab supports integer types too, but literals are always floats.


#10

Well, I think this thread will be useful to the science scientists when they try

julia> Na=6.023*10^23
1.2068671807961137e18

which makes no sense at all. Now they will know why… If you don’t know, it will drive you crazy for a few hours.
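
To spell out what happens there: `^` binds tighter than `*`, so `10^23` is evaluated first, entirely in Int64, and wraps around. Only then is the already-wrong integer multiplied by 6.023. A sketch splitting the expression into its steps (variable names are just for illustration):

```julia
# Step 1: integer base, integer exponent, so this is Int64 arithmetic and wraps.
int_power = 10^23
println(int_power)

# Step 2: the wrapped integer is converted to Float64 and multiplied.
wrong_na = 6.023 * int_power
println(wrong_na)

# Using a float base (or a float literal) avoids the integer power entirely.
right_na = 6.023 * 10.0^23
println(right_na ≈ 6.023e23)  # true
```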


#11

I used to think about integer overflow this way but at some point I stopped. Now I think about Int64 as doing fully correct arithmetic in ℤ/2^64ℤ (the integers modulo 2^64) with -2^63:2^63-1 as the set of representative elements for the modular equivalence classes. Once you make that mental shift, everything’s just lovely. The only question is whether ℤ/2^64ℤ is a good domain to approximate the problem you’re working in.
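
This view can be checked against the numbers from the first post. The helper `wrap64` below is hypothetical: it takes the exact BigInt result and picks its representative in -2^63:2^63-1, which should match what Int64 arithmetic returns.

```julia
# Hypothetical helper: reduce an exact integer modulo 2^64 and return the
# representative that lies in the Int64 range -2^63 : 2^63 - 1.
function wrap64(x::BigInt)
    modulus = big(2)^64
    half = big(2)^63
    return Int64(mod(x + half, modulus) - half)
end

println(wrap64(big(10)^19) == 10^19)  # true
println(wrap64(big(10)^20) == 10^20)  # true
println(wrap64(big(16)^17) == 16^17)  # true, both are 0 since 16^17 == 2^68
```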


#12

I have an iconoclastic suggestion (which could not be implemented until Julia 2.0): that Int be an alias for SafeInt64 instead of Int64. I use SafeInts widely, and in my experience they usually slow down computations by between 1% and 5% at most. I think this slowdown by a few percent is a price well worth paying for the safety you (and more importantly, naive users) would get.


#13

Amen, I agree. Science scientists are not computer scientists, and I really think there will be a lot of mistakes made over this.


#14

See https://danluu.com/integer-overflow/

In short, it’s fine for limited usage but the more widely you do integer overflow checking, the more it interferes with compiler analysis and optimizations. In other words it’s fine to let the user opt into it but you can’t really have it be the default without losing a ton of performance.


#15

Thank you for the pointer. This reference actually supports my statement that the slowdown is in the ballpark of 5%.


#16

I would rather take 10% longer and get the right answer than get an answer instantly that, unknown to me, has replaced 6.023*10^23 with 1.20687e18.

Otherwise you have to advertise Julia as being as easy as Python, as fast as C, and as accurate as Nostradamus.

At a minimum an error should be thrown rather than letting me run with a garbage number…


#17

And we can’t even use floating point arithmetic at all due to its numerous inaccuracies! Arbitrary precision rationals are what real scientists should be using!


#18

Thank you for dismissing the topic without proper consideration…


#19

Well that’s the thing. It has been discussed multiple times at length. This isn’t a question of “computer scientists vs scientist scientists”, it’s an issue of understanding the tool that you’re using. There’s no real difference between this and, say, applying a statistical method that you don’t understand. In this case it did what it was supposed to do, just not what you expected it to do. I would consider that an opportunity to learn something and adjust my expectations.


#20

I claim that safety in integer operations is worth a penalty of 5%. This is the proposal you should address, not a supposed ignorance of the workings of a computer. That this proposal has some value is reflected in the fact that quite a few popular languages (Ruby and Python, for instance) were ready to pay the much higher penalty (slowing everything by a factor of probably around 5) of using BigInt by default.