Overflow behaviour - how to explain

question

#1

Hello,

I’m preparing a little introductory presentation for my team on Julia. I’d like to do some slides on where people should be careful. Overflow bahaviour is certainly one of them imho. I just don’t know how to explain why there is this possibility of overflow. Is it just because if there were automatic promotion to bigint and bigfloat, the code would be slower? How much slower by the way? Just so I can give an order of magnitude.

Thanks for any explanation! I’m not in CS, nor are my team :slight_smile:


A plea for int overflow checking as the default
#2

How much slower by the way?

You should check sth like pypy for a JIT that optimize it to extreme and it’s hard to tell how much overhead it adds.

Julia is designed so that it doesn’t have to rely on any of those less reliable optimizations so it has a really simple compiler. With that the overhead will be >1000x.


#3

Testing it:

immutable MyInt{T<:Integer}
    a::T
end
Base.:+(i::MyInt{Int}, ii::Int) =  
    i.a>typemax(Int)-ii ?  MyInt(BigInt(i.a)+ii) :  MyInt(i.a+ii)
Base.:+(i::MyInt{BigInt}, ii::Int) = MyInt(i.a+ii)

function addone(out, n)
    for nn=1:n
        out += 1
    end
    out
end

using BenchmarkTools
@show @benchmark addone(1, 10^4)
@show @benchmark addone(MyInt(1), 10^4)
@show @benchmark addone(MyInt(BigInt(1)), 10^4)
nothing

gives:

@benchmark(addone(1,10 ^ 4)) = BenchmarkTools.Trial:
  memory estimate:  0.00 bytes
  allocs estimate:  0
  --------------
  minimum time:     7.686 ns (0.00% GC)
...

@benchmark(addone(MyInt(1),10 ^ 4)) = BenchmarkTools.Trial:
  memory estimate:  156.27 kb
  allocs estimate:  10001
  --------------
  minimum time:     127.549 μs (0.00% GC)
...

@benchmark(addone(MyInt(BigInt(1)),10 ^ 4)) = BenchmarkTools.Trial:
  memory estimate:  781.32 kb
  allocs estimate:  40003
  --------------
  minimum time:     1.088 ms (0.00% GC)
 ...

So about 20,000x slower. One of the big performance killers is that the function is not type-stable.


#4

Nice for this benchmarking. The hit is really heavy. So, it’s mostly a matter that the functions become not type-stable?


#5

Could this be what you’re looking for?

http://ucidatascienceinitiative.github.io/IntroToJulia/Html/WhyJulia

The example I use isn’t overflow and instead it’s 2^(-5), but yes it’s type-stability that’s the reason for this “lack of safety” (but for good reasons!).

Note that page is part of a larger series:

http://ucidatascienceinitiative.github.io/IntroToJulia/

Maybe this resource could be helpful here?


#6

Changing to
Base.:+(i::MyInt{Int}, ii::Int) = i.a>typemax(Int)-ii ? MyInt((i.a)+ii+2) : MyInt(i.a+ii)
makes it type-stable and results in 5micro. So, most of the slow-down is from the conditional. Note however, that using a type-unstable function “poisons” the code which uses it and makes it slow.


#7

Thanks for the links, this will certainly fuel some of my slides :slight_smile: Especially the “WhyJulia”, it explains very well where newcomers needs to look for safety.

Noted!


#8

It should be noted that there are many fancy tricks you can do to make BigInts faster and even combine the basic integer type with the BigInt type, but it inevitably adds complications and removes transparency – and makes it much harder to reason about the performance of one’s code. It also tends to make the representation of system integers incompatible with how C/Fortran represents them. So it all depends on your priorities. We should, however, have faster BigInts that are stored inline when possible.


#9

This is also far from the whole story. You’re most often not just adding in a loop. If you do some other work inbetween that is memory (not cache) bound, then it will not matter much; and latency of branch will be hidden by other (memory) latencies.

[With unrolling of loops, a clever optimizer could also optimize away some checks.]

[If SIMD integer use was common (I guess not…) I aslo assume you could check many overflows at once…]

I would like overflow traps to get added as an option to Julia later (command line switch). The type here would be a first step, so not critical to have the switch. I’m not sure if I could swap out the current integer type for my type given the same source code…

I tried (probably what you want or might often be enough):

julia> Base.:+(i::MyInt{Int}, ii::Int) =
         i.a>typemax(Int)-ii ? throw("Overflow") : MyInt(i.a+ii)
@benchmark(addone(1,10 ^ 4)) = BenchmarkTools.Trial: 
  memory estimate:  0.00 bytes
  allocs estimate:  0
  minimum time:     7.00 ns (0.00% GC)
...

@benchmark(addone(MyInt(1),10 ^ 4)) = BenchmarkTools.Trial: 
  memory estimate:  0.00 bytes
  allocs estimate:  0
  minimum time:     2.65 μs (0.00% GC)

[Before on my machine

@benchmark(addone(MyInt(1),10 ^ 4)) = BenchmarkTools.Trial: 
  memory estimate:  156.27 kb
  minimum time:     106.41 μs (0.00% GC)

#10

In practice, with 64-bit integers, overflow is not a serious concern in my experience.

If you are counting actual things (bytes, loop iterations) etcetera, Int64 will not overflow. If you are doing number theory, then you need BigInt. In real applications, you know which of those two circumstances you are in. (I think Stefan made this point somewhere a long time ago?)

(On 16- and even 32-bit machines, of course, matters were much different.)

(BigFloat is a totally different story; it’s not practical to automatically promote to BigFloat regardless of the language or performance, because that requires automation of floating-point error analysis, which seems to be almost impossible in practice [inflated claims about “unums” notwithstanding].)