# Poor BigInt performance

The function here is not exactly Fibonacci, so -5 to 256 covers only 4 `nfib` values (1,2,9,44). But it’s still an elephant (maybe younger).

1 Like

Reusing only 2 numbers reduces the computing time by almost 2. That’s the elephant.

Other thing I should’ve also considered is that `n-2` and `n-1` occurs for most return values, so saving the conversion allocations help much more than saving allocations of a couple return values. We can do that for the 4 values, though this doesn’t elide the allocations of `n-big2` or `n-big1` of equal values.

Weirdly, the change in the OP to n-2 and n-1 doesn’t improve runtime for me (probably version/system differences), but doing it for the 4 values does by ~2.5x. The number of allocations are more similar, 1.13G > 921.01M > 351.79M. So this particular elephant isn’t so big.

``````julia> begin
const big1 = big(1)
const big2 = big(2)
const big3 = big(3)
const big9 = big(9)
const big4 = big(4)
const big44 = big(44)
function nfib4(n) # meaning I check up to n=4
if n == big1      big1
elseif n == big2  big2
elseif n == big3  big9
elseif n == big4 big44
else (nfib4(n-big2)+nfib4(n-big1))*n
end
end
end
nfib4 (generic function with 1 method)

julia> @time nfib4(big(40))
36.879051 seconds (351.79 M allocations: 6.990 GiB, 24.84% gc time)
12306496796223503426182275975073985352177589230680

julia> @time nfib4(big(40))
32.710493 seconds (351.79 M allocations: 6.990 GiB, 23.03% gc time)
12306496796223503426182275975073985352177589230680

julia> nfib2(n) = n > big2 ? (nfib2(n-big2)+nfib2(n-big1))*n : n
nfib2 (generic function with 1 method)

julia> @time nfib2(big(40))
91.455750 seconds (921.01 M allocations: 18.299 GiB, 25.44% gc time)
12306496796223503426182275975073985352177589230680

julia> @time nfib2(big(40))
86.134056 seconds (921.01 M allocations: 18.299 GiB, 27.48% gc time)
12306496796223503426182275975073985352177589230680

julia> nfib(n) = n > 2 ? (nfib(n-2)+nfib(n-1))*n : n
nfib (generic function with 1 method)

julia> @time nfib(big(40))
89.184372 seconds (1.13 G allocations: 18.299 GiB, 27.74% gc time)
12306496796223503426182275975073985352177589230680

julia> @time nfib(big(40))
90.009020 seconds (1.13 G allocations: 18.299 GiB, 27.19% gc time)
12306496796223503426182275975073985352177589230680
``````

It isn’t clear to me why the parameter of `nfib` shouldn’t always be Int (it is never big). Only the return values should be BigInts. For example:

``````const big1 = big(1)
const big2 = big(2)
const big9 = big(9)
const big44 = big(44)
function nfib5(n) # meaning I check up to n=4
if n == 1      big1
elseif n == 2  big2
elseif n == 3  big9
elseif n == 4 big44
else (nfib5(n-2)+nfib5(n-1))*n
end
end
``````

Well in OP’s functions `n::BigInt` was necessary to return a BigInt instead of overflowing Int, but you’re right that it can be decoupled. The `*n` part would convert and allocate a BigInt I think. Avoiding all allocations in `n-1` and `n-2` helps further on my system:

``````julia> @time nfib5(40)
18.999958 seconds (195.44 M allocations: 3.495 GiB, 27.15% gc time)
12306496796223503426182275975073985352177589230680

julia> @time nfib5(40)
18.529561 seconds (195.44 M allocations: 3.495 GiB, 26.41% gc time)
12306496796223503426182275975073985352177589230680
``````

Only a <5x speedup over the original (but try on your own systems for your numbers), so there’s surely more to it than just interning to save allocations.

1 Like

There is a specialized `*(x::BigInt, y::Int)` which should be faster, as BigInt multiplication is essentially long multiplication with Int digits.

1 Like

These two checks are not necessary in the hotter path, as the ‘cached’ `n=3` and `n=4` results preclude these cases (unless directly called).

I mean, checking for `n=3` and `n=4` should be first.

That’s true, but my thinking was I probably encounter the `n==1` cases more often in the recursive call tree. I haven’t tried reversing the cases but I did write the method as indexing a global tuple of the 4 big integers, and the performance didn’t seem to change for me, the checks were that cheap. I just posted the ordered branches because I thought it looked clearer.

are encountered zero times as the recursion stops with `n=3,4` but it doesn’t matter much (at least when I timed it)

1 Like

Oh I see now, of course.

I made something similar, but using `Union{Int, Vector{Int}}`, where for non-small integers `Vector{Int}` represents the vector of Limbs, and is handled by the low-level MPN layer of GMP (instead of the MPZ layer for `BigInt`).

But there are a bunch of operations which I didn’t implement yet, so for these I resort to converting to `BigInt`, which is somewhat slow. It’s essentially a packaged version of this PR: https://github.com/JuliaLang/julia/pull/17015, where some global `BigInt` are used to avoid allocation costs when converting to `BigInt`. But CI fails on windows for some reason, I suspect because of that, and nowadays it’s incorrect to use thread-local objects like that as tasks can migrate, so another solution is needed, the most straightforward would be probably to use task-local storage.

It’s generally faster than `BigInt`, but not always. But here it looks pretty much worth it:

``````julia> nfib(n) = n > 2 ? (nfib(n-2)+nfib(n-1))*n : n
nfib (generic function with 1 method)

julia> @time x = nfib(big(40))
35.327526 seconds (1.13 G allocations: 18.299 GiB, 36.83% gc time)
12306496796223503426182275975073985352177589230680

julia> @time x = nfib(XInt(40))
1.881859 seconds (46.37 k allocations: 3.538 MiB)
12306496796223503426182275975073985352177589230680
``````

My goal was to release this package once I implemented few more operations, but I unfortunately haven’t touched this project for 2.5 years. If there is interest, I could revive it and publish it soon in its current state, maybe some people would like to contribute some of the missing functions? Also, in order for `XInt` to be reliable, some careful review of the implemented algorithms would be needed.

As for the overhead compared to `Int` when only small integers are actually used:

``````julia> @btime nfib(18)
5.535 μs (0 allocations: 0 bytes)
44750731559645106

julia> @btime nfib(XInt(18))
46.697 μs (0 allocations: 0 bytes)
44750731559645106

julia> @btime nfib(BigInt(18))
336.929 μs (28415 allocations: 484.35 KiB)
44750731559645106
``````
9 Likes

That looks great! Definitely publish it (with big warnings that it’s not ready to be used yet).

2 Likes

I’d be very interested in contributing, regardless of the current state. Is this in a repository anywhere?

1 Like