Profiling allocations?

angelv · April 22, 2020, 8:55am

Hello,

I wrote a simple function recursively and when testing it I found that it allocates a lot of memory.

julia> @time brute_force(10,63)
[...]
 32.055675 seconds (392.13 M allocations: 8.765 GiB, 28.16% gc time)

I thought this probably was due to all those function calls, so I wrote the same, but iteratively, and to my surprise the allocated memory was even larger:

julia> @time brute_force_it(10,63)
 41.741401 seconds (522.84 M allocations: 11.686 GiB, 29.09% gc time)

I have some possible candidates for these allocations, but since I don’t know exactly what triggers memory allocation in Julia, I want to avoid guessing and would like to profile the code for allocations. What is the recommended way to get this information?

(If I write this iterative code in Fortran/C I know that the memory allocations would be minimal, so I want to understand what Julia is doing)

Many thanks,
AdV

Tamas_Papp · April 22, 2020, 8:58am

The manual has a short section on this:

https://docs.julialang.org/en/v1/manual/profile/#Memory-allocation-analysis-1

Skoffer · April 22, 2020, 9:02am

You can also try TimerOutputs.jl package. With proper usage it can give you good insights of your code performance.

angelv · April 22, 2020, 9:59am

Thanks. I tried the way specified in the manual (I had overlooked this, sorry), and while it is a bit cumbersome, it works nice and gave me good allocation profiling for each line in the function of interest.

It turns out that most of the allocation comes from innocent-looking lines like:

c += 1

but where that c is of type BigInt.

So, (and perhaps this should be a different question somewhere else), do you know of any package to work with really big integers but without the overhead of BigInt’s? (I don’t need arbitrarily big ints and all the overhead that comes with it. I know the maximum numbers I have to deal with, so it would be sufficient to specify that I need, say, 50 digits, and then all the numbers in the code will be in that range, and if not, it would be sufficient to have wraparound behaviour as for the predefined integer types).

Any pointers appreciated
AdV

nilshg · April 22, 2020, 11:53am

Try the forum search, I remember a couple of really long discussions about different ways to tackle BigInt’s in a highly performant ways - unfortunately I didn’t understand any of it as I never work with large integers, so I couldn’t summarise it for you, but there’s a lot of stuff out there.

non-Jedi · April 22, 2020, 11:58am

BitIntegers.jl has a Int256 with a typemax of 57896044618658097711785492504343953926634992332820282019728792003956564819967. I haven’t used it myself, so I can’t speak to the performance; also be aware of the caveats in the README. Are you sure that the 39 digits of Int128 are insufficient for your use-case? If you’re not sure, you could also use SaferIntegers.jl to see if any errors are thrown. There may be other packages in this space that I’m not aware of.

angelv · April 22, 2020, 1:57pm

Many thanks for the pointers. BitIntegers.jl has Int256, Int512 and Int1024 types! That’s great. I think I need Int512, and the first test was very nice (for the recursive function the allocated memory, and running time, went down dramatically).

With BigInt:

julia> @time brute_force(10,63)
:BigInt[59, 53, 51, 48, 47, 43, 42, 41, 37, 35, 33, 31, 26, 25, 23, 22, 20, 19, 18, 17, 14, 9, 6, 5, 1] 9.849302918817908e17 0
 32.820850 seconds (392.13 M allocations: 8.765 GiB, 27.37% gc time)
false

With Int512:

julia> @time brute_force(10,63)
:Int512[59, 53, 51, 48, 47, 43, 42, 41, 37, 35, 33, 31, 26, 25, 23, 22, 20, 19, 18, 17, 14, 9, 6, 5, 1] 9.849302918817908e17 0
  5.154979 seconds (5.38 k allocations: 331.125 KiB)
false

For the iterative function I still need to check what is going on, because the running time goes down from 52 to 24 seconds, but the allocated memory actually increases to 58GiB.

But overall very promising, thanks.

(And yes, I do need all those digits Well, at some point I will have to rethink the algorithm and perhaps then I can do it some other way, but at this moment I need to perform accurate addition/subtraction of really big numbers, with totals of about 100 digits).

GunnarFarneback · April 22, 2020, 2:32pm

Depending on what you do more exactly and how much output you have in the end, an option can be to do the computations with modular arithmetics, for a modulo that fits in Int64. The obvious drawback is that you will only get the result up to this modulo, but if you repeat it a number of times with different modulo you can recover the full result with the Chinese Remainder Theorem. This trick was used to compute the 171 digit number of legal positions on a 19x19 go board, a very memory-intensive computation. See section 5.5 of https://tromp.github.io/go/gostate.pdf for details.

angelv · April 22, 2020, 3:02pm

Interesting. I will check it up. Many thanks

Topic		Replies	Views
Poor BigInt performance Performance	32	1273	May 14, 2024
Threads are not speeding up computation a lot Performance threads	70	1184	March 20, 2025
How do I get the memory size of a BigInt? New to Julia	8	966	March 28, 2022
Performance / Memory differences between constructors of BigInts Performance performance , numbers , type , memory-allocation	5	337	August 16, 2023
Finding the memory allocation in some code General Usage performance	3	856	August 25, 2017

Profiling allocations?

Related topics