Best way to store integer partitions

Tqft · December 14, 2019, 4:20pm

This is just a toy example, but I was wondering what the best way (in terms of memory and performance) is to store the partitions of a fixed integer n, i.e. lists of positive integers which sum to n.

Currently, I’m using Array{Int8,1} for the partitions and Array{Array{Int8,1},1} to store all the partitions. But I think this is not optimal.

julia> @time P = partitions(10)
0.000011 seconds (53 allocations: 5.328 KiB)

julia> @time P = partitions(90)
4.301507 seconds (56.63 M allocations: 6.239 GiB, 34.52% gc time)

This is quite a lot of memory allocated compared to what I actually have to store. The number of partitions of 90, for example, is 56634173, and each one has at most 90 parts of one byte each, so the data itself is much less than 56634173*90/1000^3 ≈ 5.1 GB (ignoring pointers etc.).

What’s the best way to do this?

leiteiro · December 14, 2019, 6:28pm

If this fits your needs better:

 n = 100: 167.598273 seconds (15 allocations: 4.813 KiB)

then look at
https://github.com/OpenLibMathSeq/IntegerSequences.jl/blob/master/src/Partitions.jl

Tqft · December 14, 2019, 7:03pm

Interesting, thank you. But I fear this is useless. This implementation does not store the partitions, it just prints (visits) them. In fact, in the code for the benchmark you find the comment “# – first comment out println in VISIT”.

This would be equivalent to not pushing the partitions into the array. If I just visit them, my version is a bit more efficient:

julia> @time P = partitions(100)
0.250746 seconds (8 allocations: 576 bytes)

But that’s not the point.

kristoffer.carlsson · December 14, 2019, 7:11pm

What I don’t really understand is why you show the results from @time. That gives the total number of allocations made by the algorithm that creates P, but it doesn’t say much about the size of P itself. Maybe you are just doing a bunch of unnecessary allocations in partitions?

Tqft · December 14, 2019, 7:46pm

How can I get the actual size of P? I tried

julia> P=partitions(2)
2-element Array{Array{Int8,1},1}:
[2]
[1, 1]

julia> sizeof( P)
16

julia> Base.summarysize( P)
139

So, maybe it’s 16 bytes for 3 8-bit integers plus pointers etc!? I don’t know what summarysize is.

I checked allocations using julia --track-allocation=user and as far as I can tell the only allocations came from the push!(P,part) for each partition found.
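
For reference, a minimal sketch of how the two numbers above differ, using a hand-built P with the same shape as the result of partitions(2). The variable names are only illustrative, and the exact value returned by Base.summarysize depends on the Julia version and platform, so no figure is claimed for it here.

```julia
# Same shape as partitions(2) in the post: an array of small Int8 arrays.
P = [Int8[2], Int8[1, 1]]

# sizeof(P) counts only the outer array's own storage:
# one reference per inner array, not the inner arrays themselves.
outer_bytes = sizeof(P)            # length(P) * sizeof(Ptr{Cvoid})

# The actual payload is the parts, one byte each for Int8.
payload_bytes = sum(sizeof, P)

# Base.summarysize follows the references and counts every reachable
# object, including the per-array overhead of each small inner array.
total_bytes = Base.summarysize(P)

println((outer_bytes, payload_bytes, total_bytes))
```

The gap between payload_bytes and total_bytes is the per-partition overhead that dominates when there are millions of very short arrays.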

kristoffer.carlsson · December 14, 2019, 8:02pm

Instead of storing things as [[1,1,1], [1, 2], [3]] you could store it as

data = [1, 1, 1, 1, 2, 3]
offsets = [1, 4, 6, 7]

and then you get partition i as data[offsets[i]:offsets[i+1]-1]. That would save you all the overhead of a bunch of small arrays.
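
A rough sketch of that layout as a small container, in case it helps. The type name FlatPartitions and its methods are hypothetical, made up for illustration and not taken from any package. It assumes the parts fit in Int8, as in the original post.

```julia
# Hypothetical container: all parts live in one flat vector, and
# offsets[i] is the index in `data` where partition i starts.
# offsets[end] is one past the last stored part.
struct FlatPartitions
    data::Vector{Int8}
    offsets::Vector{Int}
end

# Start empty: no parts, and the first partition will begin at index 1.
FlatPartitions() = FlatPartitions(Int8[], [1])

# Append one partition by copying its parts onto the end of `data`
# and recording where the next partition will start.
function Base.push!(fp::FlatPartitions, part)
    append!(fp.data, part)
    push!(fp.offsets, length(fp.data) + 1)
    return fp
end

Base.length(fp::FlatPartitions) = length(fp.offsets) - 1

# Partition i as a view into `data`, so the parts are not copied.
Base.getindex(fp::FlatPartitions, i::Integer) =
    view(fp.data, fp.offsets[i]:fp.offsets[i+1]-1)

fp = FlatPartitions()
for part in ([1, 1, 1], [1, 2], [3])
    push!(fp, part)
end
```

With this, fp[2] is a view of the parts 1, 2, and the whole collection is two arrays no matter how many partitions are stored. If the generator can report the number of partitions and parts in advance, calling sizehint! on both vectors should also reduce reallocation while filling them.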

Tqft · December 15, 2019, 9:04am

This sounds like a very good idea, cheers.
