Since I got great feedback on my first post, I turn to you again with another question. In my larger code base I want to iterate over a special kind of binary matrices.
First fact about the binary matrices, they have exactly one 1 in each column, so I modeled them as integer rowVectors
, one entry for each column, the integer is equal to the row where the 1 sits.
Second fact, I can split the matrix into blocks, and for each block I know the number of 1s for each row.
(I think it’s not necessary to go into more detail about the mathematical specification of those matrices, but I can if somebody wants to know more, to maybe use a whole different approach).
To iterate over all of them, I gather the vector indicating the row sums for each block in the tuple vBlocks
, create an iterator of all multiset_permutations()
that fulfill those row sums for each block, and then get all combinations of those using Iterators.product()
, the code for this is
using Combinatorics
using BenchmarkTools
function getCBConfigsForPDVTripleOptTest(vecR, vecB)
vBlocks = (rComp .* vecB for rComp in vecR)
blockIterators = [createBlockIteratorOpt(v) for v in vBlocks]
for rowVectorBlocks in Iterators.product(blockIterators...)
# rest of code operates allocation free in this for loop
end
end
function createBlockIteratorOpt(rowCounts::Vector{Int})
n = sum(rowCounts);
permVector = zeros(Int, n,)
col = 0;
for row in eachindex(rowCounts)
for l in 1:rowCounts[row]
permVector[col + l] = row;
end
col += rowCounts[row];
end
return multiset_permutations(permVector, length(permVector))
end
The performance is just terrible, in general I look at cases where sum(vecR)
and sum(vecB)
are maximum 6 but for vecR = [2,1,1]
and vecB = [2,1,1]
I already get
@btime getCBConfigsForPDVTripleOptTest(vecR, vecB)
10.180 ms (363617 allocations: 23.11 MiB)
and the allocations just explode for more higher cases, making those cases unable to run. I had an implementation that used binomial choosing of column indices instead of multiset permutations of a row vector, but it was performing even worse (and combinatorially they should be the same)
Once again I really appreciate any help