I’d like to seize the opportunity to mention SmallCollections.jl. It provides allocation-free set and (variable size) vector types. In the end there are no new ideas, but the package allows to write rather straightforward code that is fairly efficient:
using SmallCollections, Chairmarks
function count_matches(selection, q, p = length(q))
v = zeros(SmallVector{8,Int}, p+1) # works for p <= 7
for s in selection
i = length(intersect(q, s))+1
@inbounds w = setindex(zero(v), 1, i)
v += w
end
v
end
function all_matches(P, p, selection)
map(q -> count_matches(selection, q, p), Subsets(P, p))
end
P = 22
p = 5
Q = length(Subsets(P, p))
all = collect(Subsets(P, p)) # only needed to generate the selection
selection = unique(rand(all) for _ in 1:Q÷5);
On my not very fancy laptop I get:
julia> @b all_matches($P, $p, $selection)
92.456 ms (2 allocs: 1.808 MiB)
I didn’t try multi-threading or other tricks. My point is that with very little effort one can already get quite a bit of mileage.