About storing the results of Iterators.product into an array efficiently?

HMegh · June 11, 2024, 6:00pm

I have a finite set represented as a vector V::Vector{T} and I am trying to compute the power of this set V^d=V\times V\times \cdots \times V. That is, the Cartesian product of V with itself d-times, which is a finite set of vectors. At the end, I would like to store the result as an array with d rows.

Typically, I would use Iterators.product, which is fast, but returns tuples instead of vectors:

function cartesian_power_tuples(V::Vector{T},d) where T
    X=fill(V,d)
    iterable=Iterators.product(X...)
    return collect(iterable)[:]
end

I want to modify the code above and obtain an array instead of tuples. Here’s my naive implementation:

function cartesian_power_array(V::Vector{T},d) where T
    n=size(V,1)
    X=fill(V,d)
    iterable=Iterators.product(X...)

    A=Matrix{T}(undef,d,n^d)
    i=1
    for x in iterable
        for j=1:d A[j,i]=x[j]  end
        i+=1
    end
    return A
end

However, the second function is significantly more expensive than the first one:

V=rand(10)
@btime cartesian_power_tuples($V,3)
#1.640 μs (7 allocations: 47.12 KiB)
@btime cartesian_power_array($V,3)
#353.952 μs (6018 allocations: 242.83 KiB)

Any ideas where the allocations are coming from? Or, is there a better approach to do this?

stevengj · June 11, 2024, 6:10pm

function cartesian_power_matrix(V::Vector{T},d) where T
    X=fill(V,d)
    iterable=Iterators.product(X...)
    a = collect(iterable)
    return reshape(reinterpret(T, a), d, length(a))
end

Or, better yet, make sure d is a compile-time constant by storing it in a type:

function cartesian_power_matrix(V::Vector{T}, ::Val{d}) where {T,d}
    X = ntuple(_ -> V, Val{d}())
    iterable=Iterators.product(X...)
    a = collect(iterable)
    return reshape(reinterpret(T, a), d, length(a))
end

which gives timings:

julia> @btime cartesian_power_tuples($V,3);
  1.121 μs (5 allocations: 23.70 KiB)

julia> @btime cartesian_power_matrix($V,3);
  1.308 μs (8 allocations: 23.80 KiB)

julia> @btime cartesian_power_matrix($V,$(Val(3)));
  830.128 ns (2 allocations: 23.56 KiB)

Probably because your iteration is type-unstable, since the type of your iteration elements (x, a d-tuple) depends on a runtime value (d).

HMegh · June 11, 2024, 6:23pm

This explains everything. Thanks!

mcabbott · June 11, 2024, 7:12pm

Another way to write this is stack(Iterators.product(V,V,V); dims=2), or to accept d something like stack(Iterators.product(ntuple(Returns(V),3)...); dims=2). This isn’t quite as fast as collect+reinterpret, although I’m not sure why.

Dan · June 11, 2024, 9:08pm

And another way:

cartesianpower(v, ::Val{d}) where {d} = 
  [v[j[i]] for i in 1:d, j in vec(CartesianIndices(ntuple(_ -> eachindex(v), d)))]

Getting matching performance:

julia> @btime cartesianpower($V, Val(3));
  3.447 μs (2 allocations: 23.48 KiB)

stevengj · June 11, 2024, 9:49pm

Why vec?

jar1 · June 11, 2024, 9:52pm

GitHub - JuliaAPlavin/RectiGrids.jl is a simple package for creating Cartesian grids.

julia> using RectiGrids

julia> grid(1:5, 10:10:30)
2-dimensional KeyedArray(...) with keys:
↓   5-element UnitRange{Int64}
→   3-element StepRange{Int64,...}
And data, 5×3 RectiGrids.RectiGridArr{Base.OneTo(2), Tuple{Int64, Int64}, 2, Tuple{Nothing, Nothing}, Tuple{UnitRange{Int64}, StepRange{Int64, Int64}}}:
      (10)         (20)         (30)
 (1)      (1, 10)      (1, 20)      (1, 30)
 (2)      (2, 10)      (2, 20)      (2, 30)
 (3)      (3, 10)      (3, 20)      (3, 30)
 (4)      (4, 10)      (4, 20)      (4, 30)
 (5)      (5, 10)      (5, 20)      (5, 30)

Dan · June 11, 2024, 10:40pm

vec washes away the shape of CartesianIndices, as the OP asked for a 2D matrix and not a (D+1) matrix.

rocco_sprmnt21 · June 12, 2024, 6:02am

using Combinatorics
Combinatorics.multiset_permutations(repeat(V,d),d)
reduce(hcat,msp(V,d))

Topic		Replies	Views
Overflow with Iterators.product General Usage question , bug , iterators	7	145	July 2, 2024
Iterators.product is not type stable? General Usage type-stability , iterators , cartesianindices	20	3072	September 9, 2018
Why is the array version allocating, whereas the tuple version is not? New to Julia question	6	548	May 27, 2022
Filling an array with a generator that returns tuples General Usage performance , array , tuple , memory-allocation , generator	4	983	January 2, 2022
Work around splatting to avoid unecessary allocations Performance	2	2092	January 16, 2019

About storing the results of Iterators.product into an array efficiently?

Related topics