Tools for partitioning matrix columns?

HenriDeh · September 20, 2023, 9:08am

Hello,

I am looking for a convenient package to partition matrices into smaller matrices (i.e., cut along the 2nd dimension). However, I specifically need to be able to make partitions of unequal sizes.

Let me illustrate with an example. I have a collection of matrices whose 2nd dimensions differ in size:
inputs = [rand(4,i) for i in 1:10] # 10 matrices with 55 columns in total
Each of these is an input to pass to a neural network, so in order to avoid 10 separate calls, I first lazily hcat them using LazyArrays:

input = ApplyArray(hcat, inputs...)
output = nn(input) # array of size, say, (2,55)

What I need is a way to partition output along the columns to recover a vector of matrices of sizes [(2,1), (2,2)...(2,10)]. This can be done lazily or eagerly; I’m not sure yet which is more efficient.
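To make the eager and lazy options concrete, here is a minimal sketch. The network `nn` is a hypothetical stand-in (a plain matrix multiply), `reduce(hcat, inputs)` replaces the LazyArrays call so the snippet runs without extra packages, and the names `widths`, `starts`, and `stops` are illustrative only.

```julia
# Hypothetical stand-in for the network: any map from (4, n) to (2, n) would do.
W = rand(2, 4)
nn(x) = W * x

inputs = [rand(4, i) for i in 1:10]
output = nn(reduce(hcat, inputs))   # eager hcat here; size (2, 55)

widths = size.(inputs, 2)           # number of columns contributed by each input
stops  = cumsum(widths)             # last column of each block
starts = stops .- widths .+ 1       # first column of each block

# Eager: each block is copied into a new Matrix.
eager = [output[:, a:b] for (a, b) in zip(starts, stops)]

# Lazy: each block is a view into `output`; nothing is copied.
lazy = [view(output, :, a:b) for (a, b) in zip(starts, stops)]
```

Both vectors hold blocks of sizes (2,1) through (2,10); they differ only in whether the data is copied.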

Thank you for any recommendation.

digital_carver · September 20, 2023, 9:58pm

Lazy partition is probably the more efficient option here, since it’s columnwise. I’m not aware of any package that provides this, but you can write something like:

julia> function colpartitions(M, nparts)
         @assert nparts*(nparts + 1)/2 == size(M, 2)
         Base.require_one_based_indexing(M)
         c = 1
         out = Vector{typeof(@view(M[:, 1:0]))}(undef, nparts)  # empty range, so this is in bounds even when M has fewer than 3 columns
         for i in 1:nparts
           out[i] = @view(M[:, c:c+i-1])
           c += i
         end
         out
       end
colpartitions (generic function with 1 method)

julia> outputs = colpartitions(output, 10);

julia> size.(outputs)
10-element Vector{Tuple{Int64, Int64}}:
 (2, 1)
 (2, 2)
 (2, 3)
 (2, 4)
 (2, 5)
 (2, 6)
 (2, 7)
 (2, 8)
 (2, 9)
 (2, 10)
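One consequence of the lazy approach is worth spelling out: a view shares memory with its parent, so writing into a partition also modifies the original matrix. A small sketch with illustrative names (`M`, `v`, `c`), where `copy` is used to obtain an independent block:

```julia
M = zeros(2, 6)

v = view(M, :, 2:3)   # lazy: no data is copied
v .= 1.0              # writes through to columns 2 and 3 of M

c = copy(v)           # eager: an independent Matrix with the same contents
c .= 5.0              # does not touch M
```

After this, columns 2 and 3 of `M` contain ones, while the rest of `M` is still zero and unaffected by the change to `c`.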

mikmoore · September 20, 2023, 10:31pm

Note that in the above suggestion, Vector{SubArray{eltype(M), 2}} is a container with incompletely typed elements (see the type in my example below: SubArray has 5 parameters), so it will cause dynamic dispatch. This may result in slower performance, depending on where your bottlenecks are. (The above post has since been adjusted to determine the type programmatically, removing the issue I raised.)

For this reason, I go out of my way to use map or generators when the element type might be complicated. That way the compiler does the work for me. Here’s my version:

julia> function colpartitions(M::AbstractMatrix, cols_per_partition)
               partition_stop = cumsum(cols_per_partition)
               all(>=(0), cols_per_partition) || error("partitions must have nonnegative size")
               axes(M,2) == 1:last(partition_stop) || error("columns of M must be 1:sum(cols_per_partition)")
               partitions = map(eachindex(partition_stop)) do i
                       colrange = get(partition_stop,i-1,0)+1:partition_stop[i]
                       return view(M, :, colrange)
               end
               return partitions
       end
colpartitions (generic function with 2 methods)

julia> colpartitions(rand(0:9,2,6), 1:3)
3-element Vector{SubArray{Int64, 2, Matrix{Int64}, Tuple{Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}}, true}}:
 [9; 8;;]
 [9 4; 4 3]
 [4 0 5; 3 8 4]

With some extra effort, one could make this return a Tuple (rather than Vector) when provided a Tuple of cols_per_partition, but I didn’t go that far here.
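As a rough sketch of what that Tuple variant might look like (not the version from this post): the function name `colpartitions_tuple` is hypothetical, it assumes a nonempty Tuple of integers, and it relies on `cumsum` and `map` returning Tuples when given Tuples, which needs a reasonably recent Julia version.

```julia
function colpartitions_tuple(M::AbstractMatrix,
                             cols_per_partition::Tuple{Integer, Vararg{Integer}})
    all(>=(0), cols_per_partition) || error("partitions must have nonnegative size")
    partition_stop = cumsum(cols_per_partition)   # Tuple in, Tuple out
    axes(M, 2) == 1:last(partition_stop) ||
        error("columns of M must be 1:sum(cols_per_partition)")
    # The first block starts at column 1; every other block starts right after
    # the previous block's last column.
    partition_start = (1, (Base.front(partition_stop) .+ 1)...)
    # map over Tuples returns a Tuple, so the result is a Tuple of views.
    return map((a, b) -> view(M, :, a:b), partition_start, partition_stop)
end

M = reshape(collect(1:12), 2, 6)
parts = colpartitions_tuple(M, (1, 2, 3))
```

Here `parts` is a 3-element Tuple of views with sizes (2,1), (2,2), and (2,3).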

sylvaticus · September 21, 2023, 5:25am

It is not designed for efficiency and works on relative shares rather than absolute values, but BetaML.partition can partition a collection of N-arrays on any of the N dimensions where any of the Nc dimensions can be of a different size…
