Is there a simple/intuitive way to partition a matrix by arbitrary strides? Like i

BridgeBot · February 23, 2021, 4:57pm

Is there a simple/intuitive way to partition a matrix by arbitrary strides? Like if I had a 20 x 5 matrix and wanted the output to be a vector/iterable of 10x5, 3x5, 4x5, 3x5 ? Essentially, the “split” part of split-apply-combine, but for a matrix, and based on partition size.
Theoretically:

foo(rand(20,5), [10,3,4,3])

Note that the original poster on Slack cannot see your response here on Discourse. Consider transcribing the appropriate answer back to Slack, or pinging the poster here on Discourse so they can follow this thread.
(Original message ) (More Info)

stevengj · February 23, 2021, 5:06pm

@views function splititerator(M::AbstractMatrix, rowlens::AbstractVector{<:Integer})
    sum(rowlens) <= size(M,1) || throw(DimensionMismatch())
    let s = cumsum([first(axes(M,1)); rowlens])
        return (M[s[i]:s[i+1]-1,:] for i = 1:length(s)-1)
    end
end

pdimens · February 23, 2021, 5:13pm

Answer on Slack provided by @mcabbott

julia> function foo(mat, sizes)
       axes(mat,1) == 1:sum(sizes) || error("bad sizes!")
       z = 0
       map(sizes) do s
         r = z+1:z+s
         z = s
         mat[r,:]
       end
       end;
julia> foo(rand(20,5), [10,3,4,3]) .|> size
4-element Vector{Tuple{Int64, Int64}}:
 (10, 5)
 (3, 5)
 (4, 5)
 (3, 5)

pdimens · February 23, 2021, 5:31pm

Second, faster answer provided by @mcabbott on Slack:

foo(mat, sizes::Tuple{}) = ()

@views function foo(mat, sizes::Tuple) 
    (mat[1:first(sizes),:], foo(mat[first(sizes)+1:end,:], Base.tail(sizes))...);
end

julia> foo(rand(20,5), (10,3,4,3)) .|> size
((10, 5), (3, 5), (4, 5), (3, 5))

stevengj · February 23, 2021, 6:20pm

Small, fixed-size tuples are fast (along with code like that will unroll at compile-time). On the other hand, if you have a list of hundreds of sizes, whose length is determined at runtime, using tuples is likely to be slower.

In this case, because you are talking about an list or generator of submatrices, where the limiting factor in performance is likely to be the subsequent matrix operations rather than generating the list, I wouldn’t tend to recommend tuple tricks — go for code simplicity instead.

mbauman · February 23, 2021, 6:31pm

As I mentioned in the slack thread, I’d split the problem. First create a function that splits a vector (or likely any iterator) into parts the way you’d like. Then use that to create your more complicated row splitting. For example, we have a builtin iterator that gives you splits of a constant size:

(@views(A[I, :]) for I in Iterators.partition(axes(A, 1), 4))

It’s much more composable this way.

rafael.guerra · February 23, 2021, 7:38pm

@mbauman, trying to follow your illuminating advice. Could you kindly comment on Julia-novice code below. Thanks in advance.

function splititr(b)
    c = cumsum([1;b])
    return [ci:ci+bi-1 for (ci,bi) in zip(c,b)]
end

split_mbauman(M,I0) = (@views(M[I, :]) for I in I0)

M = rand(20,5)
b = [10,3,4,3]
it = splititr(b)
R = collect(split_mbauman(M,it))
# splits in R[1], R[2], ... R[4]

mbauman · February 23, 2021, 8:32pm

Yeah, you could really use any of the above implementations for creating the splits — the idea is just to detangle the splitting of a 1-dimensional iterable/vector from the indexing into the matrix. It makes it easier to test and easier to use in different situations.

pdimens · February 24, 2021, 5:50am

@mcabbott also suggested this, which is similar to the proposed solution by @rafael.guerra :

function steps(vals, steps)
    i = firstindex(vals)
    (view(vals, i:(i+=s)-1) for s in steps)
end

mat = rand(20,2)
batches = steps(axes(mat,1), [10,3,4,3])

[view(mat,r,:) for r in batches]

Is this functionality (in general) something that can be a PR for Base.Iterators ?

mbauman · February 24, 2021, 3:26pm

The part that could potentially go into Iterators would be the 1-dimensional version. It’d be fairly complicated because it’d have to work on arbitrary iterators, but I’d also want it to have the fancy precomputing stuff for splitting ranges like Iterators.partition does.

Topic		Replies	Views
Divide a matrix into 4 submatrices General Usage	6	485	July 8, 2022
Tools for partitioning matrix columns? General Usage question	3	413	September 21, 2023
Dividing matrix into sections Performance	12	3167	April 21, 2018
Split vector into N (potentially unequal length) subvectors? General Usage	9	3659	March 3, 2023
Partitioning a matrix in batches New to Julia	2	1596	May 22, 2020

Is there a simple/intuitive way to partition a matrix by arbitrary strides? Like i

Related topics