Column iterator

mschauer · September 14, 2018, 5:18pm

I’d like to define a plain iterator over the columns of a matrix in style of julienned arrays. If I take the state to be

struct Slice{T} <: AbstractVector{T}
    a::Matrix{T}
    idx::Int
end

what do I have to define of size, length, getindex, setindex!, axes
to obtain a performant broadcast as below

a = rand(2, 1000)
x = rand(2)

y = Slice(a, 1) # a[:, 1]
y .= x

I’d like to avoid views and leverage that columns are contiguous.

bramtayl · September 14, 2018, 6:39pm

Whats the problem with views? Are they allocating? Are you trying to call mutate in place functions?

mschauer · September 14, 2018, 7:05pm

Yeah, views are allocating and have overhead as they solve a considerably more difficult problem.

mschauer · September 14, 2018, 7:16pm

To be sure, I benchmarked and it appears that creating the view is not a problem, the costs are stemming from broadcasting:

yy = rand(2, 10000)
y = rand(2)
using BenchmarkTools
function f5(yy, x) 
    for i in 1:size(yy, 2)
        y = Slice1(yy, i)
        for k in eachindex(x)
            @inbounds y[k] = x[k]
        end
    end
end
function f6(yy, x) 
    for i in 1:size(yy, 2)
        y = view(yy, :, i)
        for k in eachindex(x)
            @inbounds y[k] = x[k]
        end
    end
end



@btime f5($yy, $y)
@btime f6($yy, $y)
  25.072 μs (0 allocations: 0 bytes)
  27.160 μs (0 allocations: 0 bytes)

with





struct Slice1{T} <: AbstractVector{T}
    S::Matrix{T}
    idx::Int
    sz::Base.OneTo{Int}
end
@inline function Slice1(A::TA, i::Int) where {TA<:AbstractArray{T,2}} where {T} 
    Slice1(A, i, Base.OneTo(size(A)[1]) )
end


@inline size(S::Slice1) = (length(S.sz),)
@inline axes(S::Slice1) = (S.sz,)


@inline Base.IndexStyle(::Type{<:Slice1}) = IndexLinear()
@inline function setindex!(S::Slice1, x, k)
    @boundscheck checkbounds(S.S, k, S.idx)
    @inbounds S.S[k, S.idx] = x
end

mschauer · September 14, 2018, 7:46pm

Compare with


function f7(yy, x) 
    for i in 1:size(yy, 2)
        y = view(yy, :, i)
        @inbounds y .= x
    end
end

@btime f7($yy, $y)

 169.822 μs (10000 allocations: 468.75 KiB)

raminammour · September 14, 2018, 9:38pm

I think broadcast is not strictly the problem. It acts like a function barrier and does not inline, therefore forcing the allocations of views. You can define the inner loop as a separate function and inline/noinline it to check.

Also, your benchmarking is probably skewed by copying two numbers a 1000 times. If these numbers are representative of the application, fine. If not, you will discover that, while the allocations don’t disappear, the performance is almost the same on a 200x1000 example.

Cheers!

mschauer · September 15, 2018, 5:41am

You are right, the allocation because of the function barrier is a problem. The benchmark is representative of the application (low-dim ODEs). I don’t understand why though, the view is just an immutable struct with two fields… what has to be allocated there?

bramtayl · September 15, 2018, 2:42pm

My philosophy on this is just wait for Base to catch up. I’m sure eventually they’ll be able to figure out a way to get rid of the views allocations.

raminammour · September 17, 2018, 3:07pm

I think the real difference is in one case the views are completely optimized out (the inlined case), in the other they are not. I had to unfortunately go back to v0.6 to see @code_warntype which marks y <optimized out>
( I find the output in 0.7 illegible).

Topic		Replies	Views
Can I avoid allocations when broadcasting over slices? Performance	3	173	August 4, 2023
Why is there memory allocation and time difference in @views and view? Performance memory-allocation	9	485	March 29, 2021
Understanding major order performance when broadcasting in column vs row operations Performance question , array , benchmark	9	1005	June 21, 2021
Allocation with Subset of Matrix and Views General Usage memory-allocation	2	350	June 20, 2022
Too many allocations when indexing with slices Performance indexing , memory-allocation	16	2749	August 17, 2018

Column iterator

Related topics