Passing views to function without allocation

I want to perform matrix vector multiplication of size (N^2, M) * (M), and then reshape the output vector of size (N^2) to a (N, N) matrix.

To do this without allocation, I tried using 1d view of the matrix.
Directly calling mul! with views indeed did not allocate, but when I called mul! inside a function, allocation occurs.

Why is this? Is there a way to keep using view inside the function without allocation?

I would also appreciate if there are better ways to achieve my goal of doing matrix vector multiplication and then reshaping the output vector to a matrix.

N = 50
M = 100
a1 = zeros(N*N)
a2 = zeros(N, N)
b = rand(N*N, M)
c = rand(M)

function test1!(a1, b, c)
    mul!(a1, b, c)
end
function test2!(a2, b, c)
    mul!((@view a2[:]), b, c)
end
@btime mul!($a1, $b, $c)
@btime mul!($(@view a2[:]), $b, $c)
@btime test1!($a1, $b, $c)
@btime test2!($a2, $b, $c)

Output:

53.083 μs (0 allocations: 0 bytes)
53.100 μs (0 allocations: 0 bytes)
53.023 μs (0 allocations: 0 bytes)
53.280 μs (2 allocations: 80 bytes)

EDIT: Following suggestions from the replies, I tried reshape(a2, :) and vec(a2), but both still allocate.

function test3!(a2, b, c)
    mul!(reshape(a2, :), b, c)
end
function test4!(a2, b, c)
    mul!(vec(a2), b, c)
end
@btime mul!(reshape($a2, :), $b, $c)
@btime mul!($vec(a2), $b, $c)
@btime test3!($a2, $b, $c)
@btime test4!($a2, $b, $c)

Output:

53.223 μs (2 allocations: 80 bytes)
53.178 μs (2 allocations: 80 bytes)
53.303 μs (2 allocations: 80 bytes)
53.144 μs (2 allocations: 80 bytes)
1 Like

This isn’t an answer to your question, but could you use reshape on a2 instead of a @view?

1 Like

Do you mean the following?

function test3!(a2, b, c)
    mul!(reshape(a2, :), b, c)
end
@btime mul!(reshape($a2, :), $b, $c)
@btime test3!($a2, $b, $c)

Both work, but still allocate.

53.223 μs (2 allocations: 80 bytes)
53.178 μs (2 allocations: 80 bytes)
1 Like

What happens if you use vec instead of view or reshape?

It still allocates.

function test4!(a2, b, c)
    mul!(vec(a2), b, c)
end
@btime mul!($vec(a2), $b, $c)
@btime test4!($a2, $b, $c)
53.303 μs (2 allocations: 80 bytes)
53.144 μs (2 allocations: 80 bytes)

Yeah, I guess you have to pay for the construction of the view wrapper in this case.

The reason you don’t get allocations here

is that you create the view first, before splicing it into the benchmark expression.

Try

@btime mul!(@view ($a2)[:], $b, $c)

instead. This is a fairer comparison.

1 Like

This isn’t right either. Here you are interpolating in the function vec, and not interpolating the a2 variable at all.

It should be

@btime mul!(vec($a2), $b, $c)

I see. Thanks.
Both @btime mul!(view($a2, :), $b, $c) and @btime mul!(vec($a2), $b, $c) allocates 53.279 μs (2 allocations: 80 bytes).

So, vec, view, and reshape all allocates some memory.

Then, my question is: Is there any way to pass 1d view of a matrix (or something similar) to mul! without allocating any memory?
(Maybe I am so used to Fortran…)

Is https://github.com/JuliaArrays/UnsafeArrays.jl relevant to this use case?
I was reluctant to use it because it says UnsafeArrays.jl would be unnecessary after v1.5, which is what I am using (v1.5.3).

I’m afraid I cannot answer that. It’s not clear to me exactly when allocations will be elided or not. Is this a significant issue? Does it impact performance?

May be UnsafeArrays is still an option? This is in Julia 1.5.0.

using UnsafeArrays
N = 50;
M = 100;
a1 = zeros(N*N);
a2 = zeros(N, N);
b = rand(N*N, M);
c = rand(M);

function test2!(a2, b, c)
    mul!(uview(a2,:), b, c)
end

@btime test2!($a2, $b, $c);
  46.500 μs (0 allocations: 0 bytes)
1 Like

Thank you all for the answers!

The allocation does not change the performance significantly, I just asked out of curiosity.

It seems uview of UnsafeArrays is the answer to my question. Thanks!

This really isn’t views; it’s the reshape (which implicitly happens when view’ing with linear indexing). This is because reshapes of Array are handled in C. Use a Julia-native reshape and it gives you what you want:

julia> @btime mul!(reshape(view($a2, :, :), :), $b, $c); # reshape of a view
  11.295 μs (0 allocations: 0 bytes)

julia> @btime mul!(Base.ReshapedArray($a2, (length($a2),), ()), $b, $c);
  11.114 μs (0 allocations: 0 bytes)

We should make the ReshapedArray constructor easier to use (and export it).

11 Likes

Rather than yet another way to make this reshape, any reason we can’t make vec(a2), view(a2, :) and reshape(a2, :) behave like this?

1 Like

It’s unfortunately breaking.

https://github.com/JuliaLang/julia/issues/24237

4 Likes

I see… There’s no reason however why vec(a2) , view(a2, :) and reshape(a2, :) have to allocate, right? A future Array type implemented in Julia could avoid the allocation while staying backward compatible (or am I missing something?).