How to reset the elements of a 3x3 matrix without any memory allocation?

I want to set the element value of a 3x3 matrix to newly calculated values in a small function that gets called a lot of times. I’ve tried broadcasted assignment to a temporary 3x3 matrix, but that seems to introduce a few allocations that the compiler won’t optimize away. So the question is that if there is any solutions to this problem other than manually setting the elment values one by one.

BTW I’ve tried these methods. The element-by-element assignment is too verbose.

A = zeros(3,3)
@time A .= [
    1 2 3;
    4 5 6;
    7 8 9;
]; # 3 allocations
@time A[:] .= (1,2,3,4,5,6,7,8,9); # 4 allocations
@time A[:] .= [1,2,3,4,5,6,7,8,9]; # 7 allocations
@time begin
A[1] = 1;
A[2] = 2;
A[3] = 3;
A[4] = 4;
A[5] = 5;
A[6] = 6;
A[7] = 7;
A[8] = 8;
A[9] = 9;
end # 0 allocations

vec(A) .= (values...) should be fast and minimally-allocating, but you’ll probably see a much bigger performance gain by switching to static arrays for small arrays modified in inner loops.

2 Likes

Some of those allocations are you allocating an input array to write into A, some of those came from running @time in the global scope, and some came from A[:] .= B making a view that did an allocating reshape of the matrix A to a vector (though the allocation is small because the underlying buffer of values is reused). There was a thread yesterday pointing out this allocating reshape happening for views of matrices. vec(A) does reshape(A, length(A)) so that does the small allocations too.

1 Like

Thanks for all your helpfull advice! These are my new test results. Using immutable static array to reinitialize matrix A resulted in 0 allocations, which only requires an additional macro. I guess the problem is solved. Thanks again!

@time vec(A) .= (1,2,3,4,5,6,7,8,9); # 2 allocs
@time @view(A[:]) .= (1,2,3,4,5,6,7,8,9); # 3 allocs
@time @views A[:] .= (1,2,3,4,5,6,7,8,9); # 4 allocs
@time vec(A) .= [1,2,3,4,5,6,7,8,9]; # 4 allocs

@time vec(A) .= @SVector [1,2,3,4,5,6,7,8,9]; # 2 allocs
@time A .= @MMatrix [
    1 2 3;
    4 5 6;
    7 8 9;
]; # 2 allocs
@time A .= @SMatrix [
    1 2 3;
    4 5 6;
    7 8 9;
]; # 0 allocs
2 Likes

The key to 0 allocations there was that the input array is now a Matrix, so you did not have to make a vector view A[:], which would’ve triggered an allocating vector reshape.
I think stillyslalom was suggesting that A be an MMatrix too. You’d still need to preallocate, but the small fixed size can be useful.

1 Like

Or, depending on the nature of the operations, a SMatrix! Even for cases where you’re replacing all the values in a small array in each loop, SMatrix can be marginally faster than MMatrix:

julia> using BenchmarkTools

julia> function f(A, n = 1000)
           s = zero(eltype(A))
           for i = 1:n
               A *= rand()
               s += sum(A)
           end
           s
       end
f (generic function with 2 methods)

julia> function g(A, n = 1000)
           s = zero(eltype(A))
           for i = 1:n
               A .*= rand()
               s += sum(A)
           end
           s
       end
g (generic function with 2 methods)
julia> @btime f(@SMatrix rand(3, 3))
  3.788 μs (0 allocations: 0 bytes)
0.9307550454767469

julia> @btime g(@MMatrix rand(3, 3))
  4.543 μs (1 allocation: 80 bytes)
2.402649106652987
1 Like

Thanks for the helpful information. My function runs a lot faster now by simply switching to static arrays.

StaticArrays seems like a fine solution, but in general I’d say just use a loop.

This is a bit low-level but also allocation free:

copyto!(A, 1, (1, 2, 3, 4, 5, 6, 7, 8, 9), 1, 9)

Actually, I misunderstood, and thought you were changing A to being an SMatrix. If A is a regular Matrix, and performance is the goal, then definitely use a loop, not @SMatrix or copyto! or anything like that.

function foo!(A)
    for i in 1:9
        A[i] = i;
    end
    return A
end

bar!(A) = copyto!(A, 1, (1, 2, 3, 4, 5, 6, 7, 8, 9), 1, 9)

function baz!(A)
    A .= @SMatrix [1 2 3; 4 5 6; 7 8 9]
    return A
end

Benchmarks:

julia> @btime foo!(A) setup=(A=zeros(3,3));
  6.400 ns (0 allocations: 0 bytes)

julia> @btime bar!(A) setup=(A=zeros(3,3));
  26.205 ns (0 allocations: 0 bytes)

julia> @btime baz!(A) setup=(A=zeros(3,3));
  10.010 ns (0 allocations: 0 bytes)

If you think that @inbounds is safe, you can do

function foo_inbounds!(A)
    for i in 1:9
        @inbounds A[i] = i;
    end
    return A
end

and get:

julia> @btime foo_inbounds!(A) setup=(A=zeros(3,3));
  2.800 ns (0 allocations: 0 bytes)

Most of the time, when you need optimal performance, use a loop.

7 Likes

Thanks for the infomation. It really helps me understand more of julia. But in my use case, I don’t think the new matrix value can be calculated in a for loop. I forgot to mention the details in my question. So my goal is to calculate 3x3 rotational matrices for each input angle, the code is something like this.

Rx = @SMatrix [
        1          0         0;
        0  cos(x) -sin(x);
        0  sin(x)  cos(x);
]

Are you sure rotation matrices are actually what you want to use here? Quaternians are usually faster.

3 Likes

Well, the original code written in matlab that I’m referencing uses rotation matrices. I’m simply trying to reproduce the code in julia. So using rotational matrices saves a lot of trouble for me.

That is important information. It was not clear to me how the values were actually passed to the function.

But in that case I suggest that you don’t use a SMatrix to update the input matrix A, but instead directly use the SMatrix itself:

function Rot(x) 
    return @SMatrix [
        1          0         0;
        0  cos(x) -sin(x);
        0  sin(x)  cos(x)]
end

And then you write A = Rot(value). Also make all your other vectors and matrices into SArrays.

1 Like

If you’re using Rot in a hot loop, then you can make it faster:

julia> using StaticArrays

julia> function Rot(x)
           return @SMatrix [
                   1          0         0
                   0       cos(x)    -sin(x)
                   0       sin(x)     cos(x)]
       end
Rot (generic function with 1 method)

julia> function Rot2(x)
           s, c = sincos(x)
           return @SMatrix [
                   1   0    0
                   0   c   -s
                   0   s    c ]
       end
Rot2 (generic function with 1 method)

julia> using BenchmarkTools

julia> x = 1.3
1.3

julia> @btime Rot($x);
  18.857 ns (0 allocations: 0 bytes)

julia> @btime Rot2($x);
  8.100 ns (0 allocations: 0 bytes)

A nit: It is conventional in Julia for function names to be all lower case. Capitalized (actually, CamelCase) names are used to denote types and modules. See the relevant portion of the style guide.

2 Likes

I know, and normally follow it. But whenever I write a function that ‘constructs’ a rotation matrix, I prefer uppercase, probably because it is conceptually a constructor of an object, even though it’s not a type constructor, per se.

My comment was for the benefit of the OP, who I assume is not familiar with standard Julia conventions. I respect your expertise, which greatly exceeds my own!

1 Like

Ah, I see. Also, you are too kind😉

2 Likes