I want to set the element value of a 3x3 matrix to newly calculated values in a small function that gets called a lot of times. I’ve tried broadcasted assignment to a temporary 3x3 matrix, but that seems to introduce a few allocations that the compiler won’t optimize away. So the question is that if there is any solutions to this problem other than manually setting the elment values one by one.
BTW I’ve tried these methods. The element-by-element assignment is too verbose.
A = zeros(3,3)
@time A .= [
1 2 3;
4 5 6;
7 8 9;
]; # 3 allocations
@time A[:] .= (1,2,3,4,5,6,7,8,9); # 4 allocations
@time A[:] .= [1,2,3,4,5,6,7,8,9]; # 7 allocations
@time begin
A[1] = 1;
A[2] = 2;
A[3] = 3;
A[4] = 4;
A[5] = 5;
A[6] = 6;
A[7] = 7;
A[8] = 8;
A[9] = 9;
end # 0 allocations
vec(A) .= (values...)
should be fast and minimally-allocating, but you’ll probably see a much bigger performance gain by switching to static arrays for small arrays modified in inner loops.
Some of those allocations are you allocating an input array to write into A
, some of those came from running @time
in the global scope, and some came from A[:] .= B
making a view that did an allocating reshape
of the matrix A
to a vector (though the allocation is small because the underlying buffer of values is reused). There was a thread yesterday pointing out this allocating reshape
happening for view
s of matrices. vec(A)
does reshape(A, length(A))
so that does the small allocations too.
Thanks for all your helpfull advice! These are my new test results. Using immutable static array to reinitialize matrix A resulted in 0 allocations, which only requires an additional macro. I guess the problem is solved. Thanks again!
@time vec(A) .= (1,2,3,4,5,6,7,8,9); # 2 allocs
@time @view(A[:]) .= (1,2,3,4,5,6,7,8,9); # 3 allocs
@time @views A[:] .= (1,2,3,4,5,6,7,8,9); # 4 allocs
@time vec(A) .= [1,2,3,4,5,6,7,8,9]; # 4 allocs
@time vec(A) .= @SVector [1,2,3,4,5,6,7,8,9]; # 2 allocs
@time A .= @MMatrix [
1 2 3;
4 5 6;
7 8 9;
]; # 2 allocs
@time A .= @SMatrix [
1 2 3;
4 5 6;
7 8 9;
]; # 0 allocs
The key to 0 allocations there was that the input array is now a Matrix
, so you did not have to make a vector view A[:]
, which would’ve triggered an allocating vector reshape.
I think stillyslalom was suggesting that A
be an MMatrix
too. You’d still need to preallocate, but the small fixed size can be useful.
Or, depending on the nature of the operations, a SMatrix
! Even for cases where you’re replacing all the values in a small array in each loop, SMatrix
can be marginally faster than MMatrix
:
julia> using BenchmarkTools
julia> function f(A, n = 1000)
s = zero(eltype(A))
for i = 1:n
A *= rand()
s += sum(A)
end
s
end
f (generic function with 2 methods)
julia> function g(A, n = 1000)
s = zero(eltype(A))
for i = 1:n
A .*= rand()
s += sum(A)
end
s
end
g (generic function with 2 methods)
julia> @btime f(@SMatrix rand(3, 3))
3.788 μs (0 allocations: 0 bytes)
0.9307550454767469
julia> @btime g(@MMatrix rand(3, 3))
4.543 μs (1 allocation: 80 bytes)
2.402649106652987
Thanks for the helpful information. My function runs a lot faster now by simply switching to static arrays.
StaticArrays seems like a fine solution, but in general I’d say just use a loop.
This is a bit low-level but also allocation free:
copyto!(A, 1, (1, 2, 3, 4, 5, 6, 7, 8, 9), 1, 9)
Actually, I misunderstood, and thought you were changing A
to being an SMatrix. If A
is a regular Matrix
, and performance is the goal, then definitely use a loop, not @SMatrix
or copyto!
or anything like that.
function foo!(A)
for i in 1:9
A[i] = i;
end
return A
end
bar!(A) = copyto!(A, 1, (1, 2, 3, 4, 5, 6, 7, 8, 9), 1, 9)
function baz!(A)
A .= @SMatrix [1 2 3; 4 5 6; 7 8 9]
return A
end
Benchmarks:
julia> @btime foo!(A) setup=(A=zeros(3,3));
6.400 ns (0 allocations: 0 bytes)
julia> @btime bar!(A) setup=(A=zeros(3,3));
26.205 ns (0 allocations: 0 bytes)
julia> @btime baz!(A) setup=(A=zeros(3,3));
10.010 ns (0 allocations: 0 bytes)
If you think that @inbounds
is safe, you can do
function foo_inbounds!(A)
for i in 1:9
@inbounds A[i] = i;
end
return A
end
and get:
julia> @btime foo_inbounds!(A) setup=(A=zeros(3,3));
2.800 ns (0 allocations: 0 bytes)
Most of the time, when you need optimal performance, use a loop.
Thanks for the infomation. It really helps me understand more of julia. But in my use case, I don’t think the new matrix value can be calculated in a for loop. I forgot to mention the details in my question. So my goal is to calculate 3x3 rotational matrices for each input angle, the code is something like this.
Rx = @SMatrix [
1 0 0;
0 cos(x) -sin(x);
0 sin(x) cos(x);
]
Are you sure rotation matrices are actually what you want to use here? Quaternians are usually faster.
Well, the original code written in matlab that I’m referencing uses rotation matrices. I’m simply trying to reproduce the code in julia. So using rotational matrices saves a lot of trouble for me.
That is important information. It was not clear to me how the values were actually passed to the function.
But in that case I suggest that you don’t use a SMatrix to update the input matrix A
, but instead directly use the SMatrix itself:
function Rot(x)
return @SMatrix [
1 0 0;
0 cos(x) -sin(x);
0 sin(x) cos(x)]
end
And then you write A = Rot(value)
. Also make all your other vectors and matrices into SArrays.
If you’re using Rot
in a hot loop, then you can make it faster:
julia> using StaticArrays
julia> function Rot(x)
return @SMatrix [
1 0 0
0 cos(x) -sin(x)
0 sin(x) cos(x)]
end
Rot (generic function with 1 method)
julia> function Rot2(x)
s, c = sincos(x)
return @SMatrix [
1 0 0
0 c -s
0 s c ]
end
Rot2 (generic function with 1 method)
julia> using BenchmarkTools
julia> x = 1.3
1.3
julia> @btime Rot($x);
18.857 ns (0 allocations: 0 bytes)
julia> @btime Rot2($x);
8.100 ns (0 allocations: 0 bytes)
A nit: It is conventional in Julia for function names to be all lower case. Capitalized (actually, CamelCase) names are used to denote types and modules. See the relevant portion of the style guide.
I know, and normally follow it. But whenever I write a function that ‘constructs’ a rotation matrix, I prefer uppercase, probably because it is conceptually a constructor of an object, even though it’s not a type constructor, per se.
My comment was for the benefit of the OP, who I assume is not familiar with standard Julia conventions. I respect your expertise, which greatly exceeds my own!
Ah, I see. Also, you are too kind😉