# Efficient and in-place computation of A + adjoint(A)

If `A` is any complex-valued matrix, is there a more efficient way of computing `A + adjoint(A)` in-place than the following ?

``````function hermitian(A, A_tmp)
A .+= A_tmp
end
``````

Benchmarking with

``````using LinearAlgebra
using BenchmarkTools

const N = 200
A = rand(ComplexF64, N, N)
A_tmp = zeros(ComplexF64, N, N)
@btime hermitian(A, A_tmp)
``````

yields `70.243 μs (0 allocations: 0 bytes)`, which seems quite slow for such a small matrix. Is there any other more efficient implementation ?

1 Like
1 Like

Thanks for the link. So, just for anyone else who might be interested. It seems that the following is the most efficient if you want the full matrix to be updated at the end.

``````function hermitian!(A)
@inbounds for j in 1:N
A[j, j] = 2 * real(A[j, j])
for i in 1:(j-1)
A[i, j] = A[i, j] + conj(A[j, i])
end
end
@inbounds for j in 1:N
for i in j+1:N
A[i, j] = conj(A[j, i])
end
end
end
``````

If you’re only interested in the upper diagonal part of the matrix, you can skip the second for loop, and gain a factor of 2 on efficiency.

A little bit faster (on my machine) is:

``````function hermitian!(A::Matrix{Complex{T}}) where T
( N = size(A,1) ) == size(A, 2) || error("Matrix must be square")
@inbounds for j in 1:N
A[j, j] = 2 * real(A[j, j])
for i in 1:(j-1)
r1, i1 = reim(A[i,j])
r2, i2 = reim(A[j,i])
rc, ic = r1+r2, i1-i2
A[i, j], A[j, i] = Complex{T}(rc,ic), Complex{T}(rc,-ic)
end
end
end
``````

Once `A[j,i]` is touched in the first loop, it is better to write to it while it is hot in the cache.

6 Likes

This simpler version would be as fast as Dan’s solution:

``````function hermitian!(A)
(N = size(A,1)) == size(A,2) || error("Matrix must be square")
@inbounds for i = 1:N, j = i:N
A[i, j] = A[i, j] + A[j, i]'
A[j, i] = A[i, j]'
end
end
``````
6 Likes

Note that if `A` is an `OffsetArray`, Julia may terminate with this error:

``````Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x7ffba590f6c1 -- RtlAllocateHeap at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
in expression starting at none:0
RtlAllocateHeap at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
RtlAllocateHeap at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
malloc at C:\WINDOWS\System32\msvcrt.dll (unknown line)
aligned_offset_malloc at C:\WINDOWS\System32\msvcrt.dll (unknown line)
``````

`@inbounds` together with 1-based indexing is not advisable. Use `axes`. Many have argued lately that it’s fine to use `1:size()`, but then remove `@inbounds`.

``````for i in axes(A,1)
``````

but how to index j?

``````for i in axes(A, 1), j in i:lastindex(A, 2)
``````

Or, simply, the way it is already, but without the `@inbounds` annotation.

`@inbounds` is for when the indices are provably inbounds. That could be achieved by using `eachindex`, `axes`, etc. Or by only accepting `Array` as input type, or by calling `Base.require_one_based_indexing` if you strongly prefer `1:N` indexing.

1 Like

Somebody came up with a really clever solution to this problem, it just needs to be fleshed out and implemented

Without restarting the whole offset array discussion again, `reshape(A, size(A))` should return a 1-based view for all arrays

``````julia> A = zeros(3:4, 4:5)
2×2 OffsetArray(::Matrix{Float64}, 3:4, 4:5) with eltype Float64 with indices 3:4×4:5:
0.0  0.0
0.0  0.0

julia> reshape(A, size(A))
2×2 Matrix{Float64}:
0.0  0.0
0.0  0.0
``````

There might be missing cases that need to be fixed, but this is what the function is expected to return.

For those of us who like 1-based indexing ( ) but want performance and genericism without ripping up our code and making it unrecognizeable, shall this be the preferred idiom?

``````hermitian!(A::AbstractArray) = let A=reshape(A, size(A))
(N = size(A,1)) == size(A,2) || error("Matrix must be square")
@inbounds for i = 1:N, j = i:N
A[i, j] = A[i, j] + A[j, i]'
A[j, i] = A[i, j]'
end
end
``````

Is this the notation for creating an OffsetArray of zeros? Isn’t that type piracy?

1 Like

I argue that `OffsetArray`s should be added to `Base`, in which case it wouldn’t be piracy anymore Yes, it’s type-piracy, and is expected to be removed in the next breaking release of OffsetArrays.

2 Likes