Thanks for the link. So, just for anyone else who might be interested. It seems that the following is the most efficient if you want the full matrix to be updated at the end.
function hermitian!(A)
@inbounds for j in 1:N
A[j, j] = 2 * real(A[j, j])
for i in 1:(j-1)
A[i, j] = A[i, j] + conj(A[j, i])
end
end
@inbounds for j in 1:N
for i in j+1:N
A[i, j] = conj(A[j, i])
end
end
end
If you’re only interested in the upper diagonal part of the matrix, you can skip the second for loop, and gain a factor of 2 on efficiency.
function hermitian!(A::Matrix{Complex{T}}) where T
( N = size(A,1) ) == size(A, 2) || error("Matrix must be square")
@inbounds for j in 1:N
A[j, j] = 2 * real(A[j, j])
for i in 1:(j-1)
r1, i1 = reim(A[i,j])
r2, i2 = reim(A[j,i])
rc, ic = r1+r2, i1-i2
A[i, j], A[j, i] = Complex{T}(rc,ic), Complex{T}(rc,-ic)
end
end
end
Once A[j,i] is touched in the first loop, it is better to write to it while it is hot in the cache.
This simpler version would be as fast as Dan’s solution:
function hermitian!(A)
(N = size(A,1)) == size(A,2) || error("Matrix must be square")
@inbounds for i = 1:N, j = i:N
A[i, j] = A[i, j] + A[j, i]'
A[j, i] = A[i, j]'
end
end
Note that if A is an OffsetArray, Julia may terminate with this error:
Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x7ffba590f6c1 -- RtlAllocateHeap at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
in expression starting at none:0
RtlAllocateHeap at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
RtlAllocateHeap at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
malloc at C:\WINDOWS\System32\msvcrt.dll (unknown line)
aligned_offset_malloc at C:\WINDOWS\System32\msvcrt.dll (unknown line)
@inbounds together with 1-based indexing is not advisable. Use axes. Many have argued lately that it’s fine to use 1:size(), but then remove @inbounds.
Or, simply, the way it is already, but without the @inbounds annotation.
@inbounds is for when the indices are provably inbounds. That could be achieved by using eachindex, axes, etc. Or by only accepting Array as input type, or by calling Base.require_one_based_indexing if you strongly prefer 1:N indexing.
For those of us who like 1-based indexing () but want performance and genericism without ripping up our code and making it unrecognizeable, shall this be the preferred idiom?
hermitian!(A::AbstractArray) = let A=reshape(A, size(A))
(N = size(A,1)) == size(A,2) || error("Matrix must be square")
@inbounds for i = 1:N, j = i:N
A[i, j] = A[i, j] + A[j, i]'
A[j, i] = A[i, j]'
end
end