Efficient and in-place computation of A + adjoint(A)

Ronan_Gautier · January 21, 2023, 12:24am

If A is any complex-valued matrix, is there a more efficient way of computing A + adjoint(A) in-place than the following ?

function hermitian(A, A_tmp)
  adjoint!(A_tmp, A)
  A .+= A_tmp
end

Benchmarking with

using LinearAlgebra
using BenchmarkTools

const N = 200
A = rand(ComplexF64, N, N)
A_tmp = zeros(ComplexF64, N, N)
@btime hermitian(A, A_tmp)

yields 70.243 μs (0 allocations: 0 bytes), which seems quite slow for such a small matrix. Is there any other more efficient implementation ?

Oscar_Smith · January 21, 2023, 12:36am

See https://github.com/JuliaLang/julia/pull/31836

Ronan_Gautier · January 21, 2023, 1:17am

Thanks for the link. So, just for anyone else who might be interested. It seems that the following is the most efficient if you want the full matrix to be updated at the end.

function hermitian!(A)
    @inbounds for j in 1:N
        A[j, j] = 2 * real(A[j, j])
        for i in 1:(j-1)
            A[i, j] = A[i, j] + conj(A[j, i])
        end
    end
    @inbounds for j in 1:N
        for i in j+1:N
            A[i, j] = conj(A[j, i])
        end
    end
end

If you’re only interested in the upper diagonal part of the matrix, you can skip the second for loop, and gain a factor of 2 on efficiency.

Dan · January 21, 2023, 1:43am

A little bit faster (on my machine) is:

function hermitian!(A::Matrix{Complex{T}}) where T
    ( N = size(A,1) ) == size(A, 2) || error("Matrix must be square")
    @inbounds for j in 1:N
        A[j, j] = 2 * real(A[j, j])
        for i in 1:(j-1)
            r1, i1 = reim(A[i,j])
            r2, i2 = reim(A[j,i])
            rc, ic = r1+r2, i1-i2
            A[i, j], A[j, i] = Complex{T}(rc,ic), Complex{T}(rc,-ic)
        end
    end
end

Once A[j,i] is touched in the first loop, it is better to write to it while it is hot in the cache.

Seif_Shebl · January 22, 2023, 6:19pm

This simpler version would be as fast as Dan’s solution:

function hermitian!(A)
    (N = size(A,1)) == size(A,2) || error("Matrix must be square")
    @inbounds for i = 1:N, j = i:N
        A[i, j] = A[i, j] + A[j, i]'
        A[j, i] = A[i, j]'
    end
end

uniment · January 23, 2023, 12:43am

Note that if A is an OffsetArray, Julia may terminate with this error:

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x7ffba590f6c1 -- RtlAllocateHeap at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
in expression starting at none:0
RtlAllocateHeap at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
RtlAllocateHeap at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
malloc at C:\WINDOWS\System32\msvcrt.dll (unknown line)
aligned_offset_malloc at C:\WINDOWS\System32\msvcrt.dll (unknown line)

DNF · January 23, 2023, 7:57am

@inbounds together with 1-based indexing is not advisable. Use axes. Many have argued lately that it’s fine to use 1:size(), but then remove @inbounds.

photor · January 23, 2023, 8:55am

for i in axes(A,1)

but how to index j?

DNF · January 23, 2023, 9:34am

for i in axes(A, 1), j in i:lastindex(A, 2)

Or, simply, the way it is already, but without the @inbounds annotation.

@inbounds is for when the indices are provably inbounds. That could be achieved by using eachindex, axes, etc. Or by only accepting Array as input type, or by calling Base.require_one_based_indexing if you strongly prefer 1:N indexing.

uniment · January 24, 2023, 2:45am

Somebody came up with a really clever solution to this problem, it just needs to be fleshed out and implemented

jishnub · January 24, 2023, 5:56am

Without restarting the whole offset array discussion again, reshape(A, size(A)) should return a 1-based view for all arrays

julia> A = zeros(3:4, 4:5)
2×2 OffsetArray(::Matrix{Float64}, 3:4, 4:5) with eltype Float64 with indices 3:4×4:5:
 0.0  0.0
 0.0  0.0

julia> reshape(A, size(A))
2×2 Matrix{Float64}:
 0.0  0.0
 0.0  0.0

There might be missing cases that need to be fixed, but this is what the function is expected to return.

uniment · January 24, 2023, 7:55am

For those of us who like 1-based indexing () but want performance and genericism without ripping up our code and making it unrecognizeable, shall this be the preferred idiom?

hermitian!(A::AbstractArray) = let A=reshape(A, size(A))
    (N = size(A,1)) == size(A,2) || error("Matrix must be square")
    @inbounds for i = 1:N, j = i:N
        A[i, j] = A[i, j] + A[j, i]'
        A[j, i] = A[i, j]'
    end
end

DNF · January 24, 2023, 9:51am

Is this the notation for creating an OffsetArray of zeros? Isn’t that type piracy?

uniment · January 24, 2023, 10:04am

I argue that OffsetArrays should be added to Base, in which case it wouldn’t be piracy anymore

jishnub · January 24, 2023, 10:14am

Yes, it’s type-piracy, and is expected to be removed in the next breaking release of OffsetArrays.

Topic		Replies	Views
Is it possible for complex matrix multiplication and adjoint speed to catch up with matlab? New to Julia question	38	1338	October 11, 2023
Faster way for matrix multiplication including a Hermitian matrix with special properties New to Julia quantum , linearalgebra , optimization	6	1114	August 26, 2023
Possible slow fallback of adjoint matrix multiplication for different element types Performance linearalgebra	0	276	May 10, 2021
Fastest way to compute adjoint(x)Ax in CUDA? GPU question , cuda	19	156	November 2, 2024
Fast computation of Grammian of matrix Performance question	11	1488	April 6, 2021

Efficient and in-place computation of A + adjoint(A)

Related topics