(Feel free to skip this part and look at the minimal example below)
After a long time, I had to continue again with some old code, that stillyslalom already posted here, and because is a different question, I created a new post.
So, in the end of the post we have the code to compute eigenvectors, given the eigenvalues:
using LinearAlgebra, GenericLinearAlgebra
import LinearAlgebra.eigen
function eigen(A::Array{T,2}) where T <: Union{BigFloat,Complex{BigFloat}}
ni, nj = size(A)
Λ = eigvals(A)
if norm(imag(Λ)) < eps(BigFloat)
Λ = real(Λ)
end
V = rand(eltype(Λ), nj, nj)
bkp = similar(V, nj)
for j = 1:nj
Δ = BigFloat(Inf)
bk = view(V,:,j)
f = qr(A  Λ[j]*I)
i = 0
while Δ > eps(BigFloat) && i < 10
bkp .= f\bk
ck = norm(bkp)
bkp ./= ck
Δ = norm(bkp  bk)
bk .= bkp
i += 1
end
end
return Λ, V
end
The eigenvector part seems to be a matrix multiplication operation. Then, for fun, I was debugging the code in hope to implement a GPU version of it.
My struggle is with the line bkp .= f\bk
.
A minimal example with Float64 type:
using LinearAlgebra
A = rand(4,4)
f = qr(A)
b = rand(4)
C = f\b
f
has type of LinearAlgebra.QRCompactWY{Float64,Array{Float64,2}}
, and stores matrices Q
and R
.
My questions is: can I convert the division operation into standard matrix multiplication between Q
,R
and b
?
(I don’t mind about high tunned algorithm, the GPU will do muliplication operations faster anyway)
Extra information for reference:

In the end. Looking at the internals of
qr.jl
file, the division operation becomes a Lapack callLAPACK.gemqrt!('L','T',A.factors,A.T,B)
. 
A simplest code that I would like to have in GPU is:
using ArrayFire
Agpu = AFArray(A)
fgpu = qr(Agpu)
bgpu = AFArray(b)
cgpu = fgpu\bgpu
If I run this code, I got a message telling me that adjoint
is not defined to AFArray
type. So, I defined it as:
ArrayFire.adjoint(A::AFArray) = A'
but I got a StackOverFlow message