Hello everyone, I have an issue and I don’t even know the exact cause.
I am using ArrayFire to speed up some matrix multiplications. However, I cannot retrieve the data unless I perform some useless operation on it. Here is a basic example of working code:
using ArrayFire

screen = rand(100,100,3)
r = rand(50,3)
β = rand(50)
k₀ = 1

function getE_scat_GPU(screen, r, β, k₀)
    nAtoms = size(r,1)
    nΘ = size(screen,1)
    nΦ = size(screen,2)
    E_scat_gpu = AFArray(zeros(Complex{Float32},nΘ, nΦ ))
    distance = similar(E_scat_gpu)
    xₛ = AFArray(zeros(Float32,nΘ, nΦ ))
    yₛ = similar(xₛ)
    zₛ = similar(xₛ)
    Rₛ = similar(xₛ)
    temp_exp = similar(E_scat_gpu)
    all_xₛ = AFArray(screen[:,:,1])
    all_yₛ = AFArray(screen[:,:,2])
    all_zₛ = AFArray(screen[:,:,3])
    for j=1:nAtoms
        xₛ = all_xₛ - Float32(r[j,1])
        yₛ = all_yₛ - Float32(r[j,2])
        zₛ = all_zₛ - Float32(r[j,3])
        Rₛ = sqrt( xₛ^2 + yₛ^2 + zₛ^2 )
        distance = im*Float32(k₀)*Rₛ
        temp_exp = exp(distance)
        E_scat_gpu += β[j]*temp_exp/distance
        dummy = sum(E_scat_gpu) # commenting out this line makes the result all NaN
    end
    E_scat = Array(E_scat_gpu)
    return E_scat
end

result_GPU = getE_scat_GPU(screen, r, β, k₀)
Don’t worry about the whole code; just focus on the line with the variable called dummy at the end of the loop. It is an operation that is unnecessary for the logic of my program.
However, I need this line to be able to retrieve the data from inside the function. The result does not even have to be assigned: if I use e.g. println(sum(E_scat_gpu)) instead, it works as well.
If I remove this line, I get a matrix full of NaN.
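For comparison, the same sum can be sketched on the CPU with plain Julia arrays, which should help tell whether the NaN values come from the formula itself or from the GPU path. This is only a rough sketch under a few assumptions: the name getE_scat_CPU is made up here and is not part of the code above, β[j] is converted to Float32 explicitly, and the elementwise operations are written with broadcasting dots.

```julia
# CPU-only reference for the same sum, using plain Julia arrays (no ArrayFire).
# getE_scat_CPU is a hypothetical name introduced for this comparison only.
function getE_scat_CPU(screen, r, β, k₀)
    nAtoms = size(r, 1)
    E_scat = zeros(Complex{Float32}, size(screen, 1), size(screen, 2))
    for j in 1:nAtoms
        # Offsets from atom j to every screen point, in single precision
        xₛ = Float32.(screen[:, :, 1]) .- Float32(r[j, 1])
        yₛ = Float32.(screen[:, :, 2]) .- Float32(r[j, 2])
        zₛ = Float32.(screen[:, :, 3]) .- Float32(r[j, 3])
        Rₛ = sqrt.(xₛ .^ 2 .+ yₛ .^ 2 .+ zₛ .^ 2)
        distance = im * Float32(k₀) .* Rₛ
        # Same term as in the GPU loop: β[j] * exp(i k₀ R) / (i k₀ R)
        E_scat .+= Float32(β[j]) .* exp.(distance) ./ distance
    end
    return E_scat
end

screen = rand(100, 100, 3)
r = rand(50, 3)
β = rand(50)
k₀ = 1

result_CPU = getE_scat_CPU(screen, r, β, k₀)
# true here would mean the formula itself produces NaN entries
println(any(isnan, result_CPU))
```

If this version contains no NaN for the same inputs, the formula is presumably fine and the problem lies in how the GPU result is evaluated or copied back.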
Any clues?