@synchronize is not working consistently in a kernel I am trying to write. The kernel is supposed to first divide the second and third elements of column 1 by the topmost element of column 1 (3), then synchronize, then copy the second and third elements of column 1 into the second and third elements of columns 2 and 3.
About half the time I run it, the correct answer comes out (a matrix whose second column is 1 2 3). The other half of the time, an incorrect answer comes out (a matrix whose second column is 1 6 9). However, column 3 is always copied correctly.
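For reference, the intended result can be sketched on the CPU with plain Julia arrays. This is only an illustration of the steps described above, not the GPU code, and the name `A_ref` is made up for the example.

```julia
# Plain-CPU sketch of the intended result (no GPU packages needed).
# `A_ref` is a hypothetical name used only for this illustration.
A_ref = [3.0 1.0 1.0; 6.0 4.0 2.0; 9.0 7.0 7.0]

# Step 1: divide the second and third entries of column 1 by the top entry (3.0).
A_ref[2:3, 1] ./= A_ref[1, 1]

# Step 2: copy those two entries into rows 2 and 3 of columns 2 and 3.
for j in 2:3
    A_ref[2:3, j] .= A_ref[2:3, 1]
end

A_ref  # columns 2 and 3 should both read 1.0, 2.0, 3.0 from top to bottom
```

Because the CPU version runs the two steps strictly one after the other, it always gives the "correct" matrix, which makes it a convenient baseline to compare the GPU output against.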
Does anyone know what is happening and how I can fix this? Thank you!
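using CUDA, KernelAbstractions

A = CuArray([3.0 1.0 1.0; 6.0 4.0 2.0; 9.0 7.0 7.0])
backend = get_backend(A)

@kernel function test_gpu!(A)
    I, J = @index(Global, NTuple)
    # Step 1: divide the second and third elements of column 1 by A[1, 1]
    if I <= 2 && J == 1
        A[I+1, 1] = A[I+1, 1] / A[1, 1]
    end
    @synchronize
    # Step 2: copy column 1 into rows 2 and 3 of columns 2 and 3
    if I > 1 && I <= 3 && J <= 3 && J > 1
        A[I, J] = A[I, 1]
    end
    @synchronize
end

test_gpu!(backend, 64)(A, ndrange = (3, 3))
A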
(Also, happy Thanksgiving to everyone celebrating!)