Why not? Calling round with the digits keyword inside a kernel works for me:
julia> a = CUDA.rand(2)
2-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
0.30272454
0.8437567
julia> function kernel(a,b)
           a[threadIdx().x] = round(b[threadIdx().x]; digits=2)
           return
       end
julia> b = similar(a)
2-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
0.0
0.0
julia> @cuda kernel(b,a)
CUDA.HostKernel for kernel(CuDeviceVector{Float32, 1}, CuDeviceVector{Float32, 1})
julia> b
2-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
0.3
0.0
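Only the first element of b is written here because @cuda launches a single thread by default; launching with @cuda threads=length(a) kernel(b,a) would round every element.
If this does not work for you, please include your source code and the full error message when opening topics like this.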