using CUDAdrv, CUDAnative

function kernel_a(a)
    # global 1-based index of this thread
    i = (blockIdx().x-1) * blockDim().x + threadIdx().x
    a[i] = tanh(a[i])
    return nothing
end

dev = CuDevice(0)
ctx = CuContext(dev)

len = 100
a = rand(Float32, len)
d_a = CuArray(a)  # copy the host array to the device

function gpu()
    # launch 1 block of 100 threads
    @cuda (1,100) kernel_a(d_a)
end

gpu()
The code above fails with “error in running finalizer: CUDAdrv.CuError(code=700, meta=nothing)”, along with various other errors, whenever I use a math function such as tanh() inside the kernel.
Is this the right way to call math functions in a kernel?