Saving values from GPU during Euler stepping in ODE

It is probably easier to port the CL version that I posted instead of @sdanisch's one, at the cost of performance.

Right now CuArrays is rough to install because it needs Julia to be built from source in order to do its codegen. That'll change with Julia's v0.7/1.0 release though. I still haven't gotten it to work on v0.6 at all, so YMMV.

In theory, you just need to prefix all math intrinsics with CUDAnative.
E.g.: log, sqrt, max.
Stuff like @linearidx, gpu_call, and gpu_rand! comes from GPUArrays and should work for both CuArrays and CLArrays!

So basically anything that doesn't come from GPUArrays and is not pure Julia.

Have a look at: https://github.com/JuliaGPU/CUDAnative.jl/blob/master/src/device/libdevice.jl for a more exhaustive list of what functions you need to replace!
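To make that concrete, here is a minimal sketch of what a single Euler-style update could look like with that setup. It assumes the GPUArrays API of that era (a kernel whose first argument is the launch state, @linearidx for a bounds-checked linear index, and gpu_call to launch it); the toy ODE du/dt = -sqrt(u), the array size, and the step size are just placeholders, so check the signatures against your installed versions.

```julia
using CuArrays, CUDAnative, GPUArrays

# One explicit Euler step for du/dt = -sqrt(u); the math intrinsic is
# prefixed with CUDAnative so it maps to the libdevice implementation.
function euler_step_kernel!(state, u, dt)
    i = @linearidx u                      # bounds-checked linear index from GPUArrays
    @inbounds u[i] = u[i] - dt * CUDAnative.sqrt(u[i])
    return
end

u = CuArray(rand(Float32, 1024))
gpu_call(euler_step_kernel!, u, (u, 0.01f0))   # configuration defaults to length(u)
```

For a CLArrays port the gpu_call/@linearidx part should presumably carry over unchanged; only the CUDAnative.sqrt prefix is CUDA-specific and would need its plain-Julia counterpart.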

Hi,

I want to come back to gpu_call here. It is called with the default configuration = length(A). Is that the best way to set the number of threads? How can this be optimised?
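For reference, a hedged sketch of the call being asked about; passing the configuration explicitly as a fourth argument is an assumption based on the signature gpu_call(kernel, A, args, configuration = length(A)) and should be checked against the installed GPUArrays version.

```julia
# Default launch: configuration = length(u), i.e. roughly one thread per element.
gpu_call(euler_step_kernel!, u, (u, 0.01f0))

# Hypothetical explicit configuration (same value as the default here); whether
# other forms (e.g. a blocks/threads tuple) are accepted depends on the version.
gpu_call(euler_step_kernel!, u, (u, 0.01f0), length(u))
```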

Thank you

Best