I wrote a simple extension of the CuArrays.jl tutorial section that shows how to iterate a value function on the GPU.
Very nice. I wrote a similar tutorial for parallel VFI here, but it doesn’t do GPU.
It doesn’t look like solving the problem on the GPU is much faster than using threads on the CPU, but maybe it’s problem specific. Do you know if interpolation (splines) works on the GPU?
Everything works on the GPU, but you will have to implement most of it yourself: most packages won’t be available inside your kernel function.
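As a rough illustration of what "implementing it yourself" can look like for interpolation, here is a minimal sketch of linear (not spline) interpolation on a uniform grid. It uses only scalar arithmetic and array indexing, which is the kind of code that can usually be called from inside a kernel. The name `lininterp` and the uniform-grid assumption are made up for this example and do not come from any package, and the snippet is plain Julia that has only been reasoned about on the CPU.

```julia
# Hypothetical helper: linear interpolation on a uniform grid.
# Assumes xmin < xmax, length(v) >= 2, and v[i] is the value at the
# i-th of n equally spaced grid points between xmin and xmax.
function lininterp(v::AbstractVector{T}, xmin::T, xmax::T, x::T) where {T<:AbstractFloat}
    n = length(v)
    step = (xmax - xmin) / (n - 1)
    # Clamp so that queries outside the grid return the boundary values.
    xc = clamp(x, xmin, xmax)
    # Position of the query in grid units, starting at 0.
    pos = (xc - xmin) / step
    # Lower grid index (1-based), capped at n - 1 so that i + 1 stays valid.
    i = min(floor(Int, pos) + 1, n - 1)
    # Weight on the upper grid point.
    w = pos - (i - 1)
    return (1 - w) * v[i] + w * v[i + 1]
end

# CPU usage: values 0, 1, 4 on the grid 0, 1, 2.
v = [0.0, 1.0, 4.0]
y = lininterp(v, 0.0, 2.0, 1.5)
```

Whether something like this ends up fast inside a kernel will depend on the memory access pattern, so treat it as a starting point rather than a tuned implementation.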
In terms of performance, I learned that you need to pay close attention to the specifics of your hardware. I’m not an expert by any stretch of the imagination, so I’m sure one could do better here.