I have a totally noob question here.
I know that there are GPUArrays, for which broadcasting and various linear algebra operations are overloaded so that they get executed on the GPU.
Suppose that I define a function which works with GPUArrays and maybe with some constants. When this function is compiled, does it get compiled for the GPU as a whole, so that it can be executed on the GPU without any data exchange with the CPU (provided that I transferred the data to the GPU beforehand)?
If I plug such a function into DifferentialEquations.jl as the right-hand side of an ODE and integrate it with, for example, a 4th-order Runge-Kutta method, will the whole Runge-Kutta scheme together with the right-hand side be compiled for the GPU, so that it can be executed as a whole without data exchange with the CPU?
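
To make the question concrete, here is a minimal sketch of the kind of setup I have in mind. It assumes CUDA.jl (`cu`) as the GPU array backend and `RK4()` from OrdinaryDiffEq.jl as the integrator; the names `rhs!`, `to_device` and the constant `p` are made up for illustration only.

```julia
using OrdinaryDiffEq   # provides ODEProblem, solve, RK4
using CUDA             # assumption: CUDA.jl is the GPU array backend

# Fall back to a plain CPU array when no GPU is available, so the sketch still runs.
to_device = CUDA.functional() ? cu : identity

# Hypothetical right-hand side: du/dt = p * u, written as a single fused broadcast.
function rhs!(du, u, p, t)
    du .= p .* u
    return nothing
end

u0 = to_device(rand(Float32, 1024))   # state is moved to the GPU once, up front
p = -0.5f0                            # a scalar constant used inside the right-hand side
tspan = (0.0f0, 1.0f0)

prob = ODEProblem(rhs!, u0, tspan, p)
sol = solve(prob, RK4())              # 4th-order Runge-Kutta
```

In other words: does the call to `solve` here run entirely on the GPU, or only the broadcast inside `rhs!`, with the Runge-Kutta stepping logic still driven from the CPU?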