Yeah, I think so. As the documentation suggests, both u0 and p need to be on the GPU if we want to use FastChain. I would then expect the output of solving the ODE problem to be a CuArray as well (I used gpu(solve(…))). The relevant code is shown below.

This is actually the same as what you posted in the tutorial; I only modified it slightly to make a GPU version. The error only occurs on the line with vcat().

```
using DiffEqFlux, OrdinaryDiffEq, Flux, CUDA

# `hammerstein_system`, `ex`, `tspan`, and `tsteps` come from the tutorial setup (not shown here).
ode_data = gpu(Float32.(hammerstein_system(ex)))

nn_dudt = FastChain(
    FastDense(2, 8, tanh),
    FastDense(8, 1))

# Both the initial state and the parameters are moved to the GPU.
u0 = Float32[0.0] |> gpu
p = initial_params(nn_dudt) |> gpu

#dudt2_(u, p, t) = dudt2(u,p)
function dudt(u, p, t)
    #in_vect = vcat(u[1])
    #nn_model(gpu(in_vect), p)
    #nn_dudt(vcat(u,CUDA.@allowscalar(ex[1])), p)
    # The error occurs on the next line, at the vcat() call.
    i = vcat(u[1], CUDA.@allowscalar(ex[Int(round(10*0.1))])) |> gpu
    nn_dudt(i, p)
end

prob_gpu = ODEProblem(dudt, u0, tspan, nothing)

# Runs on a GPU
function predict_neuralode(p)
    _prob_gpu = remake(prob_gpu, p = p)
    gpu(solve(_prob_gpu, Tsit5(), saveat = tsteps, abstol = 1e-8, reltol = 1e-6))
end

function loss_neuralode(p)
    pred = predict_neuralode(p)
    N = length(pred)
    l = sum(abs2, ode_data[1:N] .- pred) / N
    return l, pred
end

res0 = DiffEqFlux.sciml_train(loss_neuralode, p, ADAM(0.01), maxiters = 10)
```
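
My unverified guess is that the scalar indexing `u[1]` is part of the problem: it pulls a single number out of the array, so `vcat` is no longer concatenating GPU arrays. Below is a minimal CPU-only sketch of the difference between scalar and range indexing. The values of `u` and `ex` are made-up stand-ins, and plain Arrays are used in place of CuArrays so it runs without a GPU. I am assuming that a range slice such as `u[1:1]` would stay a CuArray in the GPU version.

```julia
# CPU-only illustration: plain Arrays stand in for CuArrays (an assumption
# made so the snippet runs without a GPU).
u = Float32[0.5]          # hypothetical stand-in for the ODE state
ex = Float32[1.0, 2.0]    # hypothetical stand-in for the input signal

# Scalar indexing returns a bare number, not an array.
scalar_input = vcat(u[1], ex[1])

# Range indexing returns a one-element array of the same container type.
slice_input = vcat(u[1:1], ex[1:1])

@assert u[1] isa Float32
@assert u[1:1] isa Vector{Float32}
@assert scalar_input == slice_input == Float32[0.5, 1.0]
```

If that guess is right, writing the input as `vcat(u[1:1], ex[1:1])` might avoid both the scalar indexing and the extra `|> gpu`, but I have not confirmed this on the GPU.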