When directly solving a known ODE, the choice of integrator is of course significant.
However, when training a neural network that’s part of an ODE, it isn’t clear to me whether the integrator matters, or should matter. Does it make a significant difference? Would the trained NN absorb behaviours of the integrator too, or would it turn out to be largely independent of it? For example, if we train with Tsit5 or, say, KahanLi8, do we already know whether the resulting NN can be used with the other integrator?
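To make the question concrete, here is a toy experiment one could run; treat it as a minimal sketch under loud assumptions, not a statement about Tsit5 or KahanLi8 themselves. A one-parameter right-hand side `-a * y` stands in for the NN, fixed-step explicit Euler and classical RK4 stand in for the two integrators, and the "training" is a plain golden-section search; every function name below is made up for illustration. The data come from the exact solution of `dy/dt = -y`, so the true parameter is `a = 1`.

```python
import numpy as np

def model(y, a):
    # Stand-in for the NN: a one-parameter "learned" right-hand side.
    return -a * y

def euler_step(y, h, a):
    # Explicit Euler, first order.
    return y + h * model(y, a)

def rk4_step(y, h, a):
    # Classical fourth-order Runge-Kutta.
    k1 = model(y, a)
    k2 = model(y + 0.5 * h * k1, a)
    k3 = model(y + 0.5 * h * k2, a)
    k4 = model(y + h * k3, a)
    return y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def rollout(step, a, h, n_steps, y0=1.0):
    # Integrate from y0 with a fixed step and return the whole trajectory.
    ys = [y0]
    for _ in range(n_steps):
        ys.append(step(ys[-1], h, a))
    return np.array(ys)

def fit(step, h, n_steps, target):
    # "Train" a by golden-section search on the squared trajectory error.
    # Assumes the loss is unimodal on [0, 2], which holds for this toy setup.
    def loss(a):
        return np.sum((rollout(step, a, h, n_steps) - target) ** 2)
    lo, hi = 0.0, 2.0
    g = (np.sqrt(5.0) - 1.0) / 2.0
    for _ in range(100):
        c = hi - g * (hi - lo)
        d = lo + g * (hi - lo)
        if loss(c) < loss(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)

h, n_steps = 0.5, 8                      # deliberately coarse step
t = h * np.arange(n_steps + 1)
target = np.exp(-t)                      # exact solution for a = 1

a_euler = fit(euler_step, h, n_steps, target)
a_rk4 = fit(rk4_step, h, n_steps, target)

# Swap integrators: use the Euler-trained parameter inside RK4.
cross_error = np.max(np.abs(rollout(rk4_step, a_euler, h, n_steps) - target))
same_error = np.max(np.abs(rollout(euler_step, a_euler, h, n_steps) - target))

print("a fitted with Euler:", a_euler)
print("a fitted with RK4:  ", a_rk4)
print("Euler-trained a, run with Euler:", same_error)
print("Euler-trained a, run with RK4:  ", cross_error)
```

In this toy setting the Euler-trained parameter lands near `(1 - exp(-h)) / h` (roughly 0.79 for `h = 0.5`) rather than 1, because the fit compensates for Euler's truncation error, and it no longer reproduces the data when run through RK4. The RK4-trained parameter stays close to 1. That suggests the learned model can absorb the integrator's error when that error is not small; whether the same effect is visible with adaptive, tightly toleranced solvers is exactly the open question here.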