Long training time using sciml_train on universal DAE

@ChrisRackauckas In my example, I am trying to build a universal differential equation (UDE) for a system of 5 ODEs with complex, time-varying parameters. The code runs, but the loss barely changes as the number of iterations increases.