You might want to check out this workshop which does this in full using the universal differential equation methodology:
DataDrivenDiffEq.jl is the library to look at for doing this:
https://datadriven.sciml.ai/dev/sparse_identification/sindy/
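
To give a rough idea of what sparse identification does under the hood, here is a minimal NumPy sketch of sequentially thresholded least squares, the regression step commonly used in SINDy. This is not the DataDrivenDiffEq.jl API. The toy linear system, the candidate library, the threshold value, and the helper names `library` and `stlsq` are all assumptions made for illustration, and the derivatives are computed exactly rather than estimated from noisy data.

```python
import numpy as np

def library(X):
    """Hypothetical candidate library: 1, x, y, x^2, x*y, y^2."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])

def stlsq(Theta, dX, threshold=0.05, iterations=10):
    """Sequentially thresholded least squares.

    Solves Theta @ Xi ~= dX, then repeatedly zeroes coefficients whose
    magnitude is below `threshold` and refits the remaining ones.
    """
    Xi = np.linalg.lstsq(Theta, dX, rcond=None)[0]
    for _ in range(iterations):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(dX.shape[1]):      # refit each state equation separately
            keep = ~small[:, k]
            if keep.any():
                Xi[keep, k] = np.linalg.lstsq(
                    Theta[:, keep], dX[:, k], rcond=None
                )[0]
    return Xi

# Toy data (assumed for illustration): dx/dt = -0.1x + 2y, dy/dt = -2x - 0.1y
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))   # sampled states
A = np.array([[-0.1, 2.0], [-2.0, -0.1]])
dX = X @ A.T                                # exact derivatives at those states

Xi = stlsq(library(X), dX)
print(Xi)  # rows follow the library order, columns are dx/dt and dy/dt
```

In practice the derivatives have to be estimated from noisy measurements, which is a large part of what the library handles for you, so treat this only as a picture of the idea.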
An example that mixes neural networks into the training methodology can be found here: