Recent AI developments: Roformer (transformer w/ Rotary Position Embedding) and DL to Rejuvenate Symbolic AI: Neural Production Systems

No worries. Generally, neural ODEs do worse in NLP. If there’s no natural ODE in the problem, they’re kind of pointless, except… we recently showed at ICML how to turn neural ODEs into a recurrent network that automatically does hyperparameter optimization to choose the fewest layers, in a way that also improves training time.
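To make the recurrent-network analogy concrete, here is a minimal sketch (my own illustration in PyTorch, not the ICML paper’s method): unrolling a neural ODE with a fixed-step solver is literally a recurrent network in which the number of solver steps plays the role of depth. The `ODEFunc` and `euler_unroll` names are placeholders invented for this example.

```python
# A minimal sketch (illustration only, not the paper's algorithm): a neural
# ODE unrolled with fixed-step Euler is a recurrent network whose "depth"
# is the number of solver steps, reusing the same weights at every step.
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    """dh/dt = f(t, h); the same weights are applied at every step (recurrence)."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t: float, h: torch.Tensor) -> torch.Tensor:
        return self.net(h)

def euler_unroll(f: ODEFunc, h0: torch.Tensor, t0=0.0, t1=1.0, n_steps=10):
    """Fixed-step Euler: n_steps plays the role of network depth."""
    h, dt = h0, (t1 - t0) / n_steps
    for i in range(n_steps):              # each iteration == one "layer"
        h = h + dt * f(t0 + i * dt, h)
    return h

h0 = torch.randn(32, 8)                   # batch of 32 hidden states
out = euler_unroll(ODEFunc(8), h0)        # a depth-10 continuous-depth network
```

An adaptive solver then amounts to letting the integrator itself decide how many of these “layers” each input needs, which is where the depth-selection behavior comes from.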

For a full discussion of why that algorithm is rather interesting to ML frameworks as a software question, see the blog post:

I can’t say I know whether this will ever be “the thing” for NLP, but the blog post goes into why the algorithm is interesting from an AD perspective and how it runs into the limitations of many software packages. I think this disconnect between quasi-static optimizers and the truly adaptive nature of ODE solvers is precisely why you haven’t seen neural ODEs showcased throughout much of ML: you hit a wall of what the frameworks will optimize, so without new frameworks the methods seem very slow.
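To show where that disconnect bites, here is a hedged sketch (again my own illustration, not any particular framework’s internals): an adaptive solver chooses its step count at runtime from an error estimate, so the loop trip count is data-dependent, which is exactly what a tracer wanting one fixed computation graph cannot unroll. `adaptive_heun` is a hypothetical helper written for this example.

```python
# A sketch of why adaptive solvers fight quasi-static frameworks: the number
# of iterations is decided at runtime by an error estimate, so there is no
# single static graph to trace and optimize ahead of time.
import torch

def adaptive_heun(f, h, t0: float, t1: float, rtol: float = 1e-3):
    """Heun's method with step-size control; the step count is data-dependent."""
    t, dt, n_steps = t0, (t1 - t0) / 10, 0
    while t < t1:
        dt = min(dt, t1 - t)
        k1 = f(t, h)
        k2 = f(t + dt, h + dt * k1)
        h_full = h + dt * 0.5 * (k1 + k2)          # second-order (Heun) step
        err = (dt * 0.5 * (k2 - k1)).abs().max()   # estimate vs. the Euler step
        if err < rtol:                             # accept and grow the step
            h, t, dt = h_full, t + dt, dt * 1.5
        else:                                      # reject and shrink the step
            dt = dt * 0.5
        n_steps += 1
    return h, n_steps   # n_steps differs per input: no fixed depth to compile

f = lambda t, h: -h                     # simple test vector field dh/dt = -h
h1, steps = adaptive_heun(f, torch.ones(4), 0.0, 1.0)
```

Run this on two different initial conditions and `steps` will generally differ, which is the “wall” a quasi-static optimizer hits: it wants to fix the graph once, while the solver insists on deciding the work per input.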
