Gradient-Free Neural Network Optimization

I have a neural network that is real-valued (i.e., it maps into \mathbb{R}, with inputs in \mathbb{R}^{n}), but it has complex weights. Consequently, taking gradients of the loss function means complex differentiation, and the lack of analyticity gets in the way (i.e., the loss is not necessarily complex-differentiable at all). I am using Flux, and all of the built-in optimizers appear to be gradient-based; they seem to be causing issues in my application, I suspect for this reason. Are there packages, either plug-and-play or close to it, that implement non-gradient-based methods (e.g., trust region or otherwise), and does anybody have experience with training networks like this?

Using Optimization.jl is probably the easiest way to do this. It has wrappers for many derivative-free methods; NLopt's NLopt.LN_COBYLA() is one I've had a good amount of success with. See:

https://docs.sciml.ai/Optimization/stable/optimization_packages/nlopt/

I generally only use derivative-free methods with neural networks as a way to test things, but you can definitely do it. It won't be as fast, but it's "fine".
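A minimal sketch of what that looks like, following the pattern in the linked docs: wrap the loss in an OptimizationProblem and hand NLopt.LN_COBYLA() to solve. The toy loss, initial guess, and iteration cap here are placeholders, not anything specific to your model.

```julia
using Optimization, OptimizationNLopt

# Toy scalar loss over a real parameter vector (a complex-weighted network
# would first need its weights flattened into real components).
loss(θ, p) = sum(abs2, θ .- p)

θ0 = zeros(4)               # initial guess -- a warm start would go here
p  = [1.0, 2.0, 3.0, 4.0]   # fixed problem data

f    = OptimizationFunction(loss)            # no gradient information needed
prob = OptimizationProblem(f, θ0, p)
sol  = solve(prob, NLopt.LN_COBYLA(); maxiters = 2_000)

sol.u          # optimized parameters
sol.objective  # final loss value
```

Since COBYLA only ever evaluates the objective, the question of complex differentiability never comes up.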


Thanks, Chris!

Yeah, it doesn't need to be fast. I'm using the neural net as a refinement on top of a faster approximation method that gives me a substantial "warm start", so I'm mostly concerned with making sure bad gradients don't push me back out of the neighborhood I'm supposed to be in.

Just note that if you do this with Flux, you need to use restructure/destructure to get a flat parameter vector the optimizer can work on. It's a bit easier to use Lux.jl for this kind of thing, though the Flux route isn't hard either.
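For concreteness, here's a hedged sketch of the restructure/destructure pattern with a complex-weighted Flux model. The architecture, dummy data, and the choice of taking the real part as the network's real-valued output are all stand-ins, not the original poster's setup; since COBYLA optimizes over real vectors, the complex flat vector is split into real and imaginary halves and recombined inside the loss.

```julia
using Flux, Optimization, OptimizationNLopt

ctanh(z) = tanh(z)   # plain elementwise tanh, works for complex arguments

# Stand-in architecture with complex weights (Dense accepts an explicit
# weight matrix and bias vector); replace with your own model.
W1, b1 = randn(ComplexF64, 8, 3), zeros(ComplexF64, 8)
W2, b2 = randn(ComplexF64, 1, 8), zeros(ComplexF64, 1)
model = Chain(Dense(W1, b1, ctanh), Dense(W2, b2))

flat, re = Flux.destructure(model)   # complex flat vector + rebuilder
n = length(flat)

# COBYLA works on real vectors, so stack [real; imag] and recombine
# before rebuilding the model.
x0 = vcat(real.(flat), imag.(flat))
rebuild(x) = re(complex.(x[1:n], x[n+1:2n]))

X = rand(3, 32)   # dummy inputs
y = rand(1, 32)   # dummy real-valued targets

# Real-valued loss: take the real part of the (complex) network output.
loss(x, _p) = sum(abs2, real.(rebuild(x)(X)) .- y) / length(y)

prob = OptimizationProblem(OptimizationFunction(loss), x0)
sol  = solve(prob, NLopt.LN_COBYLA(); maxiters = 5_000)

trained = rebuild(sol.u)   # model with the optimized complex weights
```

With Lux the same idea applies, but the parameters already live outside the model (and can be flattened with something like ComponentArrays), which is why it's a bit more convenient here.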
