Approximating A+B*log with NeuralNetwork

Honza9723 · November 24, 2020, 7:57pm

Dear All,

I am working on an economics project that uses neural networks to approximate model solutions (the models are systems of differential/difference equations). For some versions of these models I have analytical solutions that I can use as a test case. For some problems this works really well. However, I encountered a few really weird cases where a neural network isn’t able to approximate (or, more precisely, Flux wasn’t able to train it on) particularly simple functions. The most striking case was the following.

V(k) = A + B*log(k)

Here A = -27.02875f0, B = 0.64935064f0, and k is discretized as a vector of 2500 points from 0.05 to 0.31. I tried to approximate this simple function with a neural network. I used a 4-layer network of bent identity units, 32 units per layer. I also tried softplus units combined with a final identity layer, and a large network of ReLU units trained on the GPU (a few hundred ReLUs per layer). None of these attempts succeeded, regardless of which variant of gradient descent I used (stochastic vs. full batch, ADAM, Nesterov, …). Instead of converging towards the solution, the network produced a simple straight line.
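For reference, the setup can be reproduced in a few lines of plain Julia. This is only a sketch of the target and grid described above: it assumes an evenly spaced grid, and the names `V`, `kgrid`, and `vvals` are placeholders rather than the ones used in the full code. It prints the range of the target values, which sit in a narrow band far from zero.

```julia
# Target function with the constants given in the post.
A = -27.02875f0
B = 0.64935064f0
V(k) = A + B * log(k)

# Assumption: 2500 evenly spaced points on [0.05, 0.31], stored as Float32.
kgrid = collect(range(0.05f0, 0.31f0, length = 2500))
vvals = V.(kgrid)

# Smallest and largest target value over the grid.
println(extrema(vvals))
```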

It looks like convergence towards a local minimum of the loss function. Is there some clever way to tackle this type of problem (I tried minibatching without much success)? I managed to “solve it” by using a network with 12 hidden layers, but that sounds like overkill to me, and the convergence was really slow and fragile. As a loss function, I used a simple squared-error loss, summed over the grid points (it worked well for other functional equations that I solved with neural networks).

function ℒ(x)
    𝕷 = sum((𝒱.(x) - φ(x)).^2)
    return 𝕷
end

Where 𝒱 is the function to be approximated and φ is the neural network.
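To put a number on the “simple line” outcome, here is a rough sketch that evaluates the same kind of summed squared-error loss for the best straight line through the target. It does not use Flux. The least-squares fit, the evenly spaced grid, and the names `line` and `sse` are my own assumptions for illustration, not part of the code below. If the training loss stalls near this value, the network has probably only picked up the linear trend.

```julia
using LinearAlgebra

# Same target as in the post (Float64 here for simplicity).
A = -27.02875
B = 0.64935064
V(k) = A + B * log(k)

# Assumption: evenly spaced grid of 2500 points on [0.05, 0.31].
k = collect(range(0.05, 0.31, length = 2500))
v = V.(k)

# Hypothetical stand-in for the "simple line" result:
# ordinary least-squares fit v ≈ c[1] + c[2] * k.
X = hcat(ones(length(k)), k)
c = X \ v
line(x) = c[1] + c[2] * x

# Summed squared error between the target and a candidate function f.
sse(f, xs) = sum((V.(xs) .- f.(xs)) .^ 2)

println("SSE of best straight line: ", sse(line, k))
```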

Full code

#(1) Install and initialize packages
using Pkg
Pkg.add("Plots")
Pkg.add("Parameters")
Pkg.add("LinearAlgebra")
Pkg.add("CUDA")
Pkg.add("Flux")
Pkg.build("Flux")
Pkg.add("Random")
Pkg.add("Distributions")
Pkg.add("ForwardDiff")
using Plots
using Parameters
using LinearAlgebra
using CUDA
using Flux
using Random
using Distributions
using ForwardDiff

ϰ = 2500
A = -27.02875f0
B = 0.64935064f0
kl = 0.05
ku = 0.31

kGrid = reshape(rand(Uniform(kl,ku),ϰ,1),1,ϰ)
kkGrid = collect(range(kl,ku,length=ϰ))

𝒱(k) = A + B*log(k)


bent(x) = (sqrt(x^2+1)-1)/2 + x

φ = Flux.Chain(Dense(1,32,bent),Dense(32,32,bent),
Dense(32,32,bent),Dense(32,1,bent))

θ = Flux.params(φ)

function ℒ(x)
    𝕷 = sum((𝒱.(x) - φ(x)).^2)
    return 𝕷
end

Data = [kGrid]
opt = AMSGrad(0.001)


cb = () -> println(ℒ(kGrid))
@time Flux.@epochs 5000 Flux.train!(ℒ,θ,Data,opt,cb=cb)

vGrid = 𝒱.(kkGrid)
wGrid = φ(kkGrid')'

Plots.plot(kkGrid,vGrid,label="true",legend=:bottomright)
Plots.plot!(kkGrid,wGrid,label="neural")

Any guidance on this type of problem? Is there some stupid mistake in my code that causes the problem, or is it something deeper? It is hard for me to believe that neural networks can’t approximate A + B*log(x) without much effort.
