# Unable to solve a differential equation using neural network

I am trying to solve a simple differential equation, dy/dt = -y, using a neural network. The following code runs, but the loss stays at 2.0 and the network outputs zero everywhere. Am I doing something wrong here?

```julia
using Flux
using Flux: train!
using Statistics

nn = f64(Chain(Dense(1, 16, relu), Dense(16, 16, relu), Dense(16, 1, relu)))
opt = Flux.setup(Adam(0.01), nn)  # optimizer state for Flux's explicit-parameter API

function loss_function(nn, t)
    prediction = nn(t)
    derivative = gradient(prediction -> sum(prediction), prediction)
    initial_condition_loss = (nn([0.0])[1] - 1)^2  # enforce y(0) = 1
    return mean((derivative .+ prediction) .^ 2) + initial_condition_loss
end

ts = reshape(0:0.1:2.0, (1, :))

for epoch in 1:1000
    train!(loss_function, nn, [(ts,)], opt)
    println(loss_function(nn, ts), " ", nn(ts))
end
```

You are not calculating the derivative with respect to `t` here:

```julia
derivative = gradient(prediction -> sum(prediction), prediction)
```
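A quick standalone check of what that call actually returns (the values in `prediction` are arbitrary):

```julia
using Zygote  # the AD backend Flux uses

prediction = rand(1, 5)
g, = gradient(prediction -> sum(prediction), prediction)
g  # a 1×5 array of ones, no matter what `prediction` (or `t`) contains
```

You are differentiating `sum` with respect to its own input, so the network and `t` never enter the picture.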

In fact, this gradient evaluates to all ones. If anything, you want the diagonal of the Jacobian of `nn` with respect to `t`, something like

```julia
using LinearAlgebra, Zygote
derivative = diag(Zygote.jacobian(nn, ts)[1])  # dy_i/dt_i at each sample point
```
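Since each output column of `nn(ts)` depends only on the matching input column, that diagonal can also be had without forming the full Jacobian: the gradient of the summed output with respect to `ts` is exactly `dy_i/dt_i`. A standalone sketch using `ForwardDiff` (the layer sizes here are arbitrary):

```julia
using Flux, ForwardDiff

nn = f64(Chain(Dense(1, 8, tanh), Dense(8, 1)))
ts = reshape(collect(0.0:0.1:2.0), 1, :)

# 1×21 matrix whose entry i is d(nn)/dt evaluated at ts[i]
dydt = ForwardDiff.gradient(t -> sum(nn(t)), ts)
```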

But then you are nesting AD passes – one over `t`, one over the parameters. I quickly tried something, but ran into Zygote errors (the infamous "mutating arrays" error).

Might I suggest a finite difference approach instead:

```julia
function loss_function(nn, t)
    prediction = nn(t)
    # forward differences: approximates dy/dt between neighbouring sample points
    derivative = diff(prediction, dims=2) ./ diff(t, dims=2)
    initial_condition_loss = (nn(t[:, 1:1])[1] - 1)^2  # enforce y(0) = 1
    return sum((derivative .+ prediction[:, 2:end]) .^ 2) + initial_condition_loss
end
```
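Wired into a training loop, that looks something like this (a sketch: the `1 => 8 => 1` `tanh` network and `Adam(0.01)` are my choices, not tuned values):

```julia
using Flux, Statistics
using Flux: train!

nn = f64(Chain(Dense(1, 8, tanh), Dense(8, 1)))
opt = Flux.setup(Adam(0.01), nn)
ts = reshape(collect(0.0:0.1:2.0), 1, :)

# finite-difference residual of dy/dt = -y, plus the y(0) = 1 condition
function loss_function(nn, t)
    prediction = nn(t)
    derivative = diff(prediction, dims=2) ./ diff(t, dims=2)
    initial_condition_loss = (nn(t[:, 1:1])[1] - 1)^2
    return sum((derivative .+ prediction[:, 2:end]) .^ 2) + initial_condition_loss
end

for epoch in 1:1000
    train!(loss_function, nn, [(ts,)], opt)
end
```

The finite-difference loss keeps everything inside a single reverse-mode pass over the parameters, which sidesteps the nested-AD problem entirely.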

edit: AD seems to work fine; see below.

This trains well for me with `1 => 8 => 1` layers and `tanh` activations; with `relu`, the gradients reliably collapse to zero.

(Of course, the SciML ecosystem has all the tools and conveniences for neural-ODE problems readily available.)

Thanks a lot!!! Apparently, the main issue was the `relu` activation. The following code gives fantastic results in 10 epochs!

```julia
using Flux
using Flux: train!
using Statistics
using ForwardDiff
using Zygote

actual(x) = exp(-2.0 * x)  # analytic solution of dy/dx = -2y
deltax = 0.1

x_train = hcat(collect(Float64, 1:deltax:5)...)  # 1×41 matrix of sample points
initial = x_train[:, 1]
y_train = actual.(x_train)

predict = f64(Chain(Dense(1, 8, tanh),
                    Dense(8, 8, tanh),
                    Dense(8, 1, tanh)))
opt = Flux.setup(Adam(0.01), predict)

function loss(predict, x, y)
    f(x) = sum(predict(x))
    # dy/dx at each sample point, via forward-mode AD inside the Zygote pass
    derivative = ForwardDiff.gradient(f, x)
    # ODE residual: dy/dx + 2y = 0
    ans_bulk = mean(abs2.(derivative .+ 2.0 .* predict(x)))
    ans_bc = mean(abs2.(predict(initial) .- actual.(initial)))
    return ans_bulk + ans_bc
end

data = [(x_train, y_train)]
for epoch in 1:100
    train!(loss, predict, data, opt)
    println(loss(predict, x_train, y_train))
end

y_predict = predict(x_train)
for i in 1:size(x_train, 2)
    println(x_train[1, i], " ", y_train[1, i], " ", y_predict[1, i])
end
```

I am aware of the SciML ecosystem. I do not want to use a black box; I am trying to understand the basics. Once I am comfortable with my understanding of the technique, I will start using NeuralPDE.jl.
