# Struggling to get scipy.newton performance using NonlinearSolve.jl

Hi everyone! I’m relatively new to Julia (1.7.2), coming from a Matlab and Python background. I really enjoy the language, but I sometimes struggle in cases where, in Matlab / numpy, I would simply vectorize and it would be reasonably fast.

I’m playing around with a nonlinear equation, the dispersion relation for ocean waves, which I’m trying to solve for k:
\omega^2 = g k \tanh(k d)

In the formulation of my problem, \omega and k are vectors of length N and d is a scalar. I am using NonlinearSolve and set up the problem so that the constant parameters are provided in the NamedTuple p:

using NonlinearSolve

function dispersion(k::AbstractVector{<:Real}, p::NamedTuple)
    @. k * p.G * tanh(k * p.d) - p.ω^2
end

# Setup parameters
N = 1000
G, d, ω = 9.8, 30.0, 2π * collect(range(0.05, 1, N))
p = (G=G, d=d, ω=ω)

# Starting point: first order approximation
k0 = @. ω^2 / G / tanh(ω * sqrt(d / G))
prob = NonlinearProblem{false}(dispersion, k0, p);
solver = solve(prob, NewtonRaphson(), tol=1e-9)


Benchmarking against my original Python implementation, which uses scipy.newton, shows that the Julia version above dramatically underperforms, especially as N increases.

I played around with the tolerances as well as with the Optim.jl package (which gave much worse results with its various solvers). From the scipy docs I know that newton defaults to the secant method, roughly the iteration sketched below. I have not yet found a method that is faster than the NonlinearSolve approach above, but I’m also quite new to Julia, so I don’t know the entire landscape of solver packages yet :).
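
For context, the secant method needs no derivative; a bare-bones sketch of the iteration (my own illustrative code with arbitrary defaults, not what scipy actually runs internally) would be:

```julia
# Rough sketch of a secant iteration for a single scalar equation.
# The perturbed second starting point and the tolerance are arbitrary choices.
function secant(f, x0; atol=1e-9, maxiter=50)
    x1 = x0 == 0 ? 1e-4 : x0 * (1 + 1e-4)
    f0, f1 = f(x0), f(x1)
    for _ in 1:maxiter
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)   # secant update
        abs(x2 - x1) < atol && return x2
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    end
    return x1
end
```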

Does anybody know what causes the poor performance, or what solver package I should consider instead? Any thoughts are much appreciated!

Note that scipy.newton (i.e. scipy.optimize.newton) is used to find roots of scalar functions (in contrast to scipy.optimize.root), while your Julia example solves a system of N equations. Is that what you want? It’s difficult to tell what’s going on without your Python implementation to compare against.

Perhaps you are looking for something like this?

using Roots

N = 10^3
G, d, ω = 9.8, 30.0, 2π .* range(0.05, 1, N)  # don't collect the range!!
k0 = @. ω^2 / G / tanh(ω * sqrt(d / G))

dispersion(k, G, d, ω) = k * G * tanh(k * d) - ω^2  # no dots, just a scalar function
disproots(G, d, ω, k0) = find_zero(k->dispersion(k, G, d, ω), k0)
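
Broadcasting the scalar rootfind over ω and the corresponding starting guesses k0 then gives the full vector of wavenumbers (this residual check is my own addition, just as a sanity test; the benchmark below times the same broadcast call):

```julia
k = disproots.(G, d, ω, k0)               # one scalar rootfind per frequency
maximum(abs, dispersion.(k, G, d, ω))     # residuals should be near machine precision
```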


Benchmark:

1.7.2> @benchmark disproots.($G,$d, $ω,$k0)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max):  279.100 μs …  1.387 ms  ┊ GC (min … max): 0.00% … 0.00%
Time  (median):     295.400 μs              ┊ GC (median):    0.00%
Time  (mean ± σ):   310.837 μs ± 51.768 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

█▆██▆▇▅▅▄▃▃▃▂▂▂▂▂▁▁▁                                         ▂
████████████████████████▇▇██▆▇▇▇█▆▇▇▇█▇▇▇▆▇▇▇▅▇▅▅▅▅▅▄▆▅▆▆▅▆▅ █
279 μs        Histogram: log(frequency) by time       536 μs <

Memory estimate: 7.94 KiB, allocs estimate: 1.


Seems to be approx 50-100x faster than your code at N = 1000, and runtime scales linearly with N, while your code seems to scale like N^2.

It does solve a somewhat different problem, though: as @skleinbo says, you are solving a system of N equations, but they happen to be completely decoupled, so the results agree up to ~10^-16.

Yes, if it’s a bunch of scalar rootfinds, you should use a scalar rootfinder n times instead of a vector Newton method since the latter will require building and factorizing a large Jacobian. With that in mind, here’s the NonlinearSolve code:

using NonlinearSolve

function dispersion(k, p)
    k * p.G * tanh(k * p.d) - p.ω^2
end

# Setup parameters
N = 1000
G, d, ω = 9.8, 30.0, 2π * collect(range(0.05, 1, N))
p = map(ω -> (G=G, d=d, ω=ω), ω)

# Starting point: first order approximation
k0 = @. ω^2 / G / tanh(ω * sqrt(d / G))
prob = NonlinearProblem{false}.(dispersion, k0, p)
solver = solve.(prob, (NewtonRaphson(),), tol=1e-9)


which as expected is really fast:

using BenchmarkTools
julia> @benchmark solver = solve.(prob, (NewtonRaphson(),), tol=1e-9)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max):  128.400 μs …   5.394 ms  ┊ GC (min … max): 0.00% … 96.94%
Time  (median):     144.500 μs               ┊ GC (median):    0.00%
Time  (mean ± σ):   152.696 μs ± 125.333 μs  ┊ GC (mean ± σ):  2.62% ±  3.17%

▅▆ ▁ ▃█ ▃
▁██▆█▃█████▆▄▃▂▂▂▁▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
128 μs           Histogram: frequency by time          262 μs <

Memory estimate: 78.69 KiB, allocs estimate: 18.


150 μs ain’t too shabby.
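
Each element of solver is a SciML solution object, so (assuming what you ultimately want are the roots themselves) you can pull them out of the u field and check the residuals, for example:

```julia
k = [sol.u for sol in solver]       # extract the root from each scalar solution
maximum(abs, dispersion.(k, p))     # residuals should be within the requested tolerance
```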


I should indeed have mentioned that I’m looking for a bunch of scalar rootfinds. In Python, my code looks like this (it works with k0 and omega as vectors):

k = scipy.optimize.newton(dispersion, k0, args=(d, omega))


@ChrisRackauckas dotting the solve is indeed the proper approach, and in retrospect rather obvious - my narrow, vectorized Matlab/Python mind just couldn’t come up with it yet. It is twice as fast as the scipy.newton method, so I’m happy!

Thanks all for the quick responses and for helping me get started with this amazing language!
