Would it be possible to use multiple CPU cores in Solving Nonlinear Equations?

LINGXIAO_QIN · April 18, 2024, 10:49am

I am trying to solve a system of nonlinear equations (~25 variables). Currently, I am using NonlinearSolve.jl. It works, but I want to speed up the computation. Besides the strategies introduced here, such as using StaticArrays and specifying a sparse Jacobian, I am wondering if it’s feasible to leverage multiple CPU cores. At the moment, the solving process appears to use only a single CPU core.

ChrisRackauckas · April 18, 2024, 12:23pm

It will automatically multithread on large equations, but indeed your equation is small.

You can try AutoPolyesterForwardDiff and see if you get a speedup, but that’s dependent on whether your time is generally spent in the Jacobian construction.

Small nonlinear systems are generally not a kind of problem that the algorithms parallelize well, though it could be a nice research topic.

LINGXIAO_QIN · April 21, 2024, 3:11am

Thanks for your kind suggestion. My equations cannot be automatically differentiated, so AutoPolyesterForwardDiff is not a viable choice for me (I am using autodiff=AutoSparseFiniteDiff()). I asked this question because in MATLAB, there is a UseParallel=true option for fsolve to estimate gradients in parallel. But as you said, “NonlinearSolve.jl will automatically multithread on large equations”.
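To make the idea of estimating gradients in parallel concrete, here is a minimal sketch of a forward-difference Jacobian whose columns are computed on separate threads. It uses only Julia's `Base.Threads`; it is not what NonlinearSolve.jl or MATLAB's fsolve does internally, and the names `residual` and `threaded_fd_jacobian` are hypothetical, as is the example system.

```julia
using Base.Threads: @threads

# Hypothetical residual for illustration: a tridiagonally coupled system F(u) = 0.
function residual(u)
    n = length(u)
    F = similar(u)
    for i in 1:n
        left  = i > 1 ? u[i-1] : zero(eltype(u))
        right = i < n ? u[i+1] : zero(eltype(u))
        F[i] = 2u[i] - left - right + u[i]^3 - 1
    end
    return F
end

# Forward-difference Jacobian. Each column needs one independent residual
# evaluation, so the columns can be distributed across threads.
function threaded_fd_jacobian(f, u; h = sqrt(eps(eltype(u))))
    n = length(u)
    F0 = f(u)
    J = Matrix{eltype(u)}(undef, length(F0), n)
    @threads for j in 1:n
        up = copy(u)                                  # private copy per iteration
        dx = h * max(abs(u[j]), one(eltype(u)))       # scaled perturbation
        up[j] += dx
        J[:, j] = (f(up) .- F0) ./ dx                 # each thread writes its own column
    end
    return J
end

u = fill(0.5, 25)
J = threaded_fd_jacobian(residual, u)
```

Julia has to be started with more than one thread (for example `julia -t auto`) for the loop to run in parallel. With only 25 cheap residual calls, the threading overhead can easily outweigh the gain, which is consistent with the remark above that small systems rarely benefit.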

As a beginner in Julia, I think the slow computation speed is possibly because I don’t write the equations in an efficient way. I’ll look into the performance tips and see if I can improve my code.

avikpal · April 21, 2024, 4:04am

For a 25-variable problem, sparsity is most likely going to slow down your problem rather than speed it up. Sparsity benefits kick in after around 1000 variables at the minimum; see Ill-Conditioned Nonlinear System Work-Precision Diagrams · The SciML Benchmarks.

LINGXIAO_QIN · April 21, 2024, 7:24am

Thank you for your answer. That’s a really informative benchmark. In my 25-variable problem, the Jacobian is tridiagonal, so the fraction of non-zero entries is (25+24*2)/(25*25)=11.68%. I compared the solving times for autodiff=AutoFiniteDiff() and autodiff=AutoSparseFiniteDiff(), and found that sparsity does speed up the calculation. However, my objective function seems to suffer from type instability (and maybe many unnecessary allocations), which may contaminate the comparison results. I’ll try to fix how I construct my equations.
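The density figure above can be checked with a few lines using the `LinearAlgebra` standard library. This is only a sketch of the arithmetic; the matrix entries are placeholders, since only the tridiagonal structure matters.

```julia
using LinearAlgebra

n = 25
# Placeholder entries: one main diagonal of length n, two off-diagonals of length n - 1.
T = Tridiagonal(ones(n - 1), ones(n), ones(n - 1))

structural_nnz = n + 2 * (n - 1)   # 25 + 24*2 = 73 structurally non-zero entries
density = structural_nnz / n^2     # 73 / 625

println(structural_nnz, " of ", n^2, " entries; density ≈ ", round(100 * density; digits = 2), " %")
```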

avikpal · April 21, 2024, 1:54pm

Ah for Tridiagonal, can you supply a prototype directly as jac_prototype = <...> where the prototype is of type Tridiagonal but the entries can be anything?

AutoSparse... will generate a sparse matrix, but in case of tridiagonal you can use better factorization and such if the prototype is given.
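As a rough illustration of why the matrix type matters, the sketch below builds a `Tridiagonal` prototype with arbitrary entries and compares a tridiagonal LU solve with a dense one, using only the `LinearAlgebra` standard library. The values in `J` and `b` are hypothetical, and the NonlinearSolve.jl call itself is left out; in the setup discussed here the prototype would be the object handed over as `jac_prototype`.

```julia
using LinearAlgebra

n = 25
# Prototype: the type and size carry the structure; the entries can be anything.
jac_proto = Tridiagonal(zeros(n - 1), zeros(n), zeros(n - 1))

# Hypothetical tridiagonal Jacobian values and right-hand side for a linear solve.
J = Tridiagonal(fill(-1.0, n - 1), fill(2.75, n), fill(-1.0, n - 1))
b = ones(n)

x_tri   = lu(J) \ b           # LU specialized for tridiagonal storage, work linear in n
x_dense = lu(Matrix(J)) \ b   # generic dense LU, work cubic in n

println(typeof(jac_proto))
println("max difference between the two solves: ", maximum(abs.(x_tri .- x_dense)))
```

Both solves give the same answer up to rounding; the difference is that the tridiagonal factorization never touches the entries that are known to be zero.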

LINGXIAO_QIN · April 22, 2024, 12:53am

Thanks. By specifying jac_prototype = a Tridiagonal and using autodiff=AutoSparseFiniteDiff(), I get an approximately 30% speedup for my 25-variable problem.
