Finding roots of a multivariate function with given bounds?

Suppose I have a multivariate function, f(x, y), with x \in (0, 1) and y \in (0, 1). I want to find the roots of the following equation

f(x, y) - c = 0

where c = [c_1\ c_2]^T is a constant vector. In my case, there are log functions in f.

I have tried NLsolve. It can find the correct root sometimes, but that strongly depends on how close the initial guess is to the root. If a bad initial guess is given, the log functions complain about receiving a negative argument. I wonder if I can provide bounds on x and y to NLsolve?

I have also tried NonlinearSolve with default settings. It always fails.

Can someone give me some guidance on how to find the roots of such a system? BTW, the analytical expression of f may not be known, and f is sometimes extremely costly to compute.

MWE below

using StaticArrays

# a typical set of parameters
param = (1.0, 0.2, 0.01, 10.0, 40.0, 40.0)

# the objective function
f = (x, y) -> μ(x, y, param)

# an example of c can be obtained from below
c = f(0.2, 0.4)

# try to find the roots of
g = (x, y) -> f(x, y) - c  # should give at least one root which is [0.2, 0.4]. Other roots are possible.

function γA(ϕA, ϕB, param)
	αA, αB, αS, χABN, χASN, χBSN = param
	ϕS = 1 - ϕA - ϕB
	return (1+log(ϕA))/αA + χABN*ϕB + χASN*ϕS
end

function γB(ϕA, ϕB, param)
	αA, αB, αS, χABN, χASN, χBSN = param
	ϕS = 1 - ϕA - ϕB
	return (1+log(ϕB))/αB + χABN*ϕA + χBSN*ϕS
end

function γS(ϕA, ϕB, param)
	αA, αB, αS, χABN, χASN, χBSN = param
	ϕS = 1 - ϕA - ϕB
	return (1+log(ϕS))/αS + χASN*ϕA + χBSN*ϕB
end

function μA(ϕA, ϕB, param)
	return γA(ϕA, ϕB, param) - γS(ϕA, ϕB, param)
end

function μB(ϕA, ϕB, param)
	return γB(ϕA, ϕB, param) - γS(ϕA, ϕB, param)
end

function μ(ϕA, ϕB, param)
	return SVector(μA(ϕA, ϕB, param), μB(ϕA, ϕB, param))
end
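One way to hand NLsolve (or any unconstrained root finder) the (0, 1) bounds implicitly is a change of variables: solve for unconstrained u, v \in ℝ and map them into the open box with a logistic function, so the solver can never feed log a non-positive argument. A minimal self-contained sketch — the helper names σ, σinv, and gu are illustrative, not from any package, and note that the map does not enforce the extra constraint ϕA + ϕB < 1 needed to keep ϕS positive:

```julia
# Logistic map u ∈ ℝ ↦ x ∈ (0, 1) and its inverse.
σ(u)    = 1 / (1 + exp(-u))
σinv(x) = log(x / (1 - x))

# The objective of the MWE, restated without StaticArrays so this
# snippet runs on its own.
param = (1.0, 0.2, 0.01, 10.0, 40.0, 40.0)
function μ(ϕA, ϕB)
    αA, αB, αS, χABN, χASN, χBSN = param
    ϕS = 1 - ϕA - ϕB
    γA = (1 + log(ϕA)) / αA + χABN * ϕB + χASN * ϕS
    γB = (1 + log(ϕB)) / αB + χABN * ϕA + χBSN * ϕS
    γS = (1 + log(ϕS)) / αS + χASN * ϕA + χBSN * ϕB
    return [γA - γS, γB - γS]
end

c = μ(0.2, 0.4)
g(x, y)  = μ(x, y) - c        # bounded problem on (0, 1) × (0, 1)
gu(u, v) = g(σ(u), σ(v))      # unconstrained reformulation on ℝ²

# The known root survives the transform (zero up to roundoff):
res = gu(σinv(0.2), σinv(0.4))
```

Any unconstrained solver can then search over ℝ² via gu, and σ maps the solution back into (0, 1).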

You could minimize \|f(x,y) - c\|^2 with a method that accepts box constraints. I know that Ipopt and SPGBox do.

That said, without an analytical expression for the function, and with each evaluation being costly, the options that rely on finite-difference derivatives become less attractive.


There are global black box optimizers specialized for expensive objective functions, and which also accommodate portions of the domain where the function is undefined. See, e.g. NOMAD.jl


I have read several times, in this forum and elsewhere, that it is not a good idea to convert a root-finding problem into an optimization problem. So is this the best choice?


Thanks! I will look into it.

You could use the optimizer to get a good starting point for the root finder.
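A self-contained sketch of that two-stage idea, with a coarse grid search standing in for the box-constrained optimizer and a hand-rolled finite-difference Newton standing in for the root finder. The toy system g and every helper name here are illustrative; in practice the two stages would be something like Optim.jl or NOMAD.jl followed by NLsolve:

```julia
using LinearAlgebra

# Toy system with a root at (0.25, 0.5) inside the unit box; it has
# logs, so it stands in for the f(x, y) - c of the thread.
g(x, y) = [log(x) - log(0.25), log(y) - log(0.5)]

# Stage 1: coarse grid search for the smallest residual norm
# (a stand-in for a real box-constrained optimizer).
function grid_start(g; n = 20)
    best, x0 = Inf, [0.5, 0.5]
    for x in range(0.05, 0.95, length = n), y in range(0.05, 0.95, length = n)
        r = norm(g(x, y))
        r < best && (best = r; x0 = [x, y])
    end
    return x0
end

# Stage 2: Newton iteration with a finite-difference Jacobian,
# started from the grid point so the logs stay defined.
function newton(g, x0; tol = 1e-10, maxit = 50, h = 1e-7)
    x = copy(x0)
    for _ in 1:maxit
        r = g(x[1], x[2])
        norm(r) < tol && break
        J = hcat((g(x[1] + h, x[2]) - r) / h,
                 (g(x[1], x[2] + h) - r) / h)
        x -= J \ r
    end
    return x
end

root = newton(g, grid_start(g))
```

The grid stage keeps the Newton stage inside the region where the logs are defined; in the real problem the optimizer would minimize \|f(x, y) - c\|^2 over the box instead.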

Using Optim.jl is what I would do, but I am not sure if it’s the best in the Julia ecosystem.

using Optim

lower = [eps(), eps()]
upper = [1 - eps(), 1 - eps()]

function loss(xy)
    sum((f(xy...) .- c).^2)
end

opt = optimize(loss, lower, upper, [0.5, 0.5], Fminbox(LBFGS()))

:astonished: I am not aware of the technical difficulties. I’ve never had to worry about it, but my problems are simple.


Go for IntervalRootFinding.jl. It uses interval arithmetic and is able to separate and bound all the roots, while being robust to roundoff errors.


Why don’t you post your Julia code for an MWE too?

Thanks! It works great with a simple example. I am working on testing it with my actual cases.


Sure. See my updated main post.

If you turn your zero-finding problem into an optimization problem, as someone suggested in this thread, you should consider NLopt.jl. It has many algorithms for solving nonlinear optimization problems with or without derivatives.

One example is discussed at https://scicomp.stackexchange.com/questions/2444/newton-based-methods-in-optimization-vs-solving-systems-of-nonlinear-equations.

But another is the simple set of linear equations written in vector form as Ax=b. Even if a solution is guaranteed to exist (for example, if A is nonsingular), nobody can prevent us from reformulating the problem of solving this equation into the problem of minimizing the 2-norm of the residual r=Ax-b (while not particularly exploiting the knowledge that the minimum value is 0).

The first-order necessary condition of optimality is given by the normal equation(s) A^\top A x = A^\top b.

In this (standard) way we have reformulated the original Ax=b problem into the A^\top A x = A^\top b problem. If the original matrix A was ill-conditioned, the new matrix A^\top A defining the linear equation(s) will be even more so, since \mathrm{cond}(A^\top A) = \mathrm{cond}(A)^2.

julia> using LinearAlgebra: cond, norm

julia> A = rand(100,100);

julia> b = rand(100);

julia> x_leq = A\b;

julia> cond(A)
1084.9088513323275

julia> x_lsq = (A'*A)\A'*b;

julia> cond(A'*A)
1.1770272157017153e6

julia> norm(x_leq - x_lsq)
3.1148469524106095e-12

In contrast, the intimate relationship between solving a set of linear equations and minimizing a quadratic function is exploited in the conjugate gradient method. Related to the Ax=b equation is the minimization of \frac{1}{2}x^\top Ax-b^\top x. But this is certainly not the same as \|Ax-b\|^2.
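To make that connection concrete, here is a textbook conjugate-gradient sketch — a plain Julia implementation, not from any package mentioned here — which solves Ax=b for symmetric positive-definite A by implicitly minimizing that quadratic:

```julia
using LinearAlgebra

# Textbook conjugate gradient for symmetric positive-definite A.
# Each step minimizes ½xᵀAx − bᵀx along a new A-conjugate direction;
# the residual r is minus the gradient of that quadratic.
function cg(A, b; tol = 1e-8, maxit = 2length(b))
    x = zeros(length(b))
    r = b - A * x
    p = copy(r)
    rs = dot(r, r)
    for _ in 1:maxit
        Ap = A * p
        α  = rs / dot(p, Ap)
        x += α * p
        r -= α * Ap
        rs_new = dot(r, r)
        sqrt(rs_new) < tol && break
        p  = r + (rs_new / rs) * p
        rs = rs_new
    end
    return x
end

M = rand(50, 50)
A = M' * M + 100I      # shifted to be SPD and well conditioned
b = rand(50)
x_cg = cg(A, b)        # agrees with A \ b to high accuracy
```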


Good point (provided that the quadratic is strictly convex, that is, A is positive definite).


Maybe you mean finding all extrema, not just the minimum.
And it will miss the points where the second derivative is zero.