Different inverses for almost equal matrices

Hello,

I’m facing a problem and I don’t know whether it is due to precision errors or whether I’m doing something wrong.

I have two square matrices A and B of type Array{Float64,2}, with size(A) = (120,120) and size(B) = (120,120). These two matrices are the result of the same mathematical computation carried out in two different ways (one way is more efficient than the other). Since the computation involves floating-point arithmetic, the difference between A and B is almost zero:

norm(A-B,Inf) = 1.474376176702208e-13

These matrices have positive and negative eigenvalues, and the smallest in magnitude is

minimum(abs.(eigvals(A))) = 7.646361154580633e-7
minimum(abs.(eigvals(B))) = 7.646359724345383e-7

So far so good, but when I compute the inverse of these matrices, the result is completely different:

norm(inv(A)-inv(B),Inf) = 0.5606764699332416

Why are the inverse matrices different? Is this problem related to numerical precision? Is it possible to circumvent this issue?

Thank you in advance!

Don’t compute matrix inverses. Use \ for linear solves instead. That said, if your matrices are ill-conditioned, to some extent this will be unavoidable.
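
For a concrete picture, here is roughly what that looks like with a made-up right-hand side b (the matrix is just a random stand-in for yours):

using LinearAlgebra

A = rand(120, 120)    # stand-in for your matrix
b = rand(120)         # hypothetical right-hand side

x_inv   = inv(A) * b  # explicit inverse: extra work and extra rounding
x_solve = A \ b       # LU-based solve: one factorization, backward stable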


Is 0.56 a big or a small difference? You always need to ask: compared to what? Here, the relevant comparison is probably to the norm of the inverse, i.e. you want to know the relative error:

norm(inv(A)-inv(B),Inf) / norm(inv(A),Inf)

What is it?

As @Oscar_Smith writes, the relative error could still be large if your matrices are badly conditioned, and generally you want to avoid computing explicit inverses to begin with. But looking at absolute errors is a common mistake.
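
For instance, with a small ill-conditioned toy matrix standing in for yours (a Hilbert matrix plus an eps-level perturbation, just to make the comparison concrete):

using LinearAlgebra

A = [1.0 / (i + j - 1) for i in 1:8, j in 1:8]   # Hilbert matrix, cond ≈ 1e10
B = A .* (1 .+ eps() .* randn(8, 8))             # eps-level perturbation of A

abs_err = norm(inv(A) - inv(B), Inf)
rel_err = abs_err / norm(inv(A), Inf)

@show abs_err                 # can look "large" on its own
@show rel_err                 # small compared to norm(inv(A), Inf)
@show cond(A, Inf) * eps()    # rule-of-thumb scale for the relative error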


I know, but as far as I know, Julia calls \ if the input to `inv` is a matrix.

Yes, but inv(A)*B will have roughly twice the error and twice the time of A\B

Ok. In my case, norm(inv(A) - A\LinearAlgebra.I) = 0.0, and the same for matrix B.

Right, but when you multiply inv(A) by another matrix, you will accumulate more error, while A\B will have only one set of rounding errors.
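
A toy comparison of the residuals (on a random system, not your matrices) shows the effect:

using LinearAlgebra

n = 1000
A = randn(n, n)
b = randn(n)

x_inv   = inv(A) * b   # form the inverse, then multiply
x_solve = A \ b        # single factorization plus triangular solves

# relative residuals: how well each result actually satisfies A*x = b
@show norm(A * x_inv - b) / norm(b)
@show norm(A * x_solve - b) / norm(b)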

The relative error is:
norm(inv(A)-inv(B),Inf) / norm(inv(A),Inf) = 1.6565629324513598e-7

In both cases, the maximum eigenvalue is of order 10 and the condition number is of order 1e9. I’m afraid I won’t be able to circumvent this, right?

Ok, I understand. Thank you very much. In fact, in my research code I use \ instead of inv but I wanted to be more explicit when posting the question.

This sounds like a fairly controlled level of error given how large your condition number is.

Put another way, think about this in terms of just regular numbers. If I have

x = 1e-10
y = 1e-10 + 1e-18

then

y - x ≈ 1e-18

but

1/x - 1/y ≈ 99.9999

So just because the difference between two numbers is small does not mean that the difference between their inverses is small.
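
In Julia, with approximate values in the comments:

x = 1e-10
y = 1e-10 + 1e-18

y - x                 # ≈ 1e-18  (tiny absolute difference)
1/x - 1/y             # ≈ 100    (huge absolute difference)
(1/x - 1/y) / (1/x)   # ≈ 1e-8   (the relative difference is still tiny)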


Whenever you multiply by the matrix or its inverse (or equivalent, e.g. A \ b), you should generically expect to lose about 9 digits to roundoff errors, which is very close to what you are seeing:

julia> eps() * 1e9
2.220446049250313e-7

Not necessarily, but fixing an ill-conditioning problem generally means going back to an earlier step, understanding where the ill-conditioning comes from, and rearranging your computation to avoid it. (That is, you wouldn’t construct your matrix at all, but instead do some equivalent process starting from earlier inputs. A classic example is solving least-squares problems by QR factorization rather than by the normal equations.)
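
To make that least-squares example concrete (a toy fit, unrelated to the original matrices):

using LinearAlgebra

# fit a degree-9 polynomial, in the monomial basis, through 100 sample points
t = range(0, 1, length=100)
M = [ti^j for ti in t, j in 0:9]   # 100×10 design matrix
b = sin.(3 .* t)

c_qr     = M \ b                   # QR-based least squares: error grows like cond(M)
c_normal = (M' * M) \ (M' * b)     # normal equations: error grows like cond(M)^2

@show cond(M) cond(M' * M)
@show norm(c_qr - c_normal)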


That’s because A \ I invokes a specialized method that simply calls inv(A).


I understand. That example is very clear. Thank you.

Aha, thank you.

This matrix comes from the evaluation of a polynomial basis at specific interpolation points (a Vandermonde matrix). So I guess I will have to look more carefully for a better-conditioned basis.

Thank you very much. I’ll mark your last post as the solution, although all the comments were very valuable.


SpecialMatrices.jl has a special algorithm for solving Vandermonde systems if I understand their documentation correctly. Maybe it’s worth looking at that.
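
Usage would presumably look something like this (an untried sketch based on that reading of the docs, so the exact API may differ):

using SpecialMatrices

x = collect(0.1:0.1:1.0)   # interpolation nodes; the matrix is defined by them
V = Vandermonde(x)         # V[i,j] = x[i]^(j-1)
b = randn(length(x))

c = V \ b   # specialized O(n^2) Björck–Pereyra solve instead of a generic LU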

Writing a polynomial in the monomial basis {1,x,x^2,x^3,\ldots} is notorious for leading to ill-conditioning problems (it gives a condition number that grows exponentially with the degree). It’s much better to use another basis, e.g. a Chebyshev basis (and Chebyshev sampling points if possible).

(If you are just doing polynomial interpolation or regression, see e.g. FastChebInterp.jl or DataInterpolations.jl. For more sophisticated applications of polynomial bases, see ApproxFun.jl.)
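
Just to illustrate how much the basis and the sampling points matter, a toy comparison:

using LinearAlgebra

n = 40
t_equi = range(-1, 1, length=n)                    # equispaced nodes
t_cheb = [cos((2k - 1) * π / (2n)) for k in 1:n]   # Chebyshev nodes

V_mono = [t^j for t in t_equi, j in 0:n-1]               # monomials at equispaced points
V_cheb = [cos(j * acos(t)) for t in t_cheb, j in 0:n-1]  # Chebyshev polynomials T_j at Chebyshev points

@show cond(V_mono)   # astronomically large
@show cond(V_cheb)   # ≈ 1.4 (the columns are orthogonal up to scaling)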


More efficient solver algorithms don’t change the fact that the matrix is exponentially ill-conditioned. As soon as you use the monomial basis {1,x,x^2,\ldots} you are sunk.


You should definitely follow the advice from @stevengj about the general strategy on ill-conditioned problems. Vandermonde matrices with real nodes are ill-conditioned, unless they are very small. Finding yourself in the position of solving that sort of system almost certainly means you made an unfortunate decision in formulating your problem. We all like to reduce things to linear systems, but it’s really very common to take a problem that has a reasonable solution with a reasonable condition number and convert it into a linear algebra problem with a much worse condition number.

With that said, as more of a footnote because I think it’s a neat fact, I’d add that the Björck–Pereyra algorithm, which appears to be in the SpecialMatrices.jl package, has some remarkable stability properties when applied to some Vandermonde systems. In particular, if the nodes are real, nonnegative, and sorted in increasing order, and if the right-hand side has alternating signs in its elements, then the Björck–Pereyra algorithm computes the solution to high relative accuracy, regardless of the huge condition number of the matrix. This is from an error analysis by N. Higham and is described in his book Accuracy and Stability of Numerical Algorithms (SIAM).

A side effect of that is that if you pick right-hand sides equal to the columns of the identity, they satisfy the alternating sign condition (zero works here for either sign) and you can compute the columns of the inverse to high relative accuracy. So if your nodes are nonnegative and sorted, you actually could get an inverse with high relative accuracy by using this approach. However, you still shouldn’t use it to solve linear systems. Using even an exact inverse (if you could magically get one) to solve a linear system (with rounding errors only in the multiplication by the inverse) is still worse than a backward stable solver like Gaussian elimination with suitable pivoting.

Note that this is not a recommendation, unless perhaps your goal really does involve solving a Vandermonde system with nonnegative, sorted nodes and a right-hand side with alternating signs. But even then, any time you get a badly conditioned matrix, it’s good to look back at how you got the matrix in the first place and whether it’s really necessary.


Thanks @stevengj and @mstewart. I know about the problems of the monomial basis; I use the Legendre-Dubiner basis for interpolation on triangular domains.

I was not aware of these packages but I definitely will take a look at them.

@IlianPihlajamaa, as Steven mentioned, I think that ill-conditioning is something inherent to the matrix itself, not to the algorithm used to solve the linear system. I may have misunderstood your words, though.

I think a Bernstein–Bézier basis may provide some improvement in terms of stability.

Thank you for your help, even if it is a topic beyond Julia!
