Before the long answer, the short answer is that x probably had a large norm, that you got a surprisingly small residual for your problem, and that you can't really expect to do better simply by switching algorithms for solving the system.
In fact, assuming you used double precision, that's small enough that it makes me think you either had a really small problem and got lucky, had some really special problem structure, or made a mistake in computing the relative residual. From the limited information I have, I can't tell whether you should feel lucky or suspicious. If it were me, I'd start with suspicion and double check:
norm(b - A * x, Inf)
opnorm(A, Inf)
norm(x, Inf)
If you mixed norms and did something like use the Frobenius norm of A (i.e. norm(A)) on a larger problem, that might give an unexpectedly small result. That's the only mistake I can think of that would be easy to make, in part because Matlab's norm(A) is the induced 2-norm rather than the Frobenius norm, which of course will trip up anyone coming from Matlab. If for those values
norm(b-A*x, Inf) / (opnorm(A, Inf) * norm(x, Inf))
is really about 10^{-24}, I suppose you can feel lucky, although seeing something like that unexplained bugs me. If it’s closer to 10^{-16} you can still feel very satisfied.
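Just to be concrete, here is a minimal sketch of that check in Julia. The random test problem is only a stand-in; substitute your own A, b, and computed x:

```julia
using LinearAlgebra

# Stand-in problem; replace with your own A, b, and computed x.
n = 1000
A = randn(n, n)
b = randn(n)
x = A \ b

r = b - A * x
relres = norm(r, Inf) / (opnorm(A, Inf) * norm(x, Inf))  # consistent Inf-norms

@show norm(r, Inf)
@show opnorm(A, Inf)
@show norm(x, Inf)
@show norm(A)      # Frobenius norm in Julia (Matlab's norm(A) is the 2-norm)
@show relres
```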
The reason for all this is that there is a theorem that says that there exists a matrix \Delta A with \|\Delta A\| / \|A\| equal to the relative residual for which your computed solution \hat{x} satisfies
(A + \Delta A) \hat{x} = b
exactly. If the relative residual is close to the unit roundoff (about 10^{-16} in double precision), this means that A is close to a matrix for which your computed solution is the exact solution. Since \|\Delta A\| / \|A\| is comparable to the unit roundoff, perturbing A by \Delta A doesn't introduce any errors worse than those already present: whatever your exact matrix A should have been, the elements of the stored A must presumably have been rounded to fit in Float64 anyway. The matrix \Delta A is called a backward error, and having a relative backward error close to the unit roundoff is generally interpreted as the best you can do. (There are stronger componentwise versions of this that you might aim for, and more specialized forward error results for some problems.)
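If you want to see the theorem in action, the 2-norm version of it (the Rigal–Gaches result) gives an explicit rank-one perturbation that achieves the bound. This is just an illustrative sketch on a made-up problem, not something you would compute as part of a solve:

```julia
using LinearAlgebra

n = 200
A = randn(n, n)
b = randn(n)
x̂ = A \ b
r = b - A * x̂

# Rank-one perturbation for which x̂ solves the perturbed system exactly.
ΔA = r * x̂' / dot(x̂, x̂)

@show norm((A + ΔA) * x̂ - b, Inf)       # zero up to roundoff
@show opnorm(ΔA) / opnorm(A)            # matches the 2-norm relative residual...
@show norm(r) / (opnorm(A) * norm(x̂))   # ...computed directly
```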
For a given backward error, if you define
\delta = \frac{\|\Delta A\|}{\|A\|} = \frac{\|b - A\hat{x}\|}{\|A\| \|\hat{x}\|}
and the condition number \kappa(A) = \|A\| \|A^{-1}\| using the same matrix norm, then the standard perturbation bound says that you should expect your computed solution to satisfy
\frac{\| x - \hat{x}\|}{\|\hat{x}\|} \leq \kappa(A) \delta
That might still be horrible relative accuracy in \hat{x}, but if \delta is small it's inherent to your problem: the issue is the condition number, not the algorithm. If you are unhappy with \hat{x} at that point, your recourse is to consider whether you can change the problem, perhaps by regularizing it, but perhaps also by taking a completely different approach to whatever larger problem you are trying to solve. Ill-conditioning is sometimes an artifact of a bad choice of basis (or a bad choice of something else) that produces an ill-conditioned matrix down the line. However, anything you do to change your problem is going to be highly specific to what you are trying to do.
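To put rough numbers on that bound, here's a sketch (again with a made-up problem, so the names are just placeholders) comparing \kappa(A)\delta to the actual error when the true solution is known:

```julia
using LinearAlgebra

n = 500
A = randn(n, n)
x_true = randn(n)
b = A * x_true            # exact solution known, up to rounding in forming b
x̂ = A \ b

δ = norm(b - A * x̂, Inf) / (opnorm(A, Inf) * norm(x̂, Inf))
κ = cond(A, Inf)

@show δ
@show κ * δ                                  # bound on the relative error
@show norm(x_true - x̂, Inf) / norm(x̂, Inf)   # actual relative error
```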