Criteria of Numerical Algorithms¶

There are two major tasks in numerical linear algebra:

Solving linear systems
Computing eigenvalues and eigenvectors

As this course is an introductory course of general computation, we will not cover the details of algorithms solving these two problems. Instead, we will focus on the practical implementations. We suggest the readers to refer to the textbooks on numerical linear algebra for more details.

Complexity of Matrix Operations¶

We discussed the two important standards for a good algorithm in the Complexity Analysis:

Time complexity: \(A\) is \(\ell\) by \(m\) matrix, \(B\) is \(m\) by \(n\) matrix, the time complexity of \(AB\) is \(O(\ell m n)\).
Space complexity: Storage of \(A\) and \(B\) requires \(O(\ell m + m n)\) space.

Actually, the time complexity of multiplying two \(n\) by \(n\) matrices can be faster than \(O(n^3)\). Let \(\omega\) be the exponent of the time complexity \(O(n^\omega)\) of multiplying two \(n\) by \(n\) matrices. The naive range is \(2 \le \omega \le 3\). It is still an open problem to find the minimum \(\omega\). Till 2024, the best \(\omega\) is 2.371339. We refer the readers to the Wikipedia page for more details.

Sensitivity and Stability¶

Besides the time complexity and space complexity, we also need to consider the following two standards for numerical algorithms:

Sensitivity: How much the numerical problem is affected by the small perturbations in the input data
Stability: When running the algorithm, how do the roundoff (and other) errors affect the accuracy of the computed solution.

Notice that the sensitivity is fully determined by the problem itself and independent of the algorithm to solve it, while the stability is determined by the algorithm.

Sensitivity Analysis¶

Definition(Sensitivity). We want to compute \(f(x)\) for a given \(x\). If \(x\) is perturbed by \(\delta x\), the output is \(f(x + \delta x)\). Whyen \(|\delta x|/|x|\) is small, we want to find a constant \(c(x)\) as small as possible such that

\[ |f(x + \delta x) - f(x)| \le c(x) \frac{|\delta x|}{|x|}. \]

We call \(c(x)\) the condition number of the problem. We say the problem is well-conditioned if \(c(x)\) is bounded, and ill-conditioned if \(c(x)\) is large.

We can see that when \(f(x)\) is differentiable

\[ c(x) \approx \frac{|f'(x)||x|}{|f(x)|}. \]

For complex problems, it is not easy to compute the condition number. We will discuss the condition number of linear algebra problems in the following sections.

Backward Error Analysis¶

We all know that computers using the floating point arithmetic. For example, suppose the computer can only store 2 digits after the decimal point. When we compute \(\pi + e\), the computer actually computes float\((\pi + e) = 3.14 + 2.72 = 5.86\) with a roundoff error \(\epsilon \approx 0.00012\). Therefore, it is important to analyze how the roundoff errors affect the accuracy of algorithms. Even if a numerical problem is well-conditioned, a bad algorithm can still lead to a large error.

Definition(Backward Error Analysis). Suppose \(\hat{f}\) is the algorithm to compute \(f\). Suppose there exists a \(\delta x\) such that

\[ \hat{f}(x) = f(x + \delta x), \quad |\delta x| \le \epsilon |x|, \]

where \(\epsilon\) relates to the machine precision and the problem. We say the algorithm is numerically stable if \(\epsilon\) is small.

Combining the sensitivity analysis and the backward error analysis, we can have the overall error bound of the result:

\[ \frac{|\hat{f}(x) - f(x)|}{|f(x)|} = \frac{|f(x + \delta x) - f(x)|}{|f(x)|} \le c(x) \frac{|\delta x|}{|x|} \le c(x) \epsilon. \]

We can get a reliable result when we apply a numerically stable algorithm to a well-conditioned problem.

Here is a more concrete example of stable and unstable algorithms.

Example.(Stability of square root) Many algorithms compute \(\sqrt{2}\) by starting with an initial approximation \(x_0\) to \(\sqrt{2}\), for instance \(x_0 = 1.4\), and then computing improved guesses \(x_1, x_2\), etc.

One such method is the famous Newton-Raphson method. To solve \(f(x) = 0\), we start with an initial guess \(x_0\) and update it by:

\[ x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}. \]

For \(f(x) = x^2 - 2\), we have

\[ x_{k+1} = \frac{x_k + \frac{2}{x_k}}{2} \]

Another method, called "method X", is given by:

\[ x_{k+1} = (x_k^2 - 2)^2 + x_k, \]

where \(\sqrt{2}\) is the fixed point.

A few iterations of each method are calculated in plots below, with initial guesses \(x_0 = 1.4\) and \(x_0 = 1.42\).

Iteration	Newton (x₀=1.4)	Newton(x₀=1.42)	Method X (x₀=1.4)	Method X (x₀=1.42)
x₀	1.4	1.42	1.4	1.42
x₁	1.4142857...	1.41422535...	1.4016	1.42026896
x₂	1.414213564...	1.41421356242...	1.4028614...	1.42056...
...	...	...	...	...
Final	x₁,₀₀₀,₀₀₀ = 1.41421...	-	x₂₇ = 7280.2284...	-

You can see why the stability of two methods are different by the plot below.

Tip: Be careful with multiplication and division involving vastly different magnitudes. Be careful with subtraction of two nearly identical numbers (this is called catastrophic cancellation).

Example.(Catastrophic cancellation) The loss of significance can be demonstrated using the following two mathematically equivalent functions:

\[ f(x) = x\left(\sqrt{x+1} - \sqrt{x} \right), \quad g(x) = \frac{x}{\sqrt{x+1} + \sqrt{x}}. \]

Now, evaluating these functions at \(x = 500\): The true value, computed with infinite precision, is \(11.174755\ldots\). If a system is limited to storing only the four most significant decimal digits, we have

\[ f(500) = 500\left(\sqrt{501} - \sqrt{500}\right) = 500\left(22.38 - 22.36\right) = 500(0.02) = 10 \]

\[ g(500) = \frac{500}{\sqrt{501} + \sqrt{500}} = \frac{500}{22.38 + 22.36} = \frac{500}{44.74} = 11.17. \]

Comparing the computed values, it is evident that loss of significance occurs. This is due to catastrophic cancellation when subtracting approximations of two nearly identical values, \(\sqrt{501}\) and \(\sqrt{500}\), even though the subtraction is performed exactly.