Linear Equation Algorithms¶
We want to solve the linear equation
$$Ax = b,$$
where $A \in \mathbb{R}^{n \times n}$ is an invertible matrix and $b \in \mathbb{R}^n$.
Basic Facts of Linear Equation Solvers¶
Time Complexity. The time complexity of solving a linear equation is generally $O(n^3)$ for a dense $n \times n$ matrix.
Sensitivity. The sensitivity of solving a linear equation depends on the condition number of the matrix $A$.
Definition (Condition Number of a Matrix). The condition number of a matrix $A$ is
$$\kappa(A) = \|A\| \, \|A^{-1}\|.$$
Under the spectral norm, $\kappa(A) = \sigma_{\max}(A) / \sigma_{\min}(A)$, the ratio of the largest to the smallest singular value.
Theorem (Sensitivity of Linear Equation Solver). Suppose $A$ is invertible, $Ax = b$ with $b \neq 0$, and the perturbed system satisfies $A(x + \delta x) = b + \delta b$; then the following holds:
$$\frac{\|\delta x\|}{\|x\|} \le \kappa(A) \, \frac{\|\delta b\|}{\|b\|}.$$
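To see the theorem in action, here is a minimal sketch (my own illustration; the Hilbert matrix and the size n = 10 are arbitrary choices) that perturbs b slightly and checks how much the solution moves:

import numpy as np
from scipy.linalg import hilbert, solve

# The Hilbert matrix is a classic ill-conditioned example
n = 10
A = hilbert(n)
x_true = np.ones(n)
b = A @ x_true

# Perturb b by a tiny relative amount
rng = np.random.default_rng(0)
delta_b = 1e-10 * np.linalg.norm(b) * rng.standard_normal(n)
x_pert = solve(A, b + delta_b)

print(f'kappa(A) = {np.linalg.cond(A):.2e}')
print(f'relative input error:  {np.linalg.norm(delta_b) / np.linalg.norm(b):.2e}')
# The output error can be amplified by up to a factor of kappa(A)
print(f'relative output error: {np.linalg.norm(x_pert - x_true) / np.linalg.norm(x_true):.2e}')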
Algorithms to Solve Linear Equations¶
We will only give a high-level overview of the algorithms for solving linear equations. Linear solvers are so well developed now that, most of the time, it is efficient enough to use the built-in functions of your programming language.
The most important philosophy in designing a numerical linear algebra algorithm is to reduce the general problem to several simpler problems, and the most common way to do this is to use different kinds of matrix decompositions. For linear equations, the simple building block is a triangular system: $Lx = b$ with $L$ triangular can be solved by forward (or backward) substitution in $O(n^2)$ time.
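As a tiny illustration (the 3 × 3 system below is made up), forward substitution is exactly what scipy.linalg.solve_triangular does:

import numpy as np
from scipy.linalg import solve_triangular

# Solve L x = b for a lower triangular L by forward substitution, O(n^2)
L = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [4.0, 5.0, 6.0]])
b = np.array([2.0, 4.0, 15.0])
x = solve_triangular(L, b, lower=True)
print(x)  # [1. 1. 1.]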
LU Decomposition. Gaussian elimination factors $A = LU$ with $L$ lower triangular and $U$ upper triangular, so $Ax = b$ reduces to two triangular solves. The numpy.linalg.solve or scipy.linalg.solve function solves the linear equation by this method.
Cholesky Decomposition. The LU decomposition needs approximately $2n^3/3$ flops. When $A$ is symmetric positive definite, the Cholesky decomposition $A = LL^\top$ (with $L$ lower triangular) needs only about $n^3/3$ flops, half the cost of LU.
However, you do not need to implement the Cholesky decomposition by yourself. The linalg.solve function has a built-in assume_a parameter:
- assume_a='gen': general matrix (the default)
- assume_a='sym': symmetric matrix
- assume_a='pos': symmetric positive definite matrix
The following code compares the running time of the standard linalg.solve function and the Cholesky decomposition method.
import numpy as np
from scipy import linalg
import time
# Create a symmetric positive definite matrix
n = 5000
np.random.seed(0)
X = np.random.rand(n, n)
A = np.dot(X, X.T) # Symmetric positive definite (a Gram / sample covariance matrix)
b = np.random.rand(n)
# Time the general-purpose scipy.linalg.solve()
start_time = time.time()
x1 = linalg.solve(A, b)
linalg_time = time.time() - start_time
# Use positive definite argument
start_time = time.time()
x3 = linalg.solve(A, b, assume_a='pos')
solve_pos_time = time.time() - start_time
# Use symmetric argument
start_time = time.time()
x4 = linalg.solve(A, b, assume_a='sym')
solve_sym_time = time.time() - start_time
# Manual Cholesky decomposition
start_time = time.time()
L = linalg.cholesky(A, lower=True) # scipy's cholesky returns the upper factor by default
y = linalg.solve_triangular(L, b, lower=True) # Forward substitution
x2 = linalg.solve_triangular(L.T, y, lower=False) # Backward substitution
cholesky_time = time.time() - start_time
# Print results
print(f'General solve Time: {linalg_time:.6f} seconds') # 0.669566 seconds
print(f'Symmetric solve Time: {solve_sym_time:.6f} seconds') # 1.184631 seconds
print(f'Symmetric positive definite solve Time: {solve_pos_time:.6f} seconds') # 0.468433 seconds
print(f'Manual Cholesky Decomposition Time: {cholesky_time:.6f} seconds') # 0.369518 seconds
The manual Cholesky decomposition is slightly faster than the linalg.solve function with the assume_a='pos' argument. Still, most of the time we do not need to implement the Cholesky decomposition ourselves. You can refer to the scipy documentation for more details.
Note. Both scipy and numpy have a linalg module. However, the scipy.linalg module has more functions, such as lu, cholesky, solve_triangular, etc.
Tips for Solving Linear Equations¶
Tip 1. Never use inv(A) @ B. Always use solve(A, B). Forming the explicit inverse inv(A) is both slower and less numerically stable than solving the system directly.
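A quick sketch of why (the matrix size and seed are arbitrary choices): the explicit inverse costs an extra matrix product and typically loses more accuracy than a direct solve.

import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
n = 2000
A = rng.standard_normal((n, n))
x_true = rng.standard_normal(n)
b = A @ x_true

x_solve = linalg.solve(A, b)  # LU-based direct solve
x_inv = linalg.inv(A) @ b     # explicit inverse: slower and less accurate

print(np.linalg.norm(x_solve - x_true))  # typically the smaller error
print(np.linalg.norm(x_inv - x_true))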
Tip 2. Even if you need to solve the linear equation $Ax = b$ for many different right-hand sides $b$, do not compute inv(A). Instead, you should save the LU decomposition of $A$ and reuse its triangular factors, since each additional solve then costs only $O(n^2)$:
import numpy as np
from scipy.linalg import lu, solve_triangular
A = np.random.rand(100, 100)
b = np.random.rand(100)
# Perform LU decomposition with partial pivoting
P, L, U = lu(A)
# scipy returns A = P @ L @ U, so solve L y = P^T b by forward substitution
y = solve_triangular(L, P.T @ b, lower=True)
x = solve_triangular(U, y) # Backward substitution with upper triangular U
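scipy also packages this pattern directly: lu_factor computes the pivoted factorization once, and lu_solve reuses it for each new right-hand side, so you do not have to manage P, L, U yourself.

import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.random.rand(100, 100)
lu_piv = lu_factor(A)  # factor A once, O(n^3)

# Each additional right-hand side is only a pair of triangular solves, O(n^2)
for _ in range(5):
    b = np.random.rand(100)
    x = lu_solve(lu_piv, b)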
Tip 3. When the matrix to invert has a special structure, exploit it. A classical example is the following identity.

Woodbury Matrix Identity. If $A \in \mathbb{R}^{n \times n}$ and $C \in \mathbb{R}^{k \times k}$ are invertible, and $U \in \mathbb{R}^{n \times k}$, $V \in \mathbb{R}^{k \times n}$, then
$$(A + UCV)^{-1} = A^{-1} - A^{-1} U \left(C^{-1} + V A^{-1} U\right)^{-1} V A^{-1}.$$

For example, when solving a ridge regression
$$\hat{\beta} = (X^\top X + \lambda I_p)^{-1} X^\top y, \qquad X \in \mathbb{R}^{n \times p},$$
we can use the Woodbury matrix identity to speed up the computation when the dimension $p$ is much larger than the sample size $n$: since $(X^\top X + \lambda I_p)^{-1} X^\top = X^\top (X X^\top + \lambda I_n)^{-1}$, we only need to solve an $n \times n$ system instead of a $p \times p$ one.
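A minimal sketch of this trick (the sizes n = 50, p = 2000 and the penalty lam = 1.0 are arbitrary choices): the two formulas agree, but the second only solves an n × n system.

import numpy as np
from scipy.linalg import solve

rng = np.random.default_rng(0)
n, p, lam = 50, 2000, 1.0  # far more features than samples
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# Direct formula: solve the p x p system, O(p^3)
beta_direct = solve(X.T @ X + lam * np.eye(p), X.T @ y, assume_a='pos')

# Woodbury / push-through identity: solve the n x n system, O(n^3)
beta_fast = X.T @ solve(X @ X.T + lam * np.eye(n), y, assume_a='pos')

print(np.allclose(beta_direct, beta_fast))  # True up to rounding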
Stability of LU Decomposition¶
Sometimes we loosely say "the LU decomposition of $A$", but scipy's linalg.lu actually returns three matrices, P, L, U = lu(A). It applies Gaussian elimination with partial pivoting to get a matrix factorization
$$A = PLU,$$
where $P$ is a permutation matrix, $L$ is lower triangular, and $U$ is upper triangular.
For example, consider what happens when we factor the following matrix without pivoting:
$$A = \begin{pmatrix} \epsilon & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ \epsilon^{-1} & 1 \end{pmatrix} \begin{pmatrix} \epsilon & 1 \\ 0 & 1 - \epsilon^{-1} \end{pmatrix}.$$
When $\epsilon$ is tiny (say $10^{-20}$ in double precision), $1 - \epsilon^{-1}$ rounds to $-\epsilon^{-1}$, and the computed factors multiply back to
$$\begin{pmatrix} 1 & 0 \\ \epsilon^{-1} & 1 \end{pmatrix} \begin{pmatrix} \epsilon & 1 \\ 0 & -\epsilon^{-1} \end{pmatrix} = \begin{pmatrix} \epsilon & 1 \\ 1 & 0 \end{pmatrix};$$
that is, a rounding error in the (huge) entry $1 - \epsilon^{-1}$ completely destroys the $(2,2)$ entry of $A$.
Notice that the condition number of $A$ itself is small (about $2.6$ as $\epsilon \to 0$), so the problem is well-conditioned; the instability comes from the factorization, whose factors contain the enormous entry $\epsilon^{-1}$.
To avoid the numerical instability, we should swap the rows of $A$ first and factor
$$\begin{pmatrix} 1 & 1 \\ \epsilon & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ \epsilon & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 - \epsilon \end{pmatrix}.$$
Even if we round $1 - \epsilon$ off to $1$, the product of the computed factors differs from the true matrix only by $O(\epsilon)$, which is harmless.
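We can reproduce the rounding behavior numerically (eps = 1e-20 is my choice; any eps below machine precision works):

import numpy as np

eps = 1e-20
A = np.array([[eps, 1.0],
              [1.0, 1.0]])

# LU factors of A computed WITHOUT pivoting
L = np.array([[1.0, 0.0],
              [1.0 / eps, 1.0]])
U = np.array([[eps, 1.0],
              [0.0, 1.0 - 1.0 / eps]])  # 1 - 1e20 rounds to -1e20 in float64
print(L @ U)  # [[1e-20, 1.], [1., 0.]] -- the (2,2) entry of A is destroyed

# After swapping the rows (partial pivoting), the factors are harmless
L2 = np.array([[1.0, 0.0],
               [eps, 1.0]])
U2 = np.array([[1.0, 1.0],
               [0.0, 1.0 - eps]])
print(L2 @ U2)  # [[1., 1.], [1e-20, 1.]] -- matches the row-swapped A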
This idea leads to the algorithm of Gaussian elimination with partial pivoting, which we will not discuss in detail. You can read a textbook on numerical linear algebra for more details.