Solving Linear Systems with Iterative Methods

Teng-Jui Lin

Content adapted from UW AMATH 301, Beginning Scientific Computing, in Spring 2020.

  • Solving linear systems with iterative methods

    • Jacobi method

    • Gauss-Seidel method

  • Convergence criteria of the methods

Iterative methods are useful for solving systems with large, sparse matrices, for which direct methods such as Gaussian elimination become expensive.

Jacobi method

Goal: solve the linear system \(\mathbf{Ax=b}\).

We split the matrix \(\mathbf{A}\) into a diagonal matrix \(\mathbf{D}\) and a matrix of the rest \(\mathbf{T}\):

\[ \mathbf{A = D+T} \]

so that

\[\begin{split} \begin{aligned} \mathbf{Ax} &= \mathbf{b} \\ \mathbf{(D+T)x} &= \mathbf{b} \\ \mathbf{Dx+Tx} &= \mathbf{b} \\ \mathbf{Dx} &= \mathbf{-Tx+b} \\ \end{aligned} \end{split}\]

for the exact solution. Adding iteration subscripts, we have

\[\begin{split} \begin{aligned} \mathbf{Dx}_{k+1} &= \mathbf{-Tx}_{k}+\mathbf{b} \\ \mathbf{x}_{k+1} &= \mathbf{D}^{-1}(\mathbf{-Tx}_{k}+\mathbf{b}) \\ \mathbf{x}_{k+1} &= -\mathbf{D}^{-1}\mathbf{Tx}_{k}+\mathbf{D}^{-1}\mathbf{b} \end{aligned} \end{split}\]

We can write this as

\[ \mathbf{x}_{k+1} = \mathbf{Mx}_{k}+\mathbf{c} \]

where \(\mathbf{M} = -\mathbf{D}^{-1}\mathbf{T}\) is the iteration matrix, and \(\mathbf{c=D}^{-1}\mathbf{b}\). Note that inverting a diagonal matrix costs only \(\mathcal{O}(n)\) operations.
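Written componentwise, the Jacobi update is

\[ x_i^{(k+1)} = \frac{1}{a_{ii}}\left(b_i - \sum_{j \neq i} a_{ij} x_j^{(k)}\right), \]

so every component of the new iterate depends only on the previous iterate, and all components can be updated independently.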

We calculate \(\mathbf{T}\) by \(\mathbf{A = D+T \implies T=A-D}\).
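
As a minimal illustration of the splitting (a sketch using the example matrix from the implementation below):

import numpy as np

A = np.array([[7, 1, 1, 1], [0, 15, 2, 9], [3, 3, 20, 0], [1, -1, -2, -12]])
D = np.diag(np.diag(A))          # diagonal part of A
T = A - D                        # off-diagonal remainder
D_inv = np.diag(1 / np.diag(A))  # elementwise reciprocals: O(n)
M = -D_inv @ T                   # Jacobi iteration matrix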

Implementation

Problem Statement. Use the Jacobi method to solve the linear system \(\mathbf{Ax=b}\) where

\[\begin{split} \mathbf{A} = \begin{bmatrix} 7 & 1 & 1 & 1 \\ 0 & 15 & 2 & 9 \\ 3 & 3 & 20 & 0 \\ 1 & -1 & -2 & -12 \\ \end{bmatrix}, \mathbf{x} = \begin{bmatrix} a \\ b \\ c \\ d \\ \end{bmatrix}, \mathbf{b} = \begin{bmatrix} 1 \\ -3 \\ 12 \\ 2 \\ \end{bmatrix} \end{split}\]
import numpy as np
def jacobi(A, b, x0, tolerance=1e-6, max_iter=50000):
    '''
    Solves linear system Ax=b using Jacobi iteration method
    
    :param A: matrix of the system
    :param b: column vector of the right hand side
    :param x0: initial guess
    :param tolerance: tolerance for stopping criteria
    :param max_iter: maximum iterations allowed for calculation
    :returns: final solution to Ax=b
    '''
    
    D = np.diag(np.diag(A))
    T = A - D
    M = -np.linalg.solve(D, T)
    c = np.linalg.solve(D, b)
    
    x_old = x0
    i = 0  # iteration counter
    change = tolerance * 2  # stopping criteria
    
    # jacobi method logic
    while change > tolerance and i < max_iter:
        x_new = M@x_old + c
        change = np.linalg.norm(x_new - x_old, np.inf)
        x_old = x_new
        i += 1
        
    # maximum iteration warning
    if i == max_iter and change > tolerance:
        import warnings
        warnings.warn(f'Maximum iteration reached. Current stopping criteria is {change}', UserWarning)
        
    return x_new
A = np.array([[7, 1, 1, 1], [0, 15, 2, 9], [3, 3, 20, 0], [1, -1, -2, -12]])
b = np.array([1, -3, 12, 2])
x0 = np.array([0, 0, 0, 0])
jacobi(A, b, x0)
array([ 0.11085635, -0.1322629 ,  0.60321108, -0.24694186])
# compare with numpy solve
np.linalg.solve(A, b)
array([ 0.11085627, -0.132263  ,  0.60321101, -0.2469419 ])

Gauss-Seidel method

Goal: solve the linear system \(\mathbf{Ax=b}\).

We split the matrix \(\mathbf{A}\) into a lower triangular matrix \(\mathbf{S}\) (including the diagonal) and the strictly upper triangular remainder \(\mathbf{T}\):

\[ \mathbf{A = S+T} \]

so that

\[\begin{split} \begin{aligned} \mathbf{Ax} &= \mathbf{b} \\ \mathbf{(S+T)x} &= \mathbf{b} \\ \mathbf{Sx+Tx} &= \mathbf{b} \\ \mathbf{Sx} &= \mathbf{-Tx+b} \\ \end{aligned} \end{split}\]

for the exact solution. Adding iteration subscripts, we have

\[\begin{split} \begin{aligned} \mathbf{Sx}_{k+1} &= \mathbf{-Tx}_{k}+\mathbf{b} \\ \mathbf{x}_{k+1} &= \mathbf{S}^{-1}(\mathbf{-Tx}_{k}+\mathbf{b}) \\ \mathbf{x}_{k+1} &= -\mathbf{S}^{-1}\mathbf{Tx}_{k}+\mathbf{S}^{-1}\mathbf{b} \end{aligned} \end{split}\]

We can write this as

\[ \mathbf{x}_{k+1} = \mathbf{Mx}_{k}+\mathbf{c} \]

where \(\mathbf{M} = -\mathbf{S}^{-1}\mathbf{T}\) is the iteration matrix, and \(\mathbf{c=S}^{-1}\mathbf{b}\).
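
Because \(\mathbf{S}\) contains the lower triangle of \(\mathbf{A}\), the componentwise update is

\[ x_i^{(k+1)} = \frac{1}{a_{ii}}\left(b_i - \sum_{j < i} a_{ij} x_j^{(k+1)} - \sum_{j > i} a_{ij} x_j^{(k)}\right), \]

so each new component immediately uses the components already updated in the current sweep; this is the key difference from the Jacobi method.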

We calculate \(\mathbf{T}\) by \(\mathbf{A = S+T \implies T=A-S}\).
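
A minimal sketch of the splitting, using the example matrix from the implementation below. Since \(\mathbf{S}\) is triangular, applying \(\mathbf{S}^{-1}\) amounts to forward substitution at \(\mathcal{O}(n^2)\) cost, for example with scipy.linalg.solve_triangular:

import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[7, 1, 1, 1], [0, 15, 2, 9], [3, 3, 20, 0], [1, -1, -2, -12]])
b = np.array([1, -3, 12, 2])
S = np.tril(A)  # lower triangle of A, including the diagonal
T = A - S       # strictly upper triangular remainder
c = solve_triangular(S, b, lower=True)  # forward substitution, no explicit inverse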

Implementation

Problem Statement. Use the Gauss-Seidel method to solve the linear system \(\mathbf{Ax=b}\) where

\[\begin{split} \mathbf{A} = \begin{bmatrix} 7 & 1 & 1 & 1 \\ 0 & 15 & 2 & 9 \\ 3 & 3 & 20 & 0 \\ 1 & -1 & -2 & -12 \\ \end{bmatrix}, \mathbf{x} = \begin{bmatrix} a \\ b \\ c \\ d \\ \end{bmatrix}, \mathbf{b} = \begin{bmatrix} 1 \\ -3 \\ 12 \\ 2 \\ \end{bmatrix} \end{split}\]
def gauss_seidel(A, b, x0, tolerance=1e-6, max_iter=50000):
    '''
    Solves linear system Ax=b using Gauss-Seidel iteration method
    
    :param A: matrix of the system
    :param b: column vector of the right hand side
    :param x0: initial guess
    :param tolerance: tolerance for stopping criteria
    :param max_iter: maximum iterations allowed for calculation
    :returns: final solution to Ax=b
    '''
    
    S = np.tril(A)
    T = A - S
    M = -np.linalg.solve(S, T)
    c = np.linalg.solve(S, b)
    
    x_old = x0
    i = 0  # iteration counter
    change = tolerance * 2  # stopping criteria
    
    # gauss seidel method logic
    while change > tolerance and i < max_iter:
        x_new = M@x_old + c
        change = np.linalg.norm(x_new - x_old, np.inf)
        x_old = x_new
        i += 1
    
    # maximum iteration warning
    if i == max_iter and change > tolerance:
        import warnings
        warnings.warn(f'Maximum iteration reached. Current stopping criteria is {change}', UserWarning)
    
    return x_new
A = np.array([[7, 1, 1, 1], [0, 15, 2, 9], [3, 3, 20, 0], [1, -1, -2, -12]])
b = np.array([1, -3, 12, 2])
x0 = np.array([0, 0, 0, 0])
gauss_seidel(A, b, x0)
array([ 0.11085636, -0.13226307,  0.60321101, -0.24694188])
# compare with numpy solve
np.linalg.solve(A, b)
array([ 0.11085627, -0.132263  ,  0.60321101, -0.2469419 ])

Convergence of iterative methods

Strict diagonal dominance (SDD)

A matrix is strictly diagonally dominant when, in every row, the absolute value of the diagonal element is greater than the sum of the absolute values of the off-diagonal elements in that row.

SDD is a weak convergence criterion: it is sufficient but not necessary. If the matrix \(\mathbf{A}\) is SDD, both methods will converge; if the matrix is not SDD, the test is inconclusive.
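
As a quick sketch (using the example matrix from the implementations above), strict diagonal dominance can be checked row by row:

import numpy as np

A = np.array([[7, 1, 1, 1], [0, 15, 2, 9], [3, 3, 20, 0], [1, -1, -2, -12]])
diag = np.abs(np.diag(A))
off_diag = np.abs(A).sum(axis=1) - diag  # row sums of off-diagonal magnitudes
print(diag > off_diag)  # True in every row, so this A is SDD

Since every row passes, both methods are guaranteed to converge for this matrix.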

Eigenvalues of the iteration matrix \(\mathbf{M}\)

In iteration methods, we have

\[ \mathbf{x}_{k+1} = \mathbf{Mx}_{k}+\mathbf{c}, \]

but for exact solution, we have

\[ \mathbf{x}= \mathbf{Mx}+\mathbf{c}. \]

Subtracting these two equations, the constant \(\mathbf{c}\) cancels, and the error \(\varepsilon_k = \mathbf{x-x}_k\) is governed by the iteration matrix:

\[\begin{split} \begin{aligned} \mathbf{x-x}_{k+1} &= \mathbf{M(x-x}_{k}) \\ \varepsilon_{k+1} &= \mathbf{M}\varepsilon_{k} \end{aligned} \end{split}\]

In general, the iteration matrix \(\mathbf{M}\) determines whether the error vector grows or decays:

\[ \varepsilon_{k} = \mathbf{M}^k \varepsilon_{0} \]

Convergence criteria:

  1. If all the eigenvalues of \(\mathbf{M}\) satisfy \(|\lambda_i|<1\), the method converges.

  2. If any eigenvalue of \(\mathbf{M}\) satisfies \(|\lambda_i|>1\), the method diverges.

  3. The closer the eigenvalues of \(\mathbf{M}\) are to 0 in magnitude, the faster the method converges.

Usually (but not always), the eigenvalues of the Gauss-Seidel iteration matrix are smaller in magnitude than those of the Jacobi iteration matrix.
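
As a sketch of this error dynamic (using the example system from the implementations above), we can track \(\|\mathbf{x}-\mathbf{x}_k\|_\infty\) over a few Jacobi iterations; each step should shrink the error by roughly the largest eigenvalue magnitude of \(\mathbf{M}\):

import numpy as np

A = np.array([[7, 1, 1, 1], [0, 15, 2, 9], [3, 3, 20, 0], [1, -1, -2, -12]])
b = np.array([1, -3, 12, 2])
D = np.diag(np.diag(A))
M = -np.linalg.solve(D, A - D)  # Jacobi iteration matrix
c = np.linalg.solve(D, b)

x_exact = np.linalg.solve(A, b)  # reference solution
x = np.zeros(4)                  # initial guess, so epsilon_0 = x_exact
for k in range(5):
    x = M @ x + c
    print(k + 1, np.linalg.norm(x_exact - x, np.inf))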

Implementation

Problem Statement. Determine whether the Jacobi and Gauss-Seidel methods will converge for solving the linear system \(\mathbf{Ax=b}\) where

\[\begin{split} \mathbf{A} = \begin{bmatrix} 7 & 1 & 1 & 1 \\ 0 & 15 & 2 & 9 \\ 3 & 3 & 20 & 0 \\ 1 & -1 & -2 & -12 \\ \end{bmatrix}, \mathbf{x} = \begin{bmatrix} a \\ b \\ c \\ d \\ \end{bmatrix}, \mathbf{b} = \begin{bmatrix} 1 \\ -3 \\ 12 \\ 2 \\ \end{bmatrix} \end{split}\]

Compare their speed of convergence.

# convergence of jacobi method
A = np.array([[7, 1, 1, 1], [0, 15, 2, 9], [3, 3, 20, 0], [1, -1, -2, -12]])
D = np.diag(np.diag(A))
T = A - D
M = -np.linalg.solve(D, T)
lambdas = np.linalg.eigvals(M)
max_lambda = max(abs(lambdas))
max_lambda
0.3520564053849318
# convergence of gauss-seidel method
A = np.array([[7, 1, 1, 1], [0, 15, 2, 9], [3, 3, 20, 0], [1, -1, -2, -12]])
S = np.tril(A)
T = A - S
M = -np.linalg.solve(S, T)
lambdas = np.linalg.eigvals(M)
max_lambda = max(abs(lambdas))
max_lambda
0.10591293846976324

In this case, both iterative methods converge, since every eigenvalue of each iteration matrix has magnitude less than 1. The Gauss-Seidel iteration matrix has the smaller maximum eigenvalue magnitude, so it converges faster than the Jacobi method.
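
As a rough estimate (assuming the error shrinks by a factor of the spectral radius each iteration), the number of iterations needed to reach a tolerance of \(10^{-6}\) follows from the two values computed above:

import numpy as np

tol = 1e-6
for name, rho in [('Jacobi', 0.352), ('Gauss-Seidel', 0.106)]:
    # error ~ rho^k, so rho^k < tol once k > log(tol) / log(rho)
    print(name, int(np.ceil(np.log(tol) / np.log(rho))), 'iterations (approx.)')

By this estimate, the Gauss-Seidel method needs roughly half as many iterations as the Jacobi method for this system.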