Here’s the problem: you have a system of linear equations, Ax = b, that is too large to solve with Gaussian elimination or whatever real linear algebra toolboxes use. Instead, it would be nice to have an algorithm that you can run for a certain amount of time, stop, and get a result that is a good approximation. There’s a class of algorithms called “Iterative Methods” which do just that, and I want to explain the idea behind them for my own sake.
The key to iterative methods is splitting A into two parts: D and R (or S and T, or P and P - A; nobody uses the same names for these matrices). Now we have the formula Dx + Rx = b, where D + R = A. As a small leap, give the two x's separate labels: D x_a + R x_b = b. We could start guessing values for x_b, solve for x_a, and keep tweaking x_b until the two x's agree, but that seems like an ass-backwards way of solving linear equations. But hell, let's give it a shot and make our lives easier by solving for x_a:
D x_a + R x_b = b
x_a = D^-1 (b - R x_b)   (by matrix arithmetic)
Now that we have our formula all spelled out, we notice that there's a really computationally inefficient matrix inversion in it, so whatever our separation of D and R is, we want D to be easy to invert (e.g. a diagonal or triangular matrix).
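To make the split concrete, here's what it looks like in Matlab for a tiny example (the same A that shows up in the code further down); D is just the diagonal, so inverting it only means taking the reciprocals of two numbers:
A = [2 -1; -1 2];
D = diag(diag(A));   % [2 0; 0 2], trivially invertible
R = A - D;           % [0 -1; -1 0], whatever is left over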
The key to this algorithm being practical is that for some matrices A, D, and R, the computed x_a turns out to be a better guess than the x_b you fed in, so if you chain the operation together iteratively, you get a sequence of x's that approaches the real x:
for k=1:n,  x_{k+1} = D^-1 (b - R x_k),  end
What we're seeing is our guesses converging. The property to look for in your choice of matrices is that the largest absolute eigenvalue (the spectral radius) of D^-1 R is less than 1.
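This condition is cheap to check in Matlab for a small example (eig returns the eigenvalues, and the largest magnitude is the spectral radius). Using the same split as above:
A = [2 -1; -1 2]; D = diag(diag(A)); R = A - D;   % same split as above
rho = max(abs(eig(D \ R)))                        % 0.5 here, so this iteration converges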
Proof: First I'm going to work out an error function e, where e(k) (written e_k below) tells us how far x_k (x at iteration k) is from the real x by returning x - x_k. Then I'll show that the error goes to zero when the spectral radius of D^-1 R is less than 1.
Original formula:  Dx + Rx = b
Iterative formula: D x_{k+1} + R x_k = b
Subtracting (iterative formula) from (original formula):  D(x - x_{k+1}) + R(x - x_k) = 0
so (x - x_{k+1}) = -D^-1 R (x - x_k), i.e. e_{k+1} = -D^-1 R e_k  (by algebra plus substitution of terms; the minus sign flips the error's direction each step but not its size, so it doesn't affect convergence)
e(k) = e_k = (-D^-1 R)^k e_0  (by recursively chaining the update together), so the size of e_k is governed by the powers (D^-1 R)^k
D^-1 R = Q Λ Q^-1, where Q is a matrix of eigenvectors and Λ is the diagonal matrix of eigenvalues  (by eigendecomposition)
(D^-1 R)^k = (Q Λ Q^-1)^k = Q Λ^k Q^-1  (the interior Q^-1 Q pairs cancel)
In this form it is pretty clear that we want the entries of Λ to have absolute value less than one, so that Λ^k, and with it the whole matrix, goes to zero as k increases.
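You can watch this happen numerically on the same example; the powers of D^-1 R shrink toward zero because both of its eigenvalues have magnitude 1/2:
A = [2 -1; -1 2]; D = diag(diag(A)); R = A - D;   % same split as before
M = D \ R;                                        % eigenvalues are +0.5 and -0.5
[norm(M^1), norm(M^5), norm(M^10)]                % roughly 0.5, 0.031, 0.001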
So to recap: if the spectral radius of D^-1 R is less than 1, we have an efficient way of converging on a correct solution to Ax = b. But how do we select D and R?
Here is where the technique diverges:
Jacobi’s Method: D is the diagonal of A
Gauss-Seidel Method: D is the lower-triangular part of A (diagonal included)
Successive Over-Relaxation (SOR): a tweak on Gauss-Seidel that adds a relaxation factor ω, blending the new update with the previous iterate (a rough Matlab sketch of all three splits follows below).
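In Matlab, the three splits look roughly like this (a sketch only; the ω value is a placeholder picked for illustration, and choosing a good one is its own topic):
A = [2 -1; -1 2];
S_jacobi = diag(diag(A));                      % Jacobi: diagonal only
S_gs     = tril(A);                            % Gauss-Seidel: lower-triangular part, diagonal included
omega    = 1.2;                                % SOR relaxation factor (illustrative value)
S_sor    = diag(diag(A))/omega + tril(A,-1);   % SOR: omega = 1 recovers Gauss-Seidel
% in every case T = S - A, and the iteration is x = S \ (T*x + b)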
I’ve included the Matlab code I wrote for Jacobi’s Method:
A = [2 -1; -1 2];
b = [2; 2];                       % the solution to this system is [2; 2]
S = diag(diag(A));                % Jacobi: S is just the diagonal of A
T = S - A;                        % so that A = S - T
S_inv = 1 ./ S;                   % easy to calculate inverse, that's the point
S_inv(isinf(S_inv)) = 0;          % zero out the Infs from dividing by the off-diagonal zeros
M = S_inv * T;                    % iteration matrix
curr_x = zeros(2,1);
for k = 1:10,
    next_x = M*curr_x + S_inv*b;  % x_{k+1} = S^-1 (T x_k + b); note S_inv*b, not M*b
    curr_x = next_x;
end
I was expecting [2, 2] and after 10 iterations I got [1.998, 1.998]. Good stuff! As an aside, for this example you can predict the convergence rate straight from the iteration matrix: M = S_inv*T has entries of 0 and 1/2 and eigenvalues of ±1/2, so the error shrinks by about 50% per iteration. Here's a graph of the residuals showing exactly that 50% decline. (In general it's the eigenvalues that matter, though; entries below 1 on their own don't guarantee convergence.)
I also wrote some code for Gauss-Seidel, but I think you’ve got the idea.
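If you want to see it anyway, here is a minimal Gauss-Seidel sketch in the same style (an illustrative version, not the code mentioned above; it assumes the same A and b). S is now lower-triangular, which is still cheap to "invert" because S \ v is just forward substitution:
A = [2 -1; -1 2];
b = [2; 2];
S = tril(A);             % lower-triangular part of A, diagonal included
T = S - A;               % so that A = S - T
x = zeros(2,1);
for k = 1:10,
    x = S \ (T*x + b);   % forward substitution each step, no explicit inverse
end
x                        % close to [2; 2] after 10 iterations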