Here’s the problem: you have a system of linear equations, Ax = b, that is too large to solve with Gaussian elimination or whatever real linear algebra toolboxes use. Instead, it would be nice to have an algorithm that you can run for a certain amount of time, stop, and get a result that is a good approximation. There’s a class of algorithms called “Iterative Methods” which do just that, and I want to explain the idea behind them for my own sake.
The key to iterative methods is splitting A into two parts: D and R (or S and T, or P and P - A; nobody uses the same names for these matrices). Now we have the formula Dx + Rx = b, where D + R = A. As a small leap, give the two x's separate labels: D x_a + R x_b = b. We could start guessing values for x_b, solve for x_a, and keep tweaking x_b until the two x's agree, but that seems like an ass-backwards way of solving linear equations. But hell, let's give it a shot and make our lives easier by solving for x_a:
D x_a + R x_b = b
x_a = D^-1 (b - R x_b)   (by matrix arithmetic)
Now that we have our formula all spelled out, we notice that there's a really computationally inefficient matrix inversion in it, so whatever our separation of D and R is, we want D to be easy to invert (e.g. a diagonal or triangular matrix).
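To make the split concrete, here's what it looks like in Matlab for a tiny example (the same A that shows up in the code further down); D is just the diagonal, so inverting it only means taking the reciprocals of two numbers:
A = [2 -1; -1 2];
D = diag(diag(A));   % [2 0; 0 2], trivially invertible
R = A - D;           % [0 -1; -1 0], whatever is left over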
The key to this algorithm being practical is that for some matrices A, D, and R, the computed x_a turns out to be a better guess than the x_b you fed in, so if you chain the operation together iteratively, you get a sequence of x's that approaches the real x:
for k=1:n,  x_{k+1} = D^-1 (b - R x_k),  end
What we're seeing is our guesses converging. The property to look for in your choice of matrices is that the largest absolute eigenvalue (the spectral radius) of D^-1 R is less than 1.
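This condition is cheap to check in Matlab for a small example (eig returns the eigenvalues, and the largest magnitude is the spectral radius). Using the same split as above:
A = [2 -1; -1 2]; D = diag(diag(A)); R = A - D;   % same split as above
rho = max(abs(eig(D \ R)))                        % 0.5 here, so this iteration converges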
Proof: First I'm going to work out an error function e, where e(k) (written e_k below) tells us how far x_k (x at iteration k) is from the real x by returning x - x_k. Then I'll show that the error goes to zero when the spectral radius of D^-1 R is less than 1.
Original formula:  Dx + Rx = b
Iterative formula: D x_{k+1} + R x_k = b
Subtracting (iterative formula) from (original formula):  D(x - x_{k+1}) + R(x - x_k) = 0
so (x - x_{k+1}) = -D^-1 R (x - x_k), i.e. e_{k+1} = -D^-1 R e_k  (by algebra plus substitution of terms; the minus sign flips the error's direction each step but not its size, so it doesn't affect convergence)
e(k) = e_k = (-D^-1 R)^k e_0  (by recursively chaining the update together), so the size of e_k is governed by the powers (D^-1 R)^k
D^-1 R = Q Λ Q^-1, where Q is a matrix of eigenvectors and Λ is the diagonal matrix of eigenvalues  (by eigendecomposition)
(D^-1 R)^k = (Q Λ Q^-1)^k = Q Λ^k Q^-1  (the interior Q^-1 Q pairs cancel)
In this form it is pretty clear that we want the entries of Λ to have absolute value less than one, so that Λ^k, and with it the whole matrix, goes to zero as k increases.
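You can watch this happen numerically on the same example; the powers of D^-1 R shrink toward zero because both of its eigenvalues have magnitude 1/2:
A = [2 -1; -1 2]; D = diag(diag(A)); R = A - D;   % same split as before
M = D \ R;                                        % eigenvalues are +0.5 and -0.5
[norm(M^1), norm(M^5), norm(M^10)]                % roughly 0.5, 0.031, 0.001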
So to recap: if the spectral radius of D^-1 R is less than 1, we have an efficient way of converging on a correct solution to Ax = b. But how do we select D and R?
Here is where the technique diverges:
Jacobi’s Method: D is the diagonal of A
Gauss-Seidel Method: D is the lower-triangular part of A (diagonal included)
Successive Over-Relaxation (SOR): a tweak on Gauss-Seidel that adds a relaxation factor ω, blending the new update with the previous iterate (a rough Matlab sketch of all three splits follows below).
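In Matlab, the three splits look roughly like this (a sketch only; the ω value is a placeholder picked for illustration, and choosing a good one is its own topic):
A = [2 -1; -1 2];
S_jacobi = diag(diag(A));                      % Jacobi: diagonal only
S_gs     = tril(A);                            % Gauss-Seidel: lower-triangular part, diagonal included
omega    = 1.2;                                % SOR relaxation factor (illustrative value)
S_sor    = diag(diag(A))/omega + tril(A,-1);   % SOR: omega = 1 recovers Gauss-Seidel
% in every case T = S - A, and the iteration is x = S \ (T*x + b)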
I’ve included the Matlab code I wrote for Jacobi’s Method:
A = [2 -1; -1 2];
b = [2; 2];                       % the solution to this system is [2; 2]
S = diag(diag(A));                % Jacobi: S is just the diagonal of A
T = S - A;                        % so that A = S - T
S_inv = 1 ./ S;                   % easy to calculate inverse, that's the point
S_inv(isinf(S_inv)) = 0;          % zero out the Infs from dividing by the off-diagonal zeros
M = S_inv * T;                    % iteration matrix
curr_x = zeros(2,1);
for k = 1:10,
    next_x = M*curr_x + S_inv*b;  % x_{k+1} = S^-1 (T x_k + b); note S_inv*b, not M*b
    curr_x = next_x;
end
I was expecting [2, 2] and after 10 iterations I got [1.998, 1.998]. Good stuff! As an aside, for this example you can predict the convergence rate straight from the iteration matrix: M = S_inv*T has entries of 0 and 1/2 and eigenvalues of ±1/2, so the error shrinks by about 50% per iteration. Here's a graph of the residuals showing exactly that 50% decline. (In general it's the eigenvalues that matter, though; entries below 1 on their own don't guarantee convergence.)
I also wrote some code for Gauss-Seidel, but I think you’ve got the idea.
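If you want to see it anyway, here is a minimal Gauss-Seidel sketch in the same style (an illustrative version, not the code mentioned above; it assumes the same A and b). S is now lower-triangular, which is still cheap to "invert" because S \ v is just forward substitution:
A = [2 -1; -1 2];
b = [2; 2];
S = tril(A);             % lower-triangular part of A, diagonal included
T = S - A;               % so that A = S - T
x = zeros(2,1);
for k = 1:10,
    x = S \ (T*x + b);   % forward substitution each step, no explicit inverse
end
x                        % close to [2; 2] after 10 iterations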