I made a break in my effort to understand how Ce Liu’s algorithm works before I start rewriting portions of it. The appendix of his thesis lists a paper that I read named “The Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields” by Black and Anandan, and now I understand a lot more code. Here’s my summary:
In optical flow, we can use brightness constancy to say the following:
Error(u,v) = ∑
(I(x,y,t) – I(x+udt, y+ vdt, t+dt))2
(I(x,y,t) – I(x+udt, y+ vdt, t+dt))2
That is, the error E is a function of the displacement vectors u and v, and we can solve this equation to find them. We don’t square the term because it is important to our model; it is largely for convenience sake. With the square, we can take the derivative of the formula and get a linear equation on the right side and solve for when the left side, Error’(u,v), equals 0. Since the equation is convex (one local minimum), we know we’ve minimized the error. This is all pretty straightforward stuff I’ve worked on before.
However, there’s a catch. This method only works if data fits the model 100%. For instance, if brightness doesn’t hold because of depth discontinuity or multiple motions in the same window, then we’ll have an outlying point whose affect will be squared. The problem is similar to means and medians: the mean is sensitive to being skewed by an extreme outlier while the median is not. It would be nice if we had a median-like function that is as easy to solve for u and v with.
ρ: error normalization
function, Ψ: influence function (proportional to derivative of ρ)
function, Ψ: influence function (proportional to derivative of ρ)
Obviously, care should be taken
with choosing an error normalization function after considering issues of
convexity and rates of convergence.
with choosing an error normalization function after considering issues of
convexity and rates of convergence.
In Ce Liu’s (the code I’m trying to
understand), he’s using
understand), he’s using
E(u, v)
=ψ(|I(p+w) − I(p)|2) + αφ(|∇u|2 + |∇v|2)dp
=ψ(|I(p+w) − I(p)|2) + αφ(|∇u|2 + |∇v|2)dp
ψ(x) = φ(x) =√x2 + ε2, (ε=0.001)
Interestingly, both of these are very non-convex, but Liu points out in his thesis that a dense Gaussian pyramid can overcome bad local minimum.
These extra functions have the effect of weighing the continuous gradient constraint (|∇u|2 + |∇v|2 = 0) against the constant brightness constraint (I(p+w) − I(p) = 0). Once all the math is worked out, it looks like each constraint is weighted inversely proportional to the amount of perceived error it has. This makes a lot of sense: suppress one constraint if that part of the model isn’t holding up.
No comments:
Post a Comment