This module introduces and motivates $\ell_1$ minimization as a framework for sparse recovery.

As we will see later in this course, there now exists a wide variety of approaches to recover a sparse signal $x$ from a small number of linear measurements. We begin by considering a natural first approach to the problem of sparse recovery.

Given measurements $y = \Phi x$ and the knowledge that our original signal $x$ is sparse or compressible, it is natural to attempt to recover $x$ by solving an optimization problem of the form

$$\hat{x} = \arg\min_z \|z\|_0 \quad \text{subject to} \quad z \in \mathcal{B}(y),$$

where $\mathcal{B}(y)$ ensures that $\hat{x}$ is consistent with the measurements $y$. Recall that $\|z\|_0 = |\mathrm{supp}(z)|$ simply counts the number of nonzero entries in $z$, so [link] seeks out the sparsest signal consistent with the observed measurements. For example, if our measurements are exact and noise-free, then we can set $\mathcal{B}(y) = \{ z : \Phi z = y \}$. When the measurements have been contaminated with a small amount of bounded noise, we could instead set $\mathcal{B}(y) = \{ z : \|\Phi z - y\|_2 \le \epsilon \}$. In both cases, [link] finds the sparsest $x$ that is consistent with the measurements $y$.
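To make the objective and the two consistency sets concrete, here is a minimal sketch assuming Python with NumPy; the matrix Phi, the 5-sparse signal x, and the tolerance eps are illustrative placeholders rather than quantities fixed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 20, 100
Phi = rng.standard_normal((M, N)) / np.sqrt(M)                    # placeholder measurement matrix
x = np.zeros(N)
x[rng.choice(N, size=5, replace=False)] = rng.standard_normal(5)  # placeholder 5-sparse signal
y = Phi @ x                                                        # noise-free measurements

def l0_norm(z):
    """||z||_0 = |supp(z)|: the number of nonzero entries of z."""
    return np.count_nonzero(z)

def in_B_exact(z, Phi, y):
    """Membership in B(y) = {z : Phi z = y} (exact, noise-free measurements)."""
    return np.allclose(Phi @ z, y)

def in_B_noisy(z, Phi, y, eps):
    """Membership in B(y) = {z : ||Phi z - y||_2 <= eps} (bounded noise)."""
    return np.linalg.norm(Phi @ z - y) <= eps

print(l0_norm(x), in_B_exact(x, Phi, y), in_B_noisy(x, Phi, y, eps=1e-3))
```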

Note that in [link] we are inherently assuming that $x$ itself is sparse. In the more common setting where $x = \Psi \alpha$, we can easily modify the approach and instead consider

$$\hat{\alpha} = \arg\min_z \|z\|_0 \quad \text{subject to} \quad z \in \mathcal{B}(y),$$

where $\mathcal{B}(y) = \{ z : \Phi \Psi z = y \}$ or $\mathcal{B}(y) = \{ z : \|\Phi \Psi z - y\|_2 \le \epsilon \}$. By setting $\widetilde{\Phi} = \Phi \Psi$ we see that [link] and [link] are essentially identical. Moreover, as noted in "Matrices that satisfy the RIP", in many cases the introduction of $\Psi$ does not significantly complicate the construction of matrices $\Phi$ such that $\widetilde{\Phi}$ will satisfy the desired properties. Thus, for most of the remainder of this course we will restrict our attention to the case where $\Psi = I$. It is important to note, however, that this restriction does impose certain limits on our analysis when $\Psi$ is a general dictionary and not an orthonormal basis. For example, in this case $\|\hat{x} - x\|_2 = \|\Psi \hat{\alpha} - \Psi \alpha\|_2 \ne \|\hat{\alpha} - \alpha\|_2$, and thus a bound on $\|\hat{\alpha} - \alpha\|_2$ cannot directly be translated into a bound on $\|\hat{x} - x\|_2$, which is often the metric of interest.
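The reduction $\widetilde{\Phi} = \Phi \Psi$ and the caveat about general dictionaries can be checked numerically. The sketch below assumes Python with NumPy; all matrices are illustrative placeholders, with $\Psi$ taken either as an orthonormal basis (obtained from a QR factorization) or as an arbitrary square dictionary.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 20, 64
Phi = rng.standard_normal((M, N)) / np.sqrt(M)        # placeholder measurement matrix

Psi_ortho, _ = np.linalg.qr(rng.standard_normal((N, N)))  # orthonormal basis: Psi.T @ Psi = I
Psi_dict = rng.standard_normal((N, N))                     # general (non-orthonormal) dictionary

# The reduction: measuring x = Psi @ alpha with Phi is the same as measuring
# alpha directly with Phi_tilde = Phi @ Psi.
Phi_tilde = Phi @ Psi_ortho
alpha = rng.standard_normal(N)
print(np.allclose(Phi_tilde @ alpha, Phi @ (Psi_ortho @ alpha)))   # True

# The caveat: ||Psi a - Psi b||_2 equals ||a - b||_2 only when Psi is orthonormal.
alpha_hat = rng.standard_normal(N)
for Psi in (Psi_ortho, Psi_dict):
    coef_err = np.linalg.norm(alpha_hat - alpha)                # ||alpha_hat - alpha||_2
    sig_err = np.linalg.norm(Psi @ alpha_hat - Psi @ alpha)     # ||x_hat - x||_2
    print(np.isclose(coef_err, sig_err))   # True for the orthonormal Psi, False in general
```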

Although it is possible to analyze the performance of [link] under the appropriate assumptions on $\Phi$, we do not pursue this strategy since the objective function $\|\cdot\|_0$ is nonconvex, and hence [link] is potentially very difficult to solve. In fact, one can show that for a general matrix $\Phi$, even finding a solution that approximates the true minimum is NP-hard. One avenue for translating this problem into something more tractable is to replace $\|\cdot\|_0$ with its convex approximation $\|\cdot\|_1$. Specifically, we consider

$$\hat{x} = \arg\min_z \|z\|_1 \quad \text{subject to} \quad z \in \mathcal{B}(y).$$

Provided that $\mathcal{B}(y)$ is convex, [link] is computationally feasible. In fact, when $\mathcal{B}(y) = \{ z : \Phi z = y \}$, the resulting problem can be posed as a linear program [link].
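As an illustration of the linear-programming formulation, here is a minimal sketch assuming Python with NumPy and SciPy's linprog. It uses the standard split $z = u - v$ with $u, v \ge 0$, so that $\|z\|_1 = \sum_i u_i + \sum_i v_i$; the matrix Phi and signal x below are illustrative placeholders.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit_lp(Phi, y):
    """Solve min ||z||_1 subject to Phi z = y as a linear program."""
    M, N = Phi.shape
    c = np.ones(2 * N)                 # objective: sum(u) + sum(v) = ||z||_1
    A_eq = np.hstack([Phi, -Phi])      # equality constraint: Phi u - Phi v = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
    u, v = res.x[:N], res.x[N:]
    return u - v                       # recovered z

rng = np.random.default_rng(2)
M, N = 30, 100
Phi = rng.standard_normal((M, N)) / np.sqrt(M)                    # placeholder measurement matrix
x = np.zeros(N)
x[rng.choice(N, size=5, replace=False)] = rng.standard_normal(5)  # placeholder sparse signal
y = Phi @ x

x_hat = basis_pursuit_lp(Phi, y)
print(np.linalg.norm(x_hat - x))       # small when recovery succeeds
```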

Figure: Best approximation of a point in $\mathbb{R}^2$ by a one-dimensional subspace using the $\ell_1$ norm and the $\ell_p$ quasinorm with $p = \tfrac{1}{2}$.

It is clear that replacing [link] with [link] transforms a computationally intractable problem into a tractable one, but it may not be immediately obvious that the solution to [link] will be at all similar to the solution to [link]. However, there are certainly intuitive reasons to expect that the use of $\ell_1$ minimization will indeed promote sparsity. As an example, recall the example discussed earlier and shown in [link]. In this case the solution to the $\ell_1$ minimization problem coincides exactly with the solution to the $\ell_p$ minimization problem for any $p < 1$ and, notably, is sparse. Moreover, the use of $\ell_1$ minimization to promote or exploit sparsity has a long history, dating back at least to the work of Beurling on Fourier transform extrapolation from partial observations [link].

Additionally, in a somewhat different context, in 1965 Logan [link] showed that a bandlimited signal can be perfectly recovered in the presence of arbitrary corruptions on a small interval. Again, the recovery method consists of searching for the bandlimited signal that is closest to the observed signal in the $\ell_1$ norm. This can be viewed as further validation of the intuition gained from [link]: the $\ell_1$ norm is well-suited to sparse errors.

Historically, the use of $\ell_1$ minimization on large problems finally became practical with the explosion of computing power in the late 1970s and early 1980s. In one of its first applications, it was demonstrated that geophysical signals consisting of spike trains could be recovered from only the high-frequency components of these signals by exploiting $\ell_1$ minimization [link], [link], [link]. Finally, in the 1990s there was renewed interest in these approaches within the signal processing community for the purpose of finding sparse approximations to signals and images when represented in overcomplete dictionaries or unions of bases [link], [link]. Separately, $\ell_1$ minimization received significant attention in the statistics literature as a method for variable selection in linear regression, known as the Lasso [link].

Thus, there are a variety of reasons to suspect that $\ell_1$ minimization will provide an accurate method for sparse signal recovery. More importantly, it also provides a computationally tractable approach to the sparse signal recovery problem. We now provide an overview of $\ell_1$ minimization in both the noise-free and noisy settings from a theoretical perspective. Algorithms for performing $\ell_1$ minimization are discussed later in this course.

Source:  OpenStax, An introduction to compressive sensing. OpenStax CNX. Apr 02, 2011 Download for free at http://legacy.cnx.org/content/col11133/1.5