This module establishes concentration bounds for sub-Gaussian vectors and matrices.

Sub-Gaussian distributions have a close relationship to the concentration of measure phenomenon [link]. To illustrate this, we note that we can combine Lemma 2 and Theorem 1 from "Sub-Gaussian random variables" to obtain deviation bounds for weighted sums of sub-Gaussian random variables. For our purposes, however, it will be more enlightening to study the norm of a vector of sub-Gaussian random variables. In particular, if $X$ is a vector where each $X_i$ is i.i.d. with $X_i \sim \text{Sub}(c^2)$, then we would like to know how $\|X\|_2$ deviates from its mean.

In order to establish the result, we will make use of Markov's inequality for nonnegative random variables.

(Markov's inequality)

For any nonnegative random variable $X$ and $t > 0$,

$$\mathbb{P}(X \ge t) \le \frac{\mathbb{E}(X)}{t}.$$

Let $f(x)$ denote the probability density function for $X$.

$$\mathbb{E}(X) = \int_0^\infty x f(x)\,dx \ge \int_t^\infty x f(x)\,dx \ge t \int_t^\infty f(x)\,dx = t\,\mathbb{P}(X \ge t).$$
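As a quick numerical sanity check, the following sketch (assuming NumPy; the exponential distribution, sample size, and threshold are illustrative choices, not part of the module) compares the empirical tail probability to the Markov bound:

import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)  # nonnegative samples with E(X) = 1
t = 3.0
print(np.mean(x >= t))  # empirical P(X >= t), roughly exp(-3), about 0.05
print(x.mean() / t)     # Markov bound E(X)/t, about 0.33

The bound holds but is loose, which is typical: Markov's inequality uses only the mean of the distribution.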

In addition, we will require the following bound on the exponential moment of a sub-Gaussian random variable.

Suppose $X \sim \text{Sub}(c^2)$. Then

$$\mathbb{E}\left[\exp\left(\lambda X^2 / 2c^2\right)\right] \le \frac{1}{\sqrt{1 - \lambda}},$$

for any $\lambda \in [0, 1)$.

First, observe that if $\lambda = 0$, then the lemma holds trivially. Thus, suppose that $\lambda \in (0, 1)$. Let $f(x)$ denote the probability density function for $X$. Since $X$ is sub-Gaussian, we have by definition that

$$\int_{-\infty}^{\infty} \exp(tx) f(x)\,dx \le \exp\left(c^2 t^2 / 2\right)$$

for any $t \in \mathbb{R}$. If we multiply by $\exp\left(-c^2 t^2 / 2\lambda\right)$, then we obtain

$$\int_{-\infty}^{\infty} \exp\left(tx - \frac{c^2 t^2}{2\lambda}\right) f(x)\,dx \le \exp\left(\frac{c^2 t^2 (\lambda - 1)}{2\lambda}\right).$$

Now, integrating both sides with respect to $t$, we obtain

$$\int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} \exp\left(tx - \frac{c^2 t^2}{2\lambda}\right) dt \right) f(x)\,dx \le \int_{-\infty}^{\infty} \exp\left(\frac{c^2 t^2 (\lambda - 1)}{2\lambda}\right) dt,$$

which, after evaluating the Gaussian integrals over $t$ (using $\int_{-\infty}^{\infty} e^{-a t^2 + b t}\,dt = \sqrt{\pi/a}\,e^{b^2/4a}$ for $a > 0$), reduces to

$$\frac{\sqrt{2\pi\lambda}}{c} \int_{-\infty}^{\infty} \exp\left(\frac{\lambda x^2}{2 c^2}\right) f(x)\,dx \le \frac{\sqrt{2\pi\lambda}}{c}\,\frac{1}{\sqrt{1 - \lambda}}.$$

Dividing both sides by $\sqrt{2\pi\lambda}/c$ proves the lemma.
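For a standard Gaussian, which satisfies $\text{Sub}(c^2)$ with $c^2 = 1$, the inequality in the lemma holds with equality, so the bound cannot be improved in general. A brief Monte Carlo check (a minimal sketch assuming NumPy; the values of $\lambda$ and the sample size are arbitrary choices):

import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(1_000_000)  # X ~ N(0, 1) is Sub(c^2) with c^2 = 1
for lam in (0.2, 0.5, 0.8):
    moment = np.mean(np.exp(lam * x**2 / 2))  # estimates E[exp(lambda X^2 / 2 c^2)]
    bound = 1.0 / np.sqrt(1.0 - lam)          # the lemma's bound 1/sqrt(1 - lambda)
    print(lam, moment, bound)                 # moment matches bound for Gaussians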

We now state our main theorem, which generalizes the results of [link] and uses substantially the same proof technique.

Suppose that $X = [X_1, X_2, \dots, X_M]$, where each $X_i$ is i.i.d. with $X_i \sim \text{Sub}(c^2)$ and $\mathbb{E}(X_i^2) = \sigma^2$. Then

$$\mathbb{E}\left[\|X\|_2^2\right] = M\sigma^2.$$

Moreover, for any $\alpha \in (0, 1)$ and for any $\beta \in [c^2/\sigma^2, \beta_{\max}]$, there exists a constant $\kappa^* \ge 4$ depending only on $\beta_{\max}$ and the ratio $\sigma^2/c^2$ such that

$$\mathbb{P}\left(\|X\|_2^2 \le \alpha M \sigma^2\right) \le \exp\left(-M(1 - \alpha)^2 / \kappa^*\right)$$

and

$$\mathbb{P}\left(\|X\|_2^2 \ge \beta M \sigma^2\right) \le \exp\left(-M(\beta - 1)^2 / \kappa^*\right).$$

By linearity of expectation, we obtain

$$\mathbb{E}\left[\|X\|_2^2\right] = \sum_{i=1}^{M} \mathbb{E}\left[X_i^2\right] = \sum_{i=1}^{M} \sigma^2 = M\sigma^2,$$

and hence the expectation claim holds. We now turn to the two tail bounds, and consider the upper bound first. We begin by applying Markov's inequality to $\exp\left(\lambda \|X\|_2^2\right)$ for $\lambda > 0$, and then use the independence of the $X_i$:

$$\mathbb{P}\left(\|X\|_2^2 \ge \beta M \sigma^2\right) = \mathbb{P}\left(\exp\left(\lambda \|X\|_2^2\right) \ge \exp\left(\lambda \beta M \sigma^2\right)\right) \le \frac{\mathbb{E}\left[\exp\left(\lambda \|X\|_2^2\right)\right]}{\exp\left(\lambda \beta M \sigma^2\right)} = \frac{\prod_{i=1}^{M} \mathbb{E}\left[\exp\left(\lambda X_i^2\right)\right]}{\exp\left(\lambda \beta M \sigma^2\right)}.$$

Since $X_i \sim \text{Sub}(c^2)$, we have from the lemma above (applied with $2c^2\lambda$ in place of $\lambda$, which requires $0 < \lambda < 1/2c^2$) that

$$\mathbb{E}\left[\exp\left(\lambda X_i^2\right)\right] = \mathbb{E}\left[\exp\left(2c^2\lambda X_i^2 / 2c^2\right)\right] \le \frac{1}{\sqrt{1 - 2c^2\lambda}}.$$

Thus,

$$\prod_{i=1}^{M} \mathbb{E}\left[\exp\left(\lambda X_i^2\right)\right] \le \left(\frac{1}{1 - 2c^2\lambda}\right)^{M/2},$$

and hence

$$\mathbb{P}\left(\|X\|_2^2 \ge \beta M \sigma^2\right) \le \left(\frac{\exp\left(-2\lambda\beta\sigma^2\right)}{1 - 2c^2\lambda}\right)^{M/2}.$$

By setting the derivative of (the logarithm of) the right-hand side to zero and solving for $\lambda$, one can show that the bound is minimized at

$$\lambda = \frac{\beta\sigma^2 - c^2}{2c^2\sigma^2\beta}.$$

Indeed, differentiating $-2\lambda\beta\sigma^2 - \log(1 - 2c^2\lambda)$ and setting the result to zero yields $1 - 2c^2\lambda = c^2/\beta\sigma^2$; note that $\lambda \ge 0$ precisely because $\beta \ge c^2/\sigma^2$.

Plugging this in, we obtain

$$\mathbb{P}\left(\|X\|_2^2 \ge \beta M \sigma^2\right) \le \left(\frac{\beta\sigma^2}{c^2} \exp\left(1 - \frac{\beta\sigma^2}{c^2}\right)\right)^{M/2}.$$

Similarly,

$$\mathbb{P}\left(\|X\|_2^2 \le \alpha M \sigma^2\right) \le \left(\frac{\alpha\sigma^2}{c^2} \exp\left(1 - \frac{\alpha\sigma^2}{c^2}\right)\right)^{M/2}.$$

In order to combine and simplify these inequalities, note that if we define

$$\kappa^* = \max\left(4, \; \frac{2\left(\beta_{\max}\sigma^2/c^2 - 1\right)^2}{\left(\beta_{\max}\sigma^2/c^2 - 1\right) - \log\left(\beta_{\max}\sigma^2/c^2\right)}\right)$$

then for any $\gamma \in \left[0, \beta_{\max}\sigma^2/c^2\right]$ we have the bound

$$\log(\gamma) \le (\gamma - 1) - \frac{2(\gamma - 1)^2}{\kappa^*},$$

and hence

$$\gamma \le \exp\left((\gamma - 1) - \frac{2(\gamma - 1)^2}{\kappa^*}\right).$$

Setting $\gamma = \alpha\sigma^2/c^2$ and applying this estimate to the lower-tail inequality above yields the first bound of the theorem. Similarly, setting $\gamma = \beta\sigma^2/c^2$ establishes the second.
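The tail bounds are straightforward to probe empirically. The sketch below (assuming NumPy; Gaussian entries, for which $\sigma^2 = c^2 = 1$, and the choices of $M$, $\alpha$, $\beta$, and trial count are all illustrative) compares observed tail frequencies against the theorem with $\beta_{\max} = 2$:

import numpy as np

rng = np.random.default_rng(2)
M, trials = 100, 100_000
norms_sq = (rng.standard_normal((trials, M)) ** 2).sum(axis=1)  # ||X||_2^2 per trial

kappa = 2.0 / (1.0 - np.log(2.0))  # kappa* when beta_max = 2 and sigma^2 = c^2
alpha, beta = 0.5, 1.5
print(np.mean(norms_sq >= beta * M), np.exp(-M * (beta - 1) ** 2 / kappa))   # upper tail vs. bound
print(np.mean(norms_sq <= alpha * M), np.exp(-M * (1 - alpha) ** 2 / kappa)) # lower tail vs. bound

In both cases the empirical frequency falls well below the corresponding bound.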

This result tells us that, given a vector with entries drawn according to a sub-Gaussian distribution, we can expect the squared norm of the vector to concentrate around its expected value of $M\sigma^2$ with exponentially high probability as $M$ grows. Note, however, that the range of allowable choices for $\beta$ in the theorem is limited to $\beta \ge c^2/\sigma^2 \ge 1$. Thus, for a general sub-Gaussian distribution, we may be unable to achieve arbitrarily tight concentration. However, recall that for strictly sub-Gaussian distributions we have $c^2 = \sigma^2$, in which case there is no such restriction. Moreover, for strictly sub-Gaussian distributions we also have the following useful corollary. The corollary exploits the strictness of the strictly sub-Gaussian distribution twice: first to ensure that $(1, 2]$ is an admissible range for $\beta$, and then to simplify the computation of $\kappa^*$. One could easily establish a different version of this corollary for non-strictly sub-Gaussian vectors, valid for a more restricted range of $\epsilon$, provided that $c^2/\sigma^2 < 2$. However, since most of the distributions of interest in this thesis are indeed strictly sub-Gaussian, we do not pursue this route. Note also that if one is interested in very small $\epsilon$, then there is considerable room for improvement in the constant $\kappa^*$.

Suppose that $X = [X_1, X_2, \dots, X_M]$, where each $X_i$ is i.i.d. with $X_i \sim \text{SSub}(\sigma^2)$. Then

$$\mathbb{E}\left[\|X\|_2^2\right] = M\sigma^2,$$

and for any $\epsilon > 0$,

$$\mathbb{P}\left(\left|\,\|X\|_2^2 - M\sigma^2\,\right| \ge \epsilon M \sigma^2\right) \le 2\exp\left(-\frac{M\epsilon^2}{\kappa^*}\right)$$

with $\kappa^* = 2/(1 - \log(2)) \approx 6.52$.

Since each $X_i \sim \text{SSub}(\sigma^2)$, we have that $X_i \sim \text{Sub}(\sigma^2)$ and $\mathbb{E}(X_i^2) = \sigma^2$, in which case we may apply the theorem above with $\alpha = 1 - \epsilon$ and $\beta = 1 + \epsilon$. Combining the two tail bounds via a union bound yields the stated inequality. The value of $\kappa^*$ follows from the observation that $1 + \epsilon \le 2$, so that we can set $\beta_{\max} = 2$.
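In applications this corollary is often used in reverse: given a tolerance $\epsilon$ and a target failure probability $\delta$, solving $2\exp(-M\epsilon^2/\kappa^*) \le \delta$ for $M$ shows that $M \ge (\kappa^*/\epsilon^2)\log(2/\delta)$ suffices. A minimal sketch (assuming NumPy; the values of $\epsilon$ and $\delta$ are illustrative):

import numpy as np

kappa = 2.0 / (1.0 - np.log(2.0))  # kappa* ~= 6.52
eps, delta = 0.1, 1e-6             # illustrative tolerance and failure probability
M = int(np.ceil(kappa * np.log(2.0 / delta) / eps**2))
print(M)  # smallest M for which the corollary's bound is at most delta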

Finally, from the corollary above we also obtain the following additional useful result, which generalizes the main results of [link] to the broader family of strictly sub-Gaussian distributions via a much simpler proof.

Suppose that $\Phi$ is an $M \times N$ matrix whose entries $\phi_{ij}$ are i.i.d. with $\phi_{ij} \sim \text{SSub}(1/M)$. Let $Y = \Phi x$ for $x \in \mathbb{R}^N$. Then for any $\epsilon > 0$ and any $x \in \mathbb{R}^N$,

$$\mathbb{E}\left[\|Y\|_2^2\right] = \|x\|_2^2$$

and

$$\mathbb{P}\left(\left|\,\|Y\|_2^2 - \|x\|_2^2\,\right| \ge \epsilon \|x\|_2^2\right) \le 2\exp\left(-\frac{M\epsilon^2}{\kappa^*}\right)$$

with $\kappa^* = 2/(1 - \log(2)) \approx 6.52$.

Let $\phi_i$ denote the $i$th row of $\Phi$. Observe that if $Y_i$ denotes the $i$th element of $Y$, then $Y_i = \langle \phi_i, x \rangle$, and thus by Lemma 2 from "Sub-Gaussian random variables", $Y_i \sim \text{SSub}\left(\|x\|_2^2 / M\right)$. Applying the corollary above to the $M$-dimensional random vector $Y$, we obtain the result.
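To see the norm preservation in action, the following sketch (assuming NumPy) draws $\Phi$ with i.i.d. $\mathcal{N}(0, 1/M)$ entries, a strictly sub-Gaussian choice with variance $1/M$, and estimates how often $\|Y\|_2^2$ deviates from $\|x\|_2^2$ by more than a relative factor $\epsilon$; the dimensions, $\epsilon$, and trial count are illustrative:

import numpy as np

rng = np.random.default_rng(3)
M, N, trials, eps = 50, 200, 5_000, 0.5
x = rng.standard_normal(N)  # an arbitrary fixed vector
x_sq = np.sum(x ** 2)

fails = 0
for _ in range(trials):
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)  # entries i.i.d. N(0, 1/M)
    y_sq = np.sum((Phi @ x) ** 2)                   # ||Phi x||_2^2
    fails += abs(y_sq - x_sq) >= eps * x_sq

kappa = 2.0 / (1.0 - np.log(2.0))
print(fails / trials, 2 * np.exp(-M * eps**2 / kappa))  # empirical rate vs. bound

Note that the failure probability does not depend on $N$: the concentration is controlled entirely by the number of measurements $M$.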

Source: OpenStax, Introduction to compressive sensing. OpenStax CNX. Mar 12, 2015. Download for free at http://legacy.cnx.org/content/col11355/1.4