
Review

In Lectures 4 and 15, we investigated the problem of denoising a smooth signal in additive white noise. In Lecture 4, we considered Lipschitz functions and showed that by fitting constants on a uniform partition of width $n^{-1/3}$ we can achieve an $n^{-2/3}$ rate of MSE convergence.
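As a concrete illustration of the Lecture 4 scheme, here is a minimal Python sketch (the function `piecewise_constant_denoise` and the test signal are my own invention, not from the notes): bin the $n$ samples into roughly $n^{1/3}$ bins of width $n^{-1/3}$ and average within each bin.

```python
import numpy as np

def piecewise_constant_denoise(y):
    """Average the noisy samples within each bin of a uniform
    partition with ~ n^(1/3) bins (bin width ~ n^(-1/3))."""
    n = len(y)
    k = max(1, round(n ** (1 / 3)))       # number of bins ~ n^(1/3)
    edges = np.linspace(0, n, k + 1).astype(int)
    f_hat = np.empty(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        f_hat[lo:hi] = y[lo:hi].mean()    # constant fit on each bin
    return f_hat

# Hypothetical test signal: |t - 1/2| is Lipschitz on [0, 1]
rng = np.random.default_rng(0)
n = 1000
t = np.linspace(0, 1, n)
f = np.abs(t - 0.5)
y = f + 0.1 * rng.standard_normal(n)
print("MSE:", np.mean((piecewise_constant_denoise(y) - f) ** 2))
# the MSE should scale roughly like n^(-2/3) as n grows
```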

In Lecture 15, we considered Hölder-$\alpha$ smooth functions, and we demonstrated that by automatically selecting the partition width and using polynomial fits we can obtain an MSE convergence rate of $n^{-2\alpha/(2\alpha+1)}$, substantially better when $\alpha > 1$. Also important is the fact that we do not need to know the value of $\alpha$ a priori. The estimator $\hat{f}_n$ is fundamentally different from its counterpart in Lecture 4.
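The source of this rate is the usual tradeoff between approximation error and estimation variance; the following display is a standard back-of-the-envelope version of the Lecture 15 argument (my paraphrase, not a quote from the notes):

```latex
% Squared bias O(k^{-2\alpha}) vs. variance O(k/n) for a k-interval partition:
\[
  \mathrm{MSE}(k) \;\lesssim\; k^{-2\alpha} + \frac{k}{n},
  \qquad
  k^* \asymp n^{1/(2\alpha+1)}
  \;\Longrightarrow\;
  \mathrm{MSE}(k^*) = O\!\bigl(n^{-2\alpha/(2\alpha+1)}\bigr).
\]
```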

In both cases $\hat{f}_n(t)$ is a linear function (polynomial or constant fit) of the data in each interval of the underlying partition. In Lecture 4, the partition was independent of the data, and so the overall estimator is a linear function of the data.

However, in Lecture 15 the partition itself was selected based on the data. Consequently, $\hat{f}_n(t)$ is a non-linear function of the data. Linear estimators (linear functions of the data) cannot adapt to unknown degrees of smoothness. In this lecture, we lay the groundwork for one more important extension in the denoising application: spatial adaptivity. That is, we would like to construct estimators that not only adapt to unknown degrees of global smoothness, but that also adapt to spatially varying degrees of smoothness.

We will focus on the approximation-theoretic aspects of the problem in this lecture, considering tree-based approximations and wavelet expansions. In the next lecture, we will apply these results to the denoising problem; this will bring us up to date with the current state of the art in denoising and non-parametric estimation.

Recall that Hölder spaces contain smooth functions that are well approximated with polynomials or piecewise polynomial functions. Hölder spaces are quite large and contain many interesting signals. However, Hölder spaces are still inadequate in many applications. Often, we encounter functions that are not smooth everywhere; they contain discontinuities, jumps, spikes, etc. Indeed, the "singularities" (or non-smooth points) can be the most interesting and informative aspects of the functions.

[Figure: examples of functions that are not smooth everywhere. (a) 1-D case; (b) 2-D case.]

Furthermore, functions of interest may possess different degrees of smoothness in different regions.

[Figure: examples of functions having different degrees of smoothness. (a) 1-D case; (b) 2-D case.]

Nonlinear approximation via trees

Let $B_\alpha(C_\alpha)$ denote the set of all functions that are $H_\alpha(C_\alpha)$ everywhere except on a set of measure zero. To simplify the notation, we won't explicitly identify the domain (e.g., $[0,1]$ or $[0,1]^d$); that will be clear from the context.

[Figure: sets of measure zero. (a) 1-D case; (b) 2-D case.]

Let's consider a 1-D case first.

Let $f \in B_\alpha(C_\alpha)$ and consider approximating $f$ by a piecewise polynomial function on a uniform partition.

If $f$ is Hölder-$\alpha$ smooth everywhere, then by using an appropriate partition width $k^{-1}$ and fitting degree-$\alpha$ polynomials on each interval, we have an approximation $f_k$ satisfying

$$ |f(t) - f_k(t)| \le C_\alpha k^{-\alpha} $$

and

$$ \|f - f_k\|_{L^2}^2 = O(k^{-2\alpha}). $$
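As a quick numerical sanity check of this bound (a sketch with a test function of my own choosing, not part of the lecture), the following fits degree-1 polynomials to a smooth $f$ on $k$ uniform intervals of $[0,1]$ and reports the squared $L^2$ error:

```python
import numpy as np

def pw_poly_err2(f, k, degree=1, m=4096):
    """Squared L2 error of a degree-`degree` piecewise polynomial
    least-squares fit to f on k uniform intervals of [0, 1]."""
    t = np.linspace(0, 1, m, endpoint=False)
    total = 0.0
    for j in range(k):
        mask = (t >= j / k) & (t < (j + 1) / k)
        c = np.polyfit(t[mask], f(t[mask]), degree)   # LS fit per interval
        total += np.mean((f(t[mask]) - np.polyval(c, t[mask])) ** 2) / k
    return total

f = np.sin   # smooth on [0, 1], in particular Hölder-2 (my test choice)
for k in (4, 8, 16, 32):
    print(k, pw_poly_err2(f, k))
# with degree-1 fits and alpha = 2, the squared error should drop by
# about 2^(2*alpha) = 16x each time k doubles, i.e. O(k^(-2*alpha))
```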
[Figure: a smooth curve with a discontinuity.]

However, if there is a discontinuity, then for $t$ in the interval containing the discontinuity the difference $|f(t) - f_k(t)|$ will not be small: no polynomial can track the jump, so the error on that interval is on the order of the jump height, regardless of $k$. Since that interval has width $k^{-1}$, it alone contributes $O(k^{-1})$ to $\|f - f_k\|_{L^2}^2$, and the overall squared error decays no faster than $k^{-1}$, no matter how large $\alpha$ is.
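To make "not small" concrete, here is a companion sketch (the same hypothetical setup as above, now with a unit jump added at a point of my own choosing): the squared error only halves as $k$ doubles, as the $O(k^{-1})$ analysis predicts.

```python
import numpy as np

def pw_linear_err2(f, k, m=4096):
    """Squared L2 error of a piecewise-linear LS fit on k uniform intervals."""
    t = np.linspace(0, 1, m, endpoint=False)
    total = 0.0
    for j in range(k):
        mask = (t >= j / k) & (t < (j + 1) / k)
        c = np.polyfit(t[mask], f(t[mask]), 1)
        total += np.mean((f(t[mask]) - np.polyval(c, t[mask])) ** 2) / k
    return total

def g(t):
    # smooth curve plus a unit jump at t = 0.37 (jump location is my choice,
    # placed off the partition knots so some interval always straddles it)
    return np.sin(2 * np.pi * t) + (t > 0.37)

for k in (4, 8, 16, 32):
    print(k, pw_linear_err2(g, k))
# now the squared error only halves when k doubles: the O(1) miss on the
# single width-1/k interval containing the jump contributes O(1/k),
# which dominates the O(k^(-2*alpha)) error from the smooth part
```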
