Suppose we make noisy measurements of a smooth function:
$$Y_i = f^*(x_i) + W_i, \quad i = 1, \dots, n,$$
where
$$x_i = \frac{i}{n}, \quad i = 1, \dots, n,$$
and
$$W_i \overset{iid}{\sim} \mathcal{N}(0, \sigma^2).$$
The unknown function $f^*$ is a map
$$f^* : [0,1] \to \mathbb{R}.$$
In Lecture 4, we considered this problem in the case where $f^*$ was Lipschitz on $[0,1]$. That is, $f^*$ satisfied
$$|f^*(t) - f^*(s)| \le L\,|t-s|, \quad \forall\, t, s \in [0,1],$$
where $L>0$ is a constant. In that case, we showed that by using a piecewise constant function on a partition of $n^{1/3}$ equal-size bins we were able to obtain an estimator $\widehat{f}_n$ whose mean square error was
$$E\left[\frac{1}{n}\sum_{i=1}^n \left(\widehat{f}_n(x_i) - f^*(x_i)\right)^2\right] = O\left(n^{-2/3}\right).$$
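As a concrete illustration, here is a minimal Python/NumPy sketch of the Lecture 4 scheme: average the noisy samples within roughly $n^{1/3}$ equal-size bins. The function name, the Lipschitz test function, and the noise level are illustrative choices of ours, not from the lecture.

```python
import numpy as np

def denoise_piecewise_constant(y, m):
    """Average the noisy samples y within m equal-size bins and
    return the piecewise-constant estimate at every sample point."""
    n = len(y)
    edges = np.linspace(0, n, m + 1).astype(int)   # bin boundaries (sample indices)
    fhat = np.empty(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        fhat[lo:hi] = y[lo:hi].mean()              # constant fit per bin
    return fhat

# Noisy samples of a Lipschitz function on [0, 1]
rng = np.random.default_rng(0)
n = 1000
x = np.arange(1, n + 1) / n
f_star = np.abs(x - 0.5)                 # Lipschitz with L = 1
y = f_star + 0.1 * rng.standard_normal(n)

m = int(round(n ** (1 / 3)))             # ~ n^{1/3} bins, as in Lecture 4
fhat = denoise_piecewise_constant(y, m)
mse = np.mean((fhat - f_star) ** 2)
```

With $n^{1/3}$ bins the squared bias (of order $(L/m)^2$) and the per-bin variance (of order $\sigma^2 m/n$) are balanced, which is what produces the $O(n^{-2/3})$ rate.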
In this lecture we will use the complexity-regularized maximum likelihood estimation result we derived in Lecture 14 to extend our denoising scheme in several important ways.
To begin with, let's consider a broader class of functions.
For $0<\alpha <1,$ define the space of functions
$$H^{\alpha}(C_{\alpha}) = \left\{ f : |f(t) - f(s)| \le C_{\alpha}\,|t-s|^{\alpha}, \ \forall\, t, s \in [0,1] \right\}$$
for some constant ${C}_{\alpha}<\infty $ and where $f\in {L}_{\infty}.$ ${H}^{\alpha}$ above contains functions that are bounded, but less smooth than Lipschitz functions. Indeed, the space of Lipschitz functions can be defined as ${H}^{1}$ ( $\alpha =1$ ):
$$H^{1}(C_{1}) = \left\{ f : |f(t) - f(s)| \le C_{1}\,|t-s|, \ \forall\, t, s \in [0,1] \right\}$$
for ${C}_{1}<\infty .$ Functions in ${H}^{1}$ are continuous, but those in ${H}^{\alpha},\alpha <1,$ are not in general.
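To see the distinction numerically, the following sketch (the helper `holder_ratio` and the grid size are hypothetical choices of ours) estimates the Hölder constant of $f(t)=\sqrt{t}$ over a grid: the ratio stays bounded for $\alpha = 1/2$ but blows up near $t=0$ for $\alpha = 1$, consistent with $\sqrt{t} \in H^{1/2}$ but not Lipschitz on $[0,1]$.

```python
import numpy as np

def holder_ratio(f, alpha, ts):
    """Largest observed |f(t)-f(s)| / |t-s|^alpha over a grid --
    a finite-sample stand-in for the Hölder constant C_alpha."""
    t, s = np.meshgrid(ts, ts)
    mask = t > s                         # distinct pairs only
    return np.max(np.abs(f(t[mask]) - f(s[mask])) / (t[mask] - s[mask]) ** alpha)

ts = np.linspace(0.0, 1.0, 400)
f = np.sqrt                              # f(t) = sqrt(t)

c_half = holder_ratio(f, 0.5, ts)        # stays bounded (at most 1 for sqrt)
c_one = holder_ratio(f, 1.0, ts)         # grows without bound as the grid refines near 0
```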
Let's also consider functions that are smoother than Lipschitz. If $\alpha =1+\beta ,$ where $0<\beta <1,$ then define
$$H^{\alpha}(C_{\alpha}) = \left\{ f : f \text{ differentiable and } |f'(t) - f'(s)| \le C_{\alpha}\,|t-s|^{\beta}, \ \forall\, t, s \in [0,1] \right\}.$$
In other words, ${H}^{\alpha},1<\alpha <2,$ contains Lipschitz functions that are also differentiable, and whose derivatives are Hölder smooth with smoothness $\beta =\alpha -1.$
And finally, let
$$H^{2}(C_{2}) = \left\{ f : f \text{ differentiable and } |f'(t) - f'(s)| \le C_{2}\,|t-s|, \ \forall\, t, s \in [0,1] \right\}$$
contain functions that have continuous derivatives, but that are not necessarily twice-differentiable.
If $f\in {H}^{\alpha}\left({C}_{\alpha}\right)$, $0<\alpha \le 2,$ then we say that $f$ is Hölder-$\alpha$ smooth with Hölder constant ${C}_{\alpha}.$ The notion of Hölder smoothness can also be extended to $\alpha >2$ in a straightforward way.
Note: If ${\alpha}_{1}<{\alpha}_{2}$ then, on $[0,1]$, $H^{\alpha_{2}} \subseteq H^{\alpha_{1}}$ (up to the choice of constant): the smoother class is nested inside the rougher one, since $|t-s|^{\alpha_{2}} \le |t-s|^{\alpha_{1}}$ when $|t-s| \le 1$.
Summarizing, we can describe Hölder spaces as follows. If ${f}^{*}\in {H}^{\alpha}\left({C}_{\alpha}\right)$ for some $0<\alpha \le 2$ and ${C}_{\alpha}<\infty ,$ then
$$\begin{array}{ll} |f^*(t) - f^*(s)| \le C_{\alpha}\,|t-s|^{\alpha}, & 0 < \alpha \le 1, \\ |{f^*}'(t) - {f^*}'(s)| \le C_{\alpha}\,|t-s|^{\alpha-1}, & 1 < \alpha \le 2, \end{array}$$
for all $t, s \in [0,1].$
Note that in general there is a natural relationship between the Hölder space containing the function and the approximation class used to estimate the function. Here we will consider functions which are Hölder-$\alpha$ smooth, where $0<\alpha \le 2$, and work with piecewise linear approximations. If we were to consider smoother functions, $\alpha >2$, we would need to consider higher order approximation functions, i.e. quadratic, cubic, etc.
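The piecewise linear approximation just described can be sketched as follows. This is a minimal illustration with choices of our own (a uniform partition, an ordinary least-squares line per bin, a smooth test function, and a bin count of about $n^{1/5}$, which balances bias and variance when $\alpha = 2$):

```python
import numpy as np

def piecewise_linear_fit(x, y, m):
    """Least-squares line on each of m equal-width bins of [0,1];
    returns the fitted values at the sample points x."""
    fhat = np.empty_like(y)
    bins = np.minimum((x * m).astype(int), m - 1)   # bin index of each sample
    for j in range(m):
        idx = bins == j
        slope, intercept = np.polyfit(x[idx], y[idx], 1)  # linear fit in bin j
        fhat[idx] = slope * x[idx] + intercept
    return fhat

# Noisy samples of a smooth (Hölder-2) function
rng = np.random.default_rng(1)
n = 1000
x = np.arange(1, n + 1) / n
f_star = np.sin(2 * np.pi * x)           # differentiable, derivative Lipschitz
y = f_star + 0.1 * rng.standard_normal(n)

m = int(round(n ** (1 / 5)))             # ~ n^{1/5} bins suits alpha = 2
fhat = piecewise_linear_fit(x, y, m)
mse_lin = np.mean((fhat - f_star) ** 2)
```

For a smooth target the linear pieces capture the local slope, so far fewer (and wider) bins suffice than in the Lipschitz case, and the attainable rate improves accordingly.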
Now let's assume ${f}^{*}\in {H}^{\alpha}\left({C}_{\alpha}\right)$ for some unknown $\alpha$ $(0<\alpha \le 2)$; i.e., we don't know how smooth ${f}^{*}$ is. We will use our observations
$$Y_i = f^*(x_i) + W_i, \quad i = 1, \dots, n,$$
to construct an estimator ${\widehat{f}}_{n}.$ Intuitively, the smoother ${f}^{*}$ is, the better we should be able to estimate it. Can we take advantage of extra smoothness in ${f}^{*}$ if we don't know how smooth it is? The smoother ${f}^{*}$ is, the more averaging we can perform to reduce noise; in other words, for smoother ${f}^{*}$ we should average over larger bins. We will also need to exploit the extra smoothness in our approximation of ${f}^{*}.$ To that end, we will consider candidate functions that are piecewise linear functions on uniform partitions of $[0,1].$ Let