This shows that the Bernoulli distribution can be written in the form of Equation [link], using an appropriate choice of $T$, $a$ and $b$.
Let's now move on to consider the Gaussian distribution. Recall that, when deriving linear regression, the value of ${\sigma}^{2}$ had no effect on our final choice of $\theta$ and ${h}_{\theta}\left(x\right)$. Thus, we can choose an arbitrary value for ${\sigma}^{2}$ without changing anything. To simplify the derivation below, let's set ${\sigma}^{2}=1$.

(If we leave ${\sigma}^{2}$ as a variable, the Gaussian distribution can also be shown to be in the exponential family, where $\eta \in {\mathbb{R}}^{2}$ is now a 2-dimensional vector that depends on both $\mu$ and $\sigma$. For the purposes of GLMs, however, the ${\sigma}^{2}$ parameter can also be treated by considering a more general definition of the exponential family: $p(y;\eta ,\tau )=b(y,\tau )\exp\left(\left({\eta}^{T}T\left(y\right)-a\left(\eta \right)\right)/c\left(\tau \right)\right)$. Here, $\tau$ is called the dispersion parameter, and for the Gaussian, $c\left(\tau \right)={\sigma}^{2}$; but given our simplification above, we won't need the more general definition for the examples we will consider here.)

We then have:

$p(y;\mu )=\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}{(y-\mu )}^{2}\right)=\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}{y}^{2}\right)\cdot \exp\left(\mu y-\frac{1}{2}{\mu}^{2}\right)$
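Thus, we see that the Gaussian is in the exponential family, with

$\eta =\mu$, $T\left(y\right)=y$, $a\left(\eta \right)={\mu}^{2}/2={\eta}^{2}/2$, $b\left(y\right)=\left(1/\sqrt{2\pi}\right)\exp\left(-{y}^{2}/2\right)$.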
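As a quick sanity check, the exponential family form of the unit-variance Gaussian can be compared numerically against the usual density. The sketch below assumes the standard choice for ${\sigma}^{2}=1$, namely $\eta =\mu$, $T(y)=y$, $a(\eta )={\eta}^{2}/2$ and $b(y)=\left(1/\sqrt{2\pi}\right)\exp\left(-{y}^{2}/2\right)$; the function names are illustrative and not part of the notes.

```python
import math


def gaussian_pdf(y, mu):
    """Density of N(mu, 1) at y, written in the usual form."""
    return math.exp(-0.5 * (y - mu) ** 2) / math.sqrt(2 * math.pi)


def gaussian_exp_family(y, mu):
    """Same density written as b(y) * exp(eta * T(y) - a(eta))."""
    eta = mu                 # natural parameter (assumes sigma^2 = 1)
    t = y                    # sufficient statistic T(y)
    a = 0.5 * eta ** 2       # log partition function a(eta)
    b = math.exp(-0.5 * y ** 2) / math.sqrt(2 * math.pi)  # base measure b(y)
    return b * math.exp(eta * t - a)


# The two forms should agree up to floating-point rounding.
for y, mu in [(0.0, 0.0), (1.3, -0.7), (-2.0, 2.5)]:
    assert math.isclose(gaussian_pdf(y, mu), gaussian_exp_family(y, mu), rel_tol=1e-12)
```

The agreement is just the algebraic identity $-{(y-\mu )}^{2}/2=-{y}^{2}/2+\mu y-{\mu}^{2}/2$ evaluated at a few points.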
There are many other distributions that are members of the exponential family: the multinomial (which we'll see later); the Poisson (for modelling count data; also see the problem set); the gamma and the exponential (for modelling continuous, non-negative random variables, such as time intervals); the beta and the Dirichlet (for distributions over probabilities); and many more. In the next section, we will describe a general “recipe” for constructing models in which $y$ (given $x$ and $\theta$) comes from any of these distributions.
Suppose you would like to build a model to estimate the number $y$ of customers arriving in your store (or number of page views on your website) in any given hour, based on certain features $x$ such as store promotions, recent advertising, weather, day of week, etc. We know that the Poisson distribution usually gives a good model for numbers of visitors. Knowing this, how can we come up with a model for our problem? Fortunately, the Poisson is an exponential family distribution, so we can apply a Generalized Linear Model (GLM). In this section, we will describe a method for constructing GLM models for problems such as these.
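The claim that the Poisson is an exponential family distribution can also be checked numerically. The sketch below assumes the standard parameterization, which is not spelled out in the text above: for rate $\lambda$, take $\eta =\log \lambda$, $T(y)=y$, $a(\eta )={e}^{\eta}$ and $b(y)=1/y!$. The function names and the example rate are hypothetical.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability of observing count y at rate lam."""
    return lam ** y * math.exp(-lam) / math.factorial(y)


def poisson_exp_family(y, lam):
    """Same probability written as b(y) * exp(eta * T(y) - a(eta))."""
    eta = math.log(lam)            # natural parameter
    t = y                          # sufficient statistic T(y)
    a = math.exp(eta)              # log partition function a(eta), equal to lam
    b = 1.0 / math.factorial(y)    # base measure b(y)
    return b * math.exp(eta * t - a)


# Hypothetical example: an average of 4.2 arrivals per hour.
hourly_rate = 4.2
for count in range(10):
    assert math.isclose(
        poisson_pmf(count, hourly_rate),
        poisson_exp_family(count, hourly_rate),
        rel_tol=1e-9,
    )
```

In this form the natural parameter is the log of the rate, so any real value of $\eta$ corresponds to a valid positive $\lambda$.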
More generally, consider a classification or regression problem where we would like to predict the value of some random variable $y$ as a function of $x$. To derive a GLM for this problem, we will make the following three assumptions about the conditional distribution of $y$ given $x$ and about our model: