. xtreg depvar [varlist], fe
The command for estimating the random-effects model is:
. xtreg depvar [varlist], re
If the part of the command with the comma and either re or fe is omitted, Stata will assume that you want to estimate the random-effects model.
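As a rough check on this default, one could fit the model with and without the option and compare the two sets of estimates. The sketch below assumes the nlswork example data used later in this module can be loaded with webuse; the variables ln_wage and tenure are chosen purely for illustration.

```stata
* Sketch: compare xtreg with no option against xtreg, re.
* Assumes the nlswork example data are available through webuse.
webuse nlswork, clear
xtset idcode year               // declare the panel unit and time variable

xtreg ln_wage tenure            // no estimator option given
estimates store no_option

xtreg ln_wage tenure, re        // random effects requested explicitly
estimates store explicit_re

* Show coefficients and standard errors side by side
estimates table no_option explicit_re, b se
```

If the default behaves as described, the two columns of the table should be identical.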
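To understand the Stata output we need to return to the algebra of the model. Assume that we are fitting a model of the following form:
$y_{it}=\beta_{0}+\beta_{1}x_{1it}+\cdots+\beta_{k}x_{kit}+\nu_{i}+\epsilon_{it}$ (13)
where $\nu_{i}$ is the unit-specific effect and $\epsilon_{it}$ is the idiosyncratic error.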
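We can sum (13) over t (holding the individual unit constant) and divide by T to get:
$\overline{y}_{i}=\beta_{0}+\beta_{1}\overline{x}_{1i}+\cdots+\beta_{k}\overline{x}_{ki}+\nu_{i}+\overline{\epsilon}_{i}$ (14)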
where $\overline{y}_{i}=\frac{\sum_{t=1}^{T}y_{it}}{T},$ $\overline{x}_{ji}=\frac{\sum_{t=1}^{T}x_{jit}}{T},$ and $\overline{\epsilon}_{i}=\frac{\sum_{t=1}^{T}\epsilon_{it}}{T}.$ Thus, (14) uses the mean values for each cross-sectional unit. We can subtract (14) from (13) to get:
$y_{it}-\overline{y}_{i}=\beta_{1}(x_{1it}-\overline{x}_{1i})+\cdots+\beta_{k}(x_{kit}-\overline{x}_{ki})+(\epsilon_{it}-\overline{\epsilon}_{i})$ (15)
Equations (13), (14), and (15) are the basis of Stata’s estimates of the parameters of the model. In particular, the command xtreg, fe uses OLS to estimate (15); this is known as the fixed-effects estimator (or the within estimator). The command xtreg, be uses OLS to estimate (14) and is known as the between estimator. The command xtreg, re produces the random-effects estimator, which is a weighted average of the between and within estimators, where the weight is a function of the variances of the unit-specific effect $\nu_{i}$ and the idiosyncratic error $\epsilon_{it}$ ($\sigma_{\nu}^{2}$ and $\sigma_{\epsilon}^{2}$, respectively). See Cameron and Trivedi (2005: 705) for a detailed discussion of the random-effects estimator.
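The link between xtreg, fe and equation (15) can be checked by hand: subtract each unit's mean from the dependent and independent variables, then run OLS on the deviations. The following is a minimal sketch under the assumption that the nlswork example data are loaded with webuse; the names ybar, xbar, y_dev, and x_dev are hypothetical helpers, and a single regressor (tenure) is used to keep the example short.

```stata
* Sketch: the within estimator as OLS on deviations from unit means.
* Assumes the nlswork example data; helper variable names are illustrative.
webuse nlswork, clear
xtset idcode year
keep if !missing(ln_wage, tenure)     // use the same sample in both regressions

* Unit means, as in equation (14)
bysort idcode: egen double ybar = mean(ln_wage)
bysort idcode: egen double xbar = mean(tenure)

* Deviations from unit means, as in equation (15)
generate double y_dev = ln_wage - ybar
generate double x_dev = tenure - xbar

* OLS on the deviations (no constant, since it differences out)
regress y_dev x_dev, noconstant

* Built-in fixed-effects estimator for comparison
xtreg ln_wage tenure, fe
```

The slope on x_dev should match the tenure coefficient from xtreg, fe. The standard errors will not match, because regress does not subtract the estimated unit means from its degrees of freedom.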
In general, you will not make use of the between estimator. However, these three equations do lie at the basis of the goodness-of-fit measures that Stata reports. In particular, Stata output reports three “R-squareds”: the overall $R^{2}$, the between $R^{2}$, and the within $R^{2}$. (R-squared is in quotes here because these R-squareds do not have all the properties of OLS R-squareds.) Each of the three is derived from one of the three equations: the overall $R^{2}$ uses (13); the between $R^{2}$ uses (14); and the within $R^{2}$ uses (15).
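Besides appearing in the header of the output, the three R-squareds are saved in the estimation results. The sketch below assumes the nlswork example data and an illustrative one-regressor model; it displays the stored values after a fixed-effects fit.

```stata
* Sketch: retrieve the three R-squareds that xtreg stores in e().
* Assumes the nlswork example data; the model is illustrative only.
webuse nlswork, clear
xtset idcode year
xtreg ln_wage tenure, fe

display "within  R-squared = " e(r2_w)
display "between R-squared = " e(r2_b)
display "overall R-squared = " e(r2_o)
```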
In this example we follow the example offered in the Stata manual and use a large data set from the National Longitudinal Survey: 28,534 person-year observations of wage data on women who were between 14 and 26 years of age in 1968. The women were surveyed in each of the 21 years between 1968 and 1988 except for the six years 1974, 1976, 1979, 1981, 1984, and 1986. The study is focused on the determinants of wage levels, as measured by the natural logarithm of real wages.
Figure 1 shows the commands used to put the data into Stata. The first command (set memory 5m) increases the size of the memory that the program uses; I did this because of the large sample size. The use command accesses the data from the Stata web site. The describe command calls up a description of the variables. Figure 2 presents a summary of the data using the command summarize.
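The sequence described above might look roughly like the following. This is a sketch, not a copy of Figure 1: it assumes the data can be loaded with webuse nlswork instead of the use command and web address shown in the figure.

```stata
* Sketch of the data-loading steps; webuse is an assumed stand-in
* for the use command shown in Figure 1.

* Only older releases need set memory; Stata 12 and later manage
* memory automatically and ignore this setting.
set memory 5m

webuse nlswork, clear     // load the example data from the Stata web site
describe                  // variable names, storage types, and labels
summarize                 // observations, mean, std. dev., min, and max
```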
There are several transformations of the variables that we will need. In particular, we want to include the squares of several of the variables in our regression: age (age), work experience (ttl_exp), and job tenure (tenure). We use the squares because we have reason to believe that wages have a non-linear relationship with these variables. For instance, consider the number of years a worker has been on the job, Tenure. Theory suggests that wages increase over a worker’s work-life at a decreasing rate. Thus, if the equation we are estimating is $y=\mathrm{ln}w={\beta}_{0}+{\beta}_{1}Tenure+{\beta}_{2}Tenur{e}^{2}+\cdots ,$ what we expect is that $\frac{\partial y}{\partial Tenure}={\beta}_{1}+2{\beta}_{2}Tenure>0$ and $\frac{{\partial}^{2}y}{\partial Tenur{e}^{2}}=2{\beta}_{2}<0.$ The second inequality can hold only if ${\beta}_{2}<0.$ Given this, the first inequality implies that ${\beta}_{1}>-2{\beta}_{2}Tenure>0$ for any positive level of tenure at which wages are still rising. Also, notice that we can determine the number of years in a job at which wages reach a peak: y reaches a maximum at the level of tenure where $\frac{\partial y}{\partial Tenure}={\beta}_{1}+2{\beta}_{2}Tenure=0,$ or when $Tenure=-\frac{{\beta}_{1}}{2{\beta}_{2}}.$ The fact that $\frac{{\partial}^{2}y}{\partial Tenur{e}^{2}}=2{\beta}_{2}<0$ guarantees that this point is indeed a maximum.
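One way to carry out these steps is to generate the squared terms, fit the model, and then evaluate the peak at the estimated coefficients. The sketch below assumes the nlswork example data; the names age2, ttl_exp2, and tenure2 are hypothetical, and the specification is a cut-down illustration, not the full model estimated in this module.

```stata
* Sketch: create squared terms and estimate the tenure at which
* log wages peak. Variable names ending in 2 are illustrative.
webuse nlswork, clear
xtset idcode year

generate age2     = age^2
generate ttl_exp2 = ttl_exp^2
generate tenure2  = tenure^2

xtreg ln_wage age age2 ttl_exp ttl_exp2 tenure tenure2, fe

* Peak of the quadratic in tenure: -b1 / (2*b2)
* nlcom also reports a delta-method standard error for this ratio
nlcom -_b[tenure] / (2 * _b[tenure2])
```

The value reported by nlcom is a maximum only if the estimated coefficient on tenure2 is negative, so check its sign before interpreting the result.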