Eco 72, Lab 5
|
|
In this example, we’re going to see whether people with more schooling earn more. We’ll use the same data set as for lab 4: http://phoenix.liu.edu/~tbarr/eco72/eco72-lab4-data.xls. In Column C, as you may remember, is the person’s wage in 1990, and in Column D is how many years of school the person had. So the wage variable will be our dependent variable, and the schooling variable will be our independent variable. Calculating the regression coefficient There is a simple Excel function to calculate a regression coefficient. It’s called LINEST, and it takes four arguments. Unfortunately, the LINEST function doesn’t handle missing values very well, so we’ll have to calculate the regression coefficient manually (which is always good exercise anyway). Let’s
make a table. In column K1, write “Wages
and Schooling” In column J2, write
Covariance. The Excel function for
covariance is =COVAR(RANGE1,RANGE2). Put the covariance of the two variables in
column Next, let’s put the sample variance of wages in cell K3 and of schooling in cell K4. Be sure to label each of these variables in column I. Next, let’s calculate the correlation coefficient in cell K5, which is the covariance of the two variables divided by both of their standard deviations, or K2/SQRT(K3*K4). Put a label in cell J5. Next, in cell K6, calculate the regression coefficient, which should simply be K2/K4. Label it in cell J6. Next in cell K7, put the regression constant. Remember, the regression constant is equal to the mean of the independent variable minus the coefficient times the mean of the dependent variable. In other words, =AVERAGE(C2:C301)-K6*AVERAGE(D2:D301). Label it in J7. Finally,
we’re going to create a new variable that is the predicted wage given
someone’s schooling. In
other words, a+bX.
In cell H1, type =K7+K6*D1 (you may want to put in the
actual values
of K6 and K7 here). Then
auto-fill all the way down to the end. A Scatterplot First, let’s construct a scatterplot of the two variables. From the Insert menu, click Chart (or simply click the Chart icon). Select scatterplot and choose the basic kind (with no lines). Hit next. We’re going to create two series for the charts. In both cases, the X values for the series are going to be the level of schooling in cells D2:D301; in one case, the Y values for the series will be the actual wage in cells C2:C301 (we’ll name this series “Actual Wage”) and in the second case, the Y values will be the predicted wage in cells H2:H301. Once you’ve figured out how to set this, in the third stage you should give your chart a title, and you should also give titles to your X axis (years of schooling) and your Y axis (wage). If
you did this right, you’ll have a scatterplot
showing wages and years of schooling, plus a bunch of dots showing the
regression line. |