Eco 72, Lab 2
|
||||||||||||
|
The file http://myweb.liu.edu/~tbarr/eco72/eco72-lab2-data.xls contains data from a study of an internship program involving several hundred employers. There are several variables in the file:
(Column A) An employer ID (Column B) The number of days it takes the intern to learn the job (Column C) The number of weeks the internship lasts (Column D) Whether or not the employer screens interns by phone (Column E) Whether or not the employer screens the interns with a written exam (Column F) Whether or not the employer solicits letters of recommendation for interns (Column G) The size of the employer.
We want to test several hypotheses about these variables. Specifically, we are interested in the sample mean and sample proportion.
Several of the sample properties we have looked at in recent lectures have an equivalent Excel function, some of which you have seen before. Suppose our sample lies in the cells A2 through A10. In this case, the Excel function for these properties would be:
The COUNT function here reports the number of non-empty cells in the range. If you just report the total number of cells (e.g., 9 in this case), then you will get an incorrect number if there are missing observations.
Therefore, putting a few of these things together, we can generate a lower and upper bound for a 95% confidence interval:
Lower Bound (X -1.95s/n.5) =AVERAGE(A2:A10)-1.96*STDEV(A2:A10)/SQRT(COUNT(A2:A10))
Upper bound (X + 1.95s/n.5) =AVERAGE(A2:A10)+1.96*STDEV(A2:A10)/SQRT(COUNT(A2:A10))
Similarly, if we want the z statistic for a hypothesis test that our sample mean is 100, we could just use: =(AVERAGE(A2:A10) - 100)/(STDEV(A2:A10)/SQRT(COUNT(A2:A10)))
Notice all of the parentheses above. You can never use too many parentheses with Excel; it's extremely important that things get calculated in the right order.
One
trick that you may remember about
a sample proportion: If the "no's" in the sample are zero and the
"yesses" are 1, then the proportion
of "yesses"
in the sample
is the same as the sample average. So, for a 95% confidence interval for a sample proportion, we can use:
Lower Bound (p - 1.96[p(1-p)/n].5) =AVERAGE(A2:A10)-1.96*SQRT((AVERAGE(A2:A10)*(1-AVERAGE(A2:A10)/COUNT(A2:A10))
Upper bound (p - 1.96[p(1-p)/n].5) =AVERAGE(A2:A10)+1.96*SQRT((AVERAGE(A2:A10)*(1-AVERAGE(A2:A10)/ COUNT(A2:A10))
Similarly, for the test statistic of the null hypothesis that p is, say, 0.5, we could use:
SQRT((AVERAGE(A2:A10)*(1-AVERAGE(A2:A10)/COUNT(A2:A10))
Again, make ample use of parentheses. There is no such thing as using too many.
What I would like you to do with all of this:
Make a table in rows J through O. The top row of the table should contain the name or a description of one of the variables in columns B through G. Then, in the second row, list the sample mean. In the third row, list the lower bound of a 95% confidence interval for the sample mean. In the fourth row, list a 95% upper bound.
Be sure to note that the screening questions require use of the method for a sample proportion.
Hand in a printout showing the table, and one showing the codes. |