# ECO 480 Econometrics I: Problem Set 2

Due: Wednesday, October 14, 2015 (beginning of class)

Instructions: The problem sets are designed to be difficult and very time-intensive, so plan ahead. The problem sets consist of solving theoretical problems and analyzing real data. You may discuss the questions with your classmates, but you are required to hand in your own independently written solutions. For problems that require you to use Stata, submit independently written do-files and log-files. No late work will be accepted, and I do NOT accept electronic copies. All the data necessary for the problem set are available on UBlearns.

Important: It is extremely important to write a clean, well-commented program for transparency and replication purposes. In any empirical work, you should always be able to reproduce your results from the raw data to support your claims.

What to hand in: A typed write-up answering the assigned questions and interpreting your findings, plus the do-file and log-file for problems that require you to use Stata. For questions involving data analysis, you will NOT get any credit if you do not provide your program code. You may NOT use Excel.

1. Suppose the following equation describes the relationship between the average number of classes missed during a semester (missed) and the distance from school (distance, measured in miles) (Total 4 points):

missed = 3 + 0.2 distance

a. Sketch this line, being sure to label the axes. How do you interpret the intercept in this equation? (2 points)

b. What is the average number of classes missed for someone who lives five miles away? (1 point)

c. What is the difference in the average number of classes missed for someone who lives 10 miles away and someone who lives 20 miles away? (1 point)
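The arithmetic in parts b and c can be checked with a few lines of Python. This is only a sketch: the function name `missed` simply mirrors the variable in the equation above, and the variable names are illustrative choices.

```python
def missed(distance):
    """Average classes missed for a student living `distance` miles from school."""
    return 3 + 0.2 * distance


# Part b: evaluate the line at distance = 5 miles.
missed_at_5 = missed(5)

# Part c: the difference between 20 and 10 miles depends only on the slope,
# because the intercept cancels: 0.2 * (20 - 10).
difference_20_vs_10 = missed(20) - missed(10)

print(missed_at_5, difference_20_vs_10)
```

Because the relationship is linear, the difference in part c is the same for any two distances that are 10 miles apart.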

2. Use COLLDIS.dta for this problem. A detailed description of the data is given in COLLDIS_Description.pdf. This contains data from a random sample of high school seniors interviewed in 1980 and re-interviewed in 1986. In this exercise, you will use these data to investigate the relationship between the number of completed years of education for young adults and the distance from each student's high school to the nearest four-year college. (Proximity to college lowers the cost of education, so that students who live closer to a four-year college should, on average, complete more years of higher education.) (Total 12 points)

a. Run a regression of years of completed education (ed) on distance to the nearest college (dist), where dist is measured in tens of miles. (For example, dist = 2 means that the distance is 20 miles.) What is the estimated intercept? What is the estimated slope? Use the estimated regression to answer this question: How does the average value of years of completed schooling change when colleges are built close to where students go to high school? (4 points)

b. Bob's high school was 20 miles from the nearest college. Predict Bob's years of completed education using the estimated regression. How would the prediction change if Bob lived 10 miles from the nearest college? (2 points)

c. If distance is measured in kilometers instead, what are the new estimates, and how do you interpret them? (4 points)

d. Beware the omitted variable. List five possible omitted variables. Are they all measurable? (2 points) [Hint: Omitted variables from the regression may or may not be measurable by econometricians.]
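For part c, it can help to see how changing the units of a regressor affects the least-squares estimates. The sketch below uses a small made-up data set (NOT values from COLLDIS.dta) and a hand-written helper `ols`, both of which are assumptions for illustration only. It relies on the fact that 1 mile is exactly 1.609344 km, so one unit of dist (10 miles) equals 16.09344 km.

```python
def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up illustration data, not taken from COLLDIS.dta.
dist_tens_of_miles = [0.5, 1.0, 2.0, 3.5, 6.0]
ed_years = [14.5, 14.0, 13.8, 13.5, 13.0]

KM_PER_TEN_MILES = 16.09344  # 10 miles expressed in kilometers

# Same distances, re-expressed in kilometers.
dist_km = [d * KM_PER_TEN_MILES for d in dist_tens_of_miles]

a_miles, b_miles = ols(dist_tens_of_miles, ed_years)
a_km, b_km = ols(dist_km, ed_years)

# The intercept is unchanged; the slope is divided by the conversion factor.
print(a_miles, a_km)
print(b_miles / KM_PER_TEN_MILES, b_km)
```

In Stata the same check would amount to generating a rescaled distance variable and re-running the regression; the fitted values and the R-squared do not change, only the units in which the slope is expressed.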

3. Use CPS08.dta for this problem. A detailed description of the data is given in CPS08_Description.pdf. In this exercise, you will investigate the relationship between a worker's age and earnings. (Generally, older workers have more job experience, leading to higher productivity and earnings.) (Total 10 points)

a. Report the mean, median, and standard deviation of workers' age and earnings. (3 points)

b. Run a regression of average hourly earnings (AHE) on age (Age). What is the estimated intercept? What is the estimated slope? Use the estimated regression to answer this question: How much do earnings increase as workers age by 1 year? (4 points)

c. Bob is a 26-year-old worker. Predict Bob's earnings using the estimated regression. Alexis is a 30-year-old worker. Predict Alexis's earnings using the estimated regression. (1 point)

d. Does age account for a large fraction of the variance in earnings across individuals? Why? (2 points)

4. Battery packs in electric go-carts need to last a fairly long time. The run-time (time until it needs to be recharged) of the battery packs made by a particular company is Normally distributed with a mean of 2 hours and a standard deviation of 20 minutes. (Total 3 points)

a. What percentage of these battery packs lasts longer than 3 hours? Show your work. (1 point)

b. What is the third quartile for the run-time distribution? Show your work. (1 point)

c. Battery packs that have a run-time in the highest 10% of the run-time distribution are highly sought after by go-cart drivers. How long does the battery pack have to last for it to fall in this highly sought-after class? Show your work. (1 point)
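Hand calculations for this problem (standardizing and using a Normal table) can be cross-checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library, with run-times expressed in minutes (mean 120, standard deviation 20); the variable names are illustrative.

```python
from statistics import NormalDist

# Run-time in minutes: mean 2 hours = 120 minutes, standard deviation 20 minutes.
run_time = NormalDist(mu=120, sigma=20)

# Part a: P(X > 180 minutes), i.e. more than 3 standard deviations above the mean.
p_over_3_hours = 1 - run_time.cdf(180)

# Part b: third quartile = 75th percentile.
third_quartile = run_time.inv_cdf(0.75)

# Part c: cutoff for the top 10% = 90th percentile.
top_10_cutoff = run_time.inv_cdf(0.90)

print(f"P(X > 180) = {p_over_3_hours:.5f}")
print(f"Q3 = {third_quartile:.2f} minutes")
print(f"90th percentile = {top_10_cutoff:.2f} minutes")
```

The write-up should still show the standardization step, z = (x − 120) / 20, since the problem asks for the work and not only the final numbers.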

5. In the language of government statistics, you are in the labor force if you are available for work and either working or actively seeking work. The unemployment rate is the proportion of the labor force (not of the entire population) who are unemployed. Here are data from the Current Population Survey (CPS) for the civilian population aged 25 years and over. The table entries are counts in thousands of people. You must show your work in answering the following questions. (Total 5 points)

| Highest education | Total population | In labor force | Employed |
|---|---|---|---|
| Did not finish high school | 28,021 | 12,623 | 11,552 |
| High school but no college | 59,844 | 38,210 | 36,249 |
| Some college, but no bachelor's degree | 46,777 | 33,928 | 32,429 |
| College graduate | 51,568 | 40,414 | 39,250 |

b. Find the probabilities of the following events (2 points):

i. Enough sleep and not enough exercise

ii. Not enough sleep and enough exercise

iii. Not enough sleep and not enough exercise

iv. For each of parts i, ii, and iii, state the rule that you used to find your answer.

8. Facebook provides a variety of statistics on their Web site that detail the growth and popularity of the site. One such statistic is that the average user has 130 friends. This distribution only takes integer values, so it is certainly not Normal. We will also assume it is skewed to the right with a standard deviation σ = 85. Consider an SRS of 30 Facebook users. You must show your work in answering the following questions. (Total 3 points)

a. What are the mean and standard deviation of the total number of friends in this sample? (1 point)

b. What are the mean and standard deviation of the mean number of friends per user? (1 point)

c. Use the central limit theorem to find the probability that the average number of friends in 30 Facebook users is greater than 140. (1 point)
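The rules for sums and means of independent draws, together with the central limit theorem, can be sketched as follows. This is a numerical check under the problem's stated assumptions (mean 130, standard deviation 85, n = 30), not a substitute for the written derivation; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 130, 85, 30

# Part a: total of n independent draws has mean n*mu and sd sqrt(n)*sigma.
total_mean = n * mu
total_sd = sqrt(n) * sigma

# Part b: sample mean has mean mu and sd sigma/sqrt(n).
xbar_mean = mu
xbar_sd = sigma / sqrt(n)

# Part c: by the CLT the sample mean is approximately Normal(mu, sigma/sqrt(n)).
p_mean_above_140 = 1 - NormalDist(xbar_mean, xbar_sd).cdf(140)

print(total_mean, round(total_sd, 2))
print(xbar_mean, round(xbar_sd, 2))
print(round(p_mean_above_140, 4))
```

Since the underlying distribution is right-skewed and n = 30 is moderate, the probability in part c is an approximation.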

9. North Carolina State University posts the grade distribution for its courses online. Students in one section of English 210 in the Fall 2008 semester received 33% As, 24% Bs, 18% Cs, 16% Ds, and 9% Fs. You must show your work in answering the following questions. (Total 3 points)

a. Using the common scale A=4, B=3, C=2, D=1, F=0, take X to be the grade of a randomly chosen English 210 student. Use the definition of the mean and standard deviation for discrete random variables to find the mean μ and the standard deviation σ of the grades in the course. (1 point)

b. English 210 is a large course. We can take the grades of a simple random sample of 50 students to be independent of each other. If x̄ is the average of these 50 grades, what are the mean and standard deviation of x̄? (1 point)

c. What is the probability P(x̄ ≥ 3) that the grade point average for 50 randomly chosen English 210 students is a B or better? (1 point)
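The definitions used here are that the mean of a discrete random variable is the probability-weighted sum of its values, and the variance is the probability-weighted sum of squared deviations from that mean. A minimal sketch of those definitions applied to the grade distribution above is given below; the dictionary `grade_dist` and the other names are illustrative, and part c uses the Normal approximation for the sample mean.

```python
from math import sqrt
from statistics import NormalDist

# Grade points mapped to their probabilities (A=4, B=3, C=2, D=1, F=0).
grade_dist = {4: 0.33, 3: 0.24, 2: 0.18, 1: 0.16, 0: 0.09}

# Part a: mean and standard deviation from the definitions.
mean = sum(x * p for x, p in grade_dist.items())
variance = sum((x - mean) ** 2 * p for x, p in grade_dist.items())
sd = sqrt(variance)

# Part b: the average of n = 50 independent grades.
n = 50
xbar_mean = mean
xbar_sd = sd / sqrt(n)

# Part c: Normal approximation to P(sample mean >= 3).
p_b_or_better = 1 - NormalDist(xbar_mean, xbar_sd).cdf(3)

print(round(mean, 2), round(sd, 4))
print(round(xbar_sd, 4), round(p_b_or_better, 4))
```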

10. A $1 bet in a state lottery's Pick 3 game pays $500 if the three-digit number you choose exactly matches the winning number, which is drawn at random. Here is the distribution of the payoff X:

| Payoff X | $0 | $500 |
|---|---|---|
| Probability | 0.999 | 0.001 |

a. What are the mean and standard deviation of X? (1 point)

b. Joe buys a Pick 3 ticket twice a week. What does the law of large numbers say about the average payoff Joe receives from his bets? (1 point)

c. What does the central limit theorem say about the distribution of Joe's average payoff after 104 bets in a year? (1 point)

d. Joe comes out ahead for the year if his average payoff is greater than $1 (the amount he spent on each ticket). What is the probability that Joe ends the year ahead? (1 point)

11. A selective college would like to have an entering class of 950 students. Because not all students who are offered admission accept, the college admits more than 950 students. Past experience shows that about 75% of the students admitted will accept. The college decides to admit 1,200 students. Assuming that students make their decisions independently, the number who accept has the B(1200, 0.75) distribution. If this number is less than 950, the college will admit students from its waiting list. You must show your work in answering the following questions. (Total 4 points)

a. What are the mean and the standard deviation of the number X of students who accept? (1 point)

b. Use the Normal approximation to find the probability that at least 800 students accept. (1 point)

c. The college does not want more than 950 students. What is the probability that more than 950 will accept? (1 point)

d. If the college decides to increase the number of admission offers to 1,300, what is the probability that more than 950 will accept? (1 point)
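The Normal approximation to a binomial count replaces B(n, p) with a Normal distribution having mean np and standard deviation sqrt(np(1 − p)). The sketch below applies that approximation using the 75% acceptance rate stated in the problem. The helper `normal_approx` is a hypothetical name, and no continuity correction is applied, which is an assumption; a course that uses the correction would shift each cutoff by 0.5.

```python
from math import sqrt
from statistics import NormalDist

P_ACCEPT = 0.75  # acceptance rate stated in the problem


def normal_approx(n, p):
    """Normal distribution with the same mean and sd as a B(n, p) count."""
    return NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))


# Part a: mean and standard deviation of the number who accept out of 1,200.
offers_1200 = normal_approx(1200, P_ACCEPT)
mean_accept, sd_accept = offers_1200.mean, offers_1200.stdev

# Part b: P(X >= 800), approximated without continuity correction.
p_at_least_800 = 1 - offers_1200.cdf(800)

# Part c: P(X > 950) with 1,200 offers.
p_over_950_with_1200 = 1 - offers_1200.cdf(950)

# Part d: P(X > 950) with 1,300 offers.
p_over_950_with_1300 = 1 - normal_approx(1300, P_ACCEPT).cdf(950)

print(mean_accept, sd_accept)
print(p_at_least_800, p_over_950_with_1200, p_over_950_with_1300)
```

Comparing parts c and d shows how sensitive the chance of over-enrollment is to a modest increase in the number of offers.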

12. Here is a simple probability model for multiple-choice tests. Suppose that each student has probability p of correctly answering a question chosen at random from a universe of possible questions. (A strong student has a higher p than a weak student.) The correctness of an answer to a question is independent of the correctness of answers to other questions. Jodi is a good student for whom p = 0.88. You must show your work in answering the following questions. (Total 5 points)

a. Use the Normal approximation to find the probability that Jodi scores 85% or lower on a 100-question test. (1 point)

b. If the test contains 250 questions, what is the probability that Jodi will score 85% or lower? (1 point)

c. How many questions must the test contain in order to reduce the standard deviation of Jodi's proportion of correct answers to half its value for a 100-item test? (2 points)

d. Lisa is a weaker student for whom p = 0.72. Does the answer you gave in part c for the standard deviation of Jodi's score apply to Lisa's standard deviation also? Why or why not? (1 point)

13. According to genetic theory, the blossom color in the second generation of a certain cross of sweet peas should be red or white in a 3:1 ratio. That is, each plant has probability ¾ of having red blossoms, and the blossom colors of separate plants are independent. Show your work. (Total 3 points)

a. What is the probability that exactly 9 out of 12 of these plants have red blossoms? (1 point)

b. What is the mean number of red-blossomed plants when 120 plants of this type are grown from seeds? (1 point)

c. What is the probability of obtaining at least 80 red-blossomed plants when 120 plants are grown from seeds? (1 point)
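The binomial probabilities in this problem can be computed exactly from the binomial formula, P(X = k) = C(n, k) p^k (1 − p)^(n − k). The sketch below does so with `math.comb`; the helper `binom_pmf` is a hypothetical name. Part c is computed as an exact sum here, which is an assumption about the intended method, since the problem may expect the Normal approximation instead.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


P_RED = 0.75

# Part a: exactly 9 red-blossomed plants out of 12.
p_exactly_9_of_12 = binom_pmf(9, 12, P_RED)

# Part b: mean number of red-blossomed plants among 120 is n * p.
mean_red_of_120 = 120 * P_RED

# Part c: P(X >= 80) for n = 120, summed exactly over k = 80, ..., 120.
p_at_least_80_of_120 = sum(binom_pmf(k, 120, P_RED) for k in range(80, 121))

print(round(p_exactly_9_of_12, 4), mean_red_of_120, round(p_at_least_80_of_120, 4))
```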

COLLDIST description

Documentation for CollegeDistance Data

These data are taken from the High School and Beyond survey conducted by the Department of Education in 1980, with a follow-up in 1986. The survey included students from approximately 1100 high schools.

The data used here were supplied by Professor Cecilia Rouse of Princeton University and were used in her paper "Democratization or Diversion? The Effect of Community Colleges on Educational Attainment," Journal of Business and Economic Statistics, April 1995, Vol. 12, No. 2, pp. 217-224.

The data in CollegeDistance exclude students in the western states. The data in CollegeDistanceWest include only those students in the western states.

Series in Data Set:

ed: Years of Education Completed (See below)

female: 1 = Female / 0 = Male

black: 1 = Black / 0 = Not-Black

Hispanic: 1 = Hispanic / 0 = Not-Hispanic

bytest: Base Year Composite Test Score (these are achievement tests given to high school seniors in the sample)

dadcoll: 1 = Father is a College Graduate / 0 = Father is not a College Graduate

momcoll: 1 = Mother is a College Graduate / 0 = Mother is not a College Graduate

incomehi: 1 = Family Income > $25,000 per year / 0 = Income ≤ $25,000 per year

ownhome: 1 = Family Owns Home / 0 = Family Does not Own Home

urban: 1 = School in Urban Area / 0 = School not in Urban Area

cue80: County Unemployment Rate in 1980

stwmfg80: State Hourly Wage in Manufacturing in 1980

dist: Distance from 4yr College in 10's of miles

tuition: Avg. State 4yr College Tuition in $1000's

CPS08 description

Documentation for CPS08 Data

Each month the Bureau of Labor Statistics in the U.S. Department of Labor conducts the Current Population Survey (CPS), which provides data on labor force characteristics of the population, including the level of employment, unemployment, and earnings. Approximately 65,000 randomly selected U.S. households are surveyed each month. The sample is chosen by randomly selecting addresses from a database comprised of addresses from the most recent decennial census augmented with data on new housing units constructed after the last census. The exact random sampling scheme is rather complicated (first small geographical areas are randomly selected, then housing units within these areas are randomly selected); details can be found in the Handbook of Labor Statistics and are described on the Bureau of Labor Statistics website (www.bls.gov).

The survey conducted each March is more detailed than in other months and asks questions about earnings during the previous year. The file CPS08 contains the data for 2008 (from the March 2009 survey). These data are for full-time workers, defined as workers employed more than 35 hours per week for at least 48 weeks in the previous year. Data are provided for workers whose highest educational achievement is (1) a high school diploma, and (2) a bachelor's degree.

Series in Data Set:

FEMALE: 1 if female; 0 if male

YEAR: Year

AHE: Average Hourly Earnings

BACHELOR: 1 if worker has a bachelor's degree; 0 if worker has a high school degree
