E471: Econometric Theory and Practice I Spring 2020

Assignment 2

Due: Thursday, February 27, 2020, 1:00 pm

Instructions:

• Please upload an electronic copy of your answers to Canvas, combining all results in one PDF or Word file. If any part is handwritten, please scan it and include it with the rest of the answers.

• Please also upload your code to Canvas before the due time. The code accounts for 50% of the points of the empirical questions.

• You are allowed to collaborate in groups, but you are required to write up answers and code independently. Direct copying will be treated as cheating. Please write the names of your collaborators, if any, at the beginning of your work.

Questions:

1. This question is a sequel to Question 1 in Assignment 1. Recall that you downloaded

the file Assignment1Data.csv, which contains three series: CITCRP, MARKET, and

RKFREE with data from January 1978 to December 1987.

Recall the regression

$$r_{c,i} - r_{f,i} = \beta_0 + \beta_1 (r_{m,i} - r_{f,i}) + \varepsilon_i. \qquad (1)$$

The capital asset pricing model (CAPM) suggests that β0 should be zero. Using the R

chunks from our lectures, extend your program to:

(a) compute a p-value for the null hypothesis H0 : β0 = 0;

(b) construct 90%, 95%, and 99% confidence intervals for β0 and β1.
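Parts (a) and (b) can be sketched in R as follows. Because Assignment1Data.csv is not reproduced here, the sketch uses simulated excess returns; the data frame `capm` and its columns `ex_c` and `ex_m` are hypothetical names. The lecture chunks may compute these quantities by hand rather than through `summary()` and `confint()`, so treat this as one possible route, not the expected solution.

```r
# Minimal sketch with SIMULATED data (assumption: not the assignment data).
set.seed(471)
n    <- 120                                  # 10 years of monthly observations
ex_m <- rnorm(n, mean = 0.005, sd = 0.05)    # hypothetical market excess return
ex_c <- 0.5 * ex_m + rnorm(n, sd = 0.04)     # hypothetical asset excess return
capm <- data.frame(ex_c = ex_c, ex_m = ex_m)

fit <- lm(ex_c ~ ex_m, data = capm)
est <- coef(summary(fit))                    # estimates, std. errors, t values, p-values

# (a) two-sided p-value for H0: beta0 = 0, computed from the t statistic
t0 <- est["(Intercept)", "Estimate"] / est["(Intercept)", "Std. Error"]
p0 <- 2 * pt(-abs(t0), df = df.residual(fit))

# (b) confidence intervals for beta0 and beta1 at three levels
levels <- c(0.90, 0.95, 0.99)
cis <- lapply(levels, function(l) confint(fit, level = l))
names(cis) <- c("90%", "95%", "99%")

print(p0)
print(cis)
```

The p-value here uses the t distribution with n − 2 degrees of freedom; with a normal approximation one would replace `pt` by `pnorm`.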

2. Suppose that E[U|X] = 0, E[X²] = μXX, and V[U|X] = σ². We used the following two results in our derivations for the OLS estimator:

(a) Show that E[UX] = 0.

(b) Show that V[UX] = E[U²X²] = σ²μXX.
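Both results follow from the law of iterated expectations. One way to organize the argument, as a sketch, uses the fact that E[U²|X] = V[U|X] + (E[U|X])² = σ² under the stated assumptions:

```latex
\begin{aligned}
E[UX] &= E\big[\,E[UX \mid X]\,\big] = E\big[\,X\,E[U \mid X]\,\big] = 0, \\
V[UX] &= E[U^2X^2] - \big(E[UX]\big)^2 = E[U^2X^2] \\
      &= E\big[\,X^2\,E[U^2 \mid X]\,\big] = E[X^2\sigma^2] = \sigma^2\mu_{XX}.
\end{aligned}
```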

3. Consider the following regression model

$$Y_i = \beta X_i + U_i, \quad \beta = 1, \quad E[X_i^2] = \frac{1}{4}, \quad V(U_i \mid X_i) = 1,$$

and the sample size is n = 100. We are interested in testing the null hypothesis that

β = βH versus the alternative that β ≠ βH using a t-statistic of the form

$$t = \frac{\hat{\beta} - \beta_H}{\hat{\sigma}_{\hat{\beta}}}.$$

(a) What is the correct critical value to guarantee that the hypothesis test has a type-I

error of 5%?

(b) Define the type-II error of a hypothesis test.

(c) Suppose an econometrician tests the “false” hypothesis that β = βH = 1.2 (the

“truth” is that β = 1). What is the type-II error associated with this test?

(d) Repeat the calculation for βH = 1.6, βH = 1.8, and βH = 2.

(e) What is the power of a test?

(f) True or False: The t-test has less power against alternative hypotheses that are

far away from the null hypothesis. Hint: use your previous results to answer this

question.
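The calculations in parts (c) to (f) can be sketched numerically in R. The sketch relies on two assumptions that the problem does not state: a large-sample normal approximation for the estimator, and the estimated standard error being replaced by its population value sqrt(σ² / (n E[X²])). The function name `type2` and the other variable names are hypothetical.

```r
# Sketch under a large-sample normal approximation (an assumption, not given in the problem).
beta   <- 1          # true coefficient
n      <- 100        # sample size
mu_xx  <- 1 / 4      # E[X_i^2]
sigma2 <- 1          # V(U_i | X_i)

se   <- sqrt(sigma2 / (n * mu_xx))   # approximate std. deviation of beta-hat
crit <- qnorm(0.975)                 # two-sided 5% critical value

# Probability of NOT rejecting H0: beta = beta_H when the truth is beta
type2 <- function(beta_H) {
  shift <- (beta - beta_H) / se      # mean of the t statistic under the truth
  pnorm(crit - shift) - pnorm(-crit - shift)
}

beta_H <- c(1.2, 1.6, 1.8, 2)
result <- data.frame(beta_H = beta_H,
                     type2  = type2(beta_H),
                     power  = 1 - type2(beta_H))
print(result)
```

Comparing the `type2` and `power` columns across the four values of βH is one way to check an answer to part (f).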

4. You are working for a life insurance company and are preparing for a briefing of the

board. Your economic intuition tells you that the best predictor of life insurance holdings

is income. You gather the relevant data (family life insurance and family income, both

in thousands of dollars) and want to analyze it, running a regression

$$\text{lifeins}_i = \beta_0 + \beta_1\,\text{income}_i + u_i.$$

The data set is provided in the file Assignment2Data.csv.

(a) Estimate the above regression model. What is your point estimate for β1? What is

the interpretation of that estimate?

(b) Provide a 95% confidence interval for the coefficient on income.

(c) One of the managers suggests that the industry rule of thumb is that people buy five

dollars of life insurance for each additional dollar of their income. Another manager

disagrees and says it could be more or less. You want to examine the difference of

opinion.

What null and alternative hypotheses would you use to discriminate between these two views? Test the null hypothesis with a type-I error of 5% and interpret the results.

(d) Calculate a p-value for the test in (c).

(e) The company wants to offer life insurance to low-income households. The chairman asks you how much life insurance a household with an income of 20,000 dollars would buy. Calculate a point prediction.

(f) Explain why it is better to consider an interval prediction rather than a point

prediction.

(g) Calculate 90%, 95%, and 99% prediction intervals for the life insurance holding of

a family that earns 20,000 dollars.
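Parts (c) to (g) can be sketched in R as follows. Assignment2Data.csv is not reproduced here, so the sketch simulates data; the column names `lifeins` and `income` are assumptions, with both variables measured in thousands of dollars as the question describes. The intervals returned by `predict()` rest on homoskedastic, normally distributed errors, which may differ from the construction used in the lectures.

```r
# Minimal sketch with SIMULATED data (assumption: column names lifeins and income,
# both in thousands of dollars, as described in the question).
set.seed(2020)
n       <- 50
income  <- runif(n, min = 10, max = 100)
lifeins <- 5 + 4 * income + rnorm(n, sd = 30)
dat     <- data.frame(lifeins = lifeins, income = income)

fit <- lm(lifeins ~ income, data = dat)

# (c)-(d) two-sided t test of H0: beta1 = 5 against H1: beta1 != 5
b1   <- coef(fit)["income"]
se1  <- coef(summary(fit))["income", "Std. Error"]
tval <- (b1 - 5) / se1
pval <- 2 * pt(-abs(tval), df = df.residual(fit))

# (e) point prediction and (g) prediction intervals at income = 20 (20,000 dollars)
new  <- data.frame(income = 20)
pred <- sapply(c(0.90, 0.95, 0.99), function(l)
  predict(fit, newdata = new, interval = "prediction", level = l))
rownames(pred) <- c("fit", "lwr", "upr")
colnames(pred) <- c("90%", "95%", "99%")

print(c(t = unname(tval), p = unname(pval)))
print(pred)
```

Note that income enters as 20, not 20000, because the data are recorded in thousands of dollars.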

5. The goal is to replicate Table 2 of Acemoglu, Johnson, and Robinson (AER, 2001). You

can find the article at

http://www.aeaweb.org/articles.php?doi=10.1257/aer.91.5.1369

and the data at

http://economics.mit.edu/faculty/acemoglu/data/ajr2001

Download the zip file for the replication of Table 2, and extract the file maketable2.dta

into your working directory. The data set is provided in Stata format, which can be read

by R (see below).

(a) Load the data using the following R chunk.

wholeworld = foreign::read.dta("maketable2.dta")

head(wholeworld)

## shortnam africa lat_abst avexpr logpgp95 other asia

## 1 AFG 0 0.3667 NA NA 0 1

## 2 AGO 1 0.1367 5.364 7.771 0 0

## 3 ARE 0 0.2667 7.182 9.804 0 1

## 4 ARG 0 0.3778 6.386 9.133 0 0

## 5 ARM 0 0.4444 NA 7.682 0 1

## 6 AUS 0 0.3000 9.318 9.898 1 0

## loghjypl baseco

## 1 NA NA

## 2 -3.4112 1

## 3 NA NA

## 4 -0.8723 1

## 5 NA NA

## 6 -0.1708 1

(b) How many observations are in the “whole world” sample? What does “NA” mean?

(c) What is the “base sample” considered in the paper? (you have to read the text,

e.g., section II.A., to answer this question!) The following R chunk generates the

base sample:

basesample <- wholeworld[!is.na(wholeworld$baseco), ]  # keep rows where baseco is not NA

head(basesample)

## shortnam africa lat_abst avexpr logpgp95 other asia

## 2 AGO 1 0.1367 5.364 7.771 0 0

## 4 ARG 0 0.3778 6.386 9.133 0 0

## 6 AUS 0 0.3000 9.318 9.898 1 0

## 12 BFA 1 0.1444 4.455 6.846 0 0

## 13 BGD 0 0.2667 5.136 6.877 0 1

## 16 BHS 0 0.2683 7.500 9.285 0 0

## loghjypl baseco

## 2 -3.4112 1

## 4 -0.8723 1

## 6 -0.1708 1

## 12 -3.5405 1

## 13 -2.0636 1

## 16 NA 1

(d) For the whole world sample and the base sample, generate a scatter plot of log per

capita GDP (y-axis) versus expropriation risk (x-axis).

(e) Here is an R chunk for the estimation of specification (1) in Table 2.

olsspec1 <- lm(wholeworld$logpgp95 ~ wholeworld$avexpr)

summary(olsspec1)

##

## Call:

## lm(formula = wholeworld$logpgp95 ~ wholeworld$avexpr)

##

## Residuals:

## Min 1Q Median 3Q Max

## -1.902 -0.316 0.138 0.422 1.441

##

## Coefficients:

## Estimate Std. Error t value Pr(>|t|)

## (Intercept) 4.6261 0.3006 15.4 <2e-16 ***

## wholeworld$avexpr 0.5319 0.0406 13.1 <2e-16 ***

## ---

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##

## Residual standard error: 0.718 on 109 degrees of freedom

## (52 observations deleted due to missingness)

## Multiple R-squared: 0.611, Adjusted R-squared: 0.608

## F-statistic: 171 on 1 and 109 DF, p-value: <2e-16

How many observations are used in this regression? Is this fewer than in the whole world sample, and if so, why? Does the number of observations used in your estimation match the number of observations reported in the paper?

(f) Now write a chunk of R code that estimates specification (2) in Table 2. Do you

get the same point estimate? Do you get the same standard error estimate?

(g) Follow the lecture slides and compute White’s heteroskedasticity-consistent standard errors. Are they bigger or smaller than those computed under the homoskedasticity assumption?

(h) Now replicate the estimation results for specifications (3)-(8). Report point estimates, standard error estimates, R², and number of coefficients up to four significant digits (the paper only reports two significant digits).
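For part (g), White's standard errors can be computed by hand from the sandwich formula (X'X)⁻¹ (Σ eᵢ² xᵢxᵢ') (X'X)⁻¹. The sketch below uses simulated heteroskedastic data, not the AJR data, and implements the basic version without a degrees-of-freedom correction (often called HC0); the lecture slides may use a rescaled variant, so the variable names and this choice are assumptions.

```r
# Minimal sketch with SIMULATED heteroskedastic data (assumption: not the AJR data).
set.seed(91)
n <- 200
x <- runif(n, 3, 10)
y <- 4.6 + 0.5 * x + rnorm(n, sd = 0.1 * x)   # error std. deviation grows with x
fit <- lm(y ~ x)

X <- model.matrix(fit)        # n x 2 regressor matrix including the intercept
e <- resid(fit)               # OLS residuals

bread <- solve(crossprod(X))  # (X'X)^{-1}
meat  <- crossprod(X * e)     # sum_i e_i^2 x_i x_i' (row i of X scaled by e_i)
V_hc0 <- bread %*% meat %*% bread

se_white <- sqrt(diag(V_hc0))
se_homo  <- coef(summary(fit))[, "Std. Error"]
print(cbind(homoskedastic = se_homo, white = se_white))
```

The same steps apply to a fitted model from the assignment data: replace `fit` with the `lm` object for the specification of interest.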
