TOC
- Abstract
- 1. Regression Specification
- 2. Estimation
  - 2.1 General Forms of OLS Estimators
  - 2.2 Simple Regression Model
  - 2.3 Partitioned Regression
  - 2.4 Efficiency
- 3. Properties of OLS Estimators
  - 3.1 OLS Assumptions
  - 3.2 Asymptotic Properties
- 4. WLS (GLS)
- 5. Hypothesis Testing
  - 5.1 T-test in regression
  - 5.2 F-test in regression
Abstract
The article discusses OLS regression and its properties. It starts with the classical linear model assumptions, under which OLS is the best linear unbiased estimator, then discusses the properties of OLS estimators, including unbiasedness, consistency, and asymptotic normality. It also covers hypothesis testing in regression, including t-tests and F-tests. Overall, the article provides a comprehensive overview of OLS regression and its properties.
Warm Up
Lunch Program
Denote math10 as the percentage of 10th-grade students passing a math exam, and denote lnprgm as the percentage of students eligible for the lunch program. The observations include over 1,000 schools (entries). Regressing math10 on lnprgm, the coefficient is negative while the R-squared is around 0.17.

Is the R-squared low? Does it affect the correctness of the model?

- It does not affect the correctness of the estimator; R-squared only measures how much of the dependent variable can be explained by the regressor(s) (independent variable(s)).
- For some kinds of regression, like time-series regression, R-squared is generally high because of time-series dependence. In other situations, however, it can be relatively low. 17% is not a very low value, in fact.

Why is the coefficient negative, and what does that mean?

- Taken at face value, it means the lunch program is harmful to students' academic performance, which is counterintuitive.
- The problem with using OLS here is omitted variable bias: a confounder such as student poverty plausibly raises lunch-program eligibility and lowers exam performance, so the error term is correlated with lnprgm and the zero conditional mean assumption fails.
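To make the omitted-variable story concrete, here is a minimal simulation sketch (the coefficients and the poverty confounder are made up for illustration, not taken from any actual school data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical data-generating process: poverty raises lunch-program
# eligibility and lowers pass rates, while the program itself helps a little.
poverty = rng.normal(size=n)
lnprgm = 0.8 * poverty + rng.normal(scale=0.5, size=n)
math10 = 0.2 * lnprgm - 1.0 * poverty + rng.normal(size=n)

# Short regression (omits poverty): the slope is biased downward.
X_short = np.column_stack([np.ones(n), lnprgm])
b_short = np.linalg.lstsq(X_short, math10, rcond=None)[0]

# Long regression (controls for poverty): the slope is near the true 0.2.
X_long = np.column_stack([np.ones(n), lnprgm, poverty])
b_long = np.linalg.lstsq(X_long, math10, rcond=None)[0]

print("short-regression slope:", b_short[1])  # negative
print("long-regression slope: ", b_long[1])   # roughly 0.2
```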
1. Regression Specification
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + u_i,$$
where $y_i$ is the dependent variable, $x_{i1}, \dots, x_{ik}$ are the regressors, and $u_i$ is the error term.
Matrix Form
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{u},$$
where $\mathbf{y}$ is $n \times 1$, $\mathbf{X}$ is $n \times (k+1)$ (with a column of ones for the intercept), $\boldsymbol{\beta}$ is $(k+1) \times 1$, and $\mathbf{u}$ is $n \times 1$.
Note: In the following content, all bold characters like $\mathbf{X}$ or $\mathbf{u}$ denote vectors / matrices, while italic characters like $x_i$ denote scalars.
2. Estimation
2.1 General Forms of OLS Estimators
OLS Estimator:
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$$
Objective Function:
$$\min_{\boldsymbol{\beta}} \; S(\boldsymbol{\beta}) = (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})'(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})$$
First-order condition:
$$\frac{\partial S}{\partial \boldsymbol{\beta}} = -2\mathbf{X}'(\mathbf{y} - \mathbf{X}\boldsymbol{\beta}) = \mathbf{0}$$
Supplement: Definitions of the vector derivatives used above: $\frac{\partial \mathbf{a}'\mathbf{b}}{\partial \mathbf{b}} = \mathbf{a}$ and $\frac{\partial \mathbf{b}'\mathbf{A}\mathbf{b}}{\partial \mathbf{b}} = (\mathbf{A} + \mathbf{A}')\mathbf{b}$.
Thus
$$\mathbf{X}'\mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{X}'\mathbf{y}$$
and then
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}.$$
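As a sanity check, here is a short sketch (synthetic data; all names are illustrative) that computes $\hat{\boldsymbol{\beta}}$ from the normal equations and compares it with a library least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta_true = np.array([1.0, 2.0, -0.5, 0.3])
y = X @ beta_true + rng.normal(size=n)

# Normal equations: X'X beta_hat = X'y. Solving the linear system is
# numerically safer than forming the explicit inverse (X'X)^{-1}.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with the library least-squares solver.
beta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(beta_hat, beta_lstsq))  # True
```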
2.2 Simple Regression Model
Estimations are
$$\hat{\beta}_1 = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}.$$
In the matrix form above, $\mathbf{X} = [\mathbf{1} \;\; \mathbf{x}]$, so
$$\mathbf{X}'\mathbf{X} = \begin{pmatrix} n & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{pmatrix}, \qquad \mathbf{X}'\mathbf{y} = \begin{pmatrix} \sum_i y_i \\ \sum_i x_i y_i \end{pmatrix}.$$
Thus
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} = \frac{1}{n\sum_i x_i^2 - \left(\sum_i x_i\right)^2}\begin{pmatrix} \sum_i x_i^2 \sum_i y_i - \sum_i x_i \sum_i x_i y_i \\ n\sum_i x_i y_i - \sum_i x_i \sum_i y_i \end{pmatrix}.$$
Note that
$$n\sum_i x_i y_i - \sum_i x_i \sum_i y_i = n\sum_i (x_i - \bar{x})(y_i - \bar{y}), \qquad n\sum_i x_i^2 - \Big(\sum_i x_i\Big)^2 = n\sum_i (x_i - \bar{x})^2.$$
The results using the matrix form are consistent with those using the simple form.
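A quick numerical check (again with synthetic data) that the scalar formulas and the matrix formula agree:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 1.5 + 0.7 * x + rng.normal(size=100)

# Scalar (simple-form) estimates.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Matrix-form estimates.
X = np.column_stack([np.ones_like(x), x])
b_mat = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose([b0, b1], b_mat))  # True
```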
2.3 Partitioned Regression
Partition the regressors as $\mathbf{X} = [\mathbf{X}_1 \;\; \mathbf{X}_2]$. Then (the Frisch-Waugh-Lovell result; see the numerical sketch after this list):

- Reg $\mathbf{y}$ on $\mathbf{X}_1$, get residual $\tilde{\mathbf{y}}$
- Reg $\mathbf{X}_2$ on $\mathbf{X}_1$, get residual $\tilde{\mathbf{X}}_2$ (or denoted by $\mathbf{M}_1 \mathbf{X}_2$)
- Reg $\tilde{\mathbf{y}}$ (or $\mathbf{y}$) on $\tilde{\mathbf{X}}_2$, get $\hat{\boldsymbol{\beta}}_2$

$\hat{\boldsymbol{\beta}}_2$ is the effect of $\mathbf{X}_2$ on $\mathbf{y}$ after $\mathbf{X}_1$ has been partialled or netted out.
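A minimal sketch (synthetic data) verifying the partitioned-regression result numerically:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)          # correlated with x1
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

X1 = np.column_stack([np.ones(n), x1])
X = np.column_stack([X1, x2])

def resid(A, b):
    """Residuals from regressing b on the columns of A."""
    return b - A @ np.linalg.lstsq(A, b, rcond=None)[0]

# Partial x1 (and the constant) out of both y and x2,
# then regress residual on residual.
y_t = resid(X1, y)
x2_t = resid(X1, x2)
b2_fwl = (x2_t @ y_t) / (x2_t @ x2_t)

# Coefficient on x2 from the full regression: identical.
b_full = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(b2_fwl, b_full[2]))  # True
```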
2.4 Efficiency
Mean Squared Error (MSE):
$$\mathrm{MSE}(\hat{\theta}) = E\big[(\hat{\theta} - \theta)^2\big] = \mathrm{Var}(\hat{\theta}) + \big(\mathrm{Bias}(\hat{\theta})\big)^2$$
An estimator with a lower MSE is more efficient; for unbiased estimators, this reduces to having a lower variance.
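A tiny Monte Carlo illustration of the decomposition, using a deliberately biased shrinkage estimator $0.9\bar{X}$ of a normal mean (the 0.9 factor is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(9)
theta, n, reps = 2.0, 20, 200_000

samples = rng.normal(loc=theta, size=(reps, n))
est = 0.9 * samples.mean(axis=1)          # biased shrinkage estimator

mse = np.mean((est - theta) ** 2)
var = est.var()
bias2 = (est.mean() - theta) ** 2
print(mse, var + bias2)                   # the two agree
```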
3. Properties of OLS Estimators
3.1 OLS Assumptions
1. Linear in parameters, namely linear in the $\beta_j$
    - It's free to do whatever we want with the $x_j$ themselves: logs, squares, etc.
2. Random sampling (the observations are i.i.d.)
    - It's needed for inference and somewhat irrelevant for the pure mechanics of OLS.
3. No perfect multicollinearity
    - Makes sure $\mathbf{X}'\mathbf{X}$ is nonsingular ($\mathrm{rank}(\mathbf{X}) = k + 1$).
    - Supplement: How to test multicollinearity (e.g., with variance inflation factors).
4. Zero conditional mean: $E[u \mid x_1, \dots, x_k] = 0$
    - It means we have properly specified the model such that there are no omitted variables.
    - However, this is always the problematic assumption with OLS, since there is no way to ever know whether it is valid or not.
5. Homoskedasticity: $\mathrm{Var}(u \mid x_1, \dots, x_k) = \sigma^2$
    - It means nothing for the mechanics of OLS, but ensures that the usual standard errors are valid.
6. Normality: $u \sim N(0, \sigma^2)$, independent of the regressors
    - It is irrelevant for the mechanics of OLS, but ensures that the sampling distribution of the $\hat{\beta}_j$ is normal, i.e. $\hat{\beta}_j \sim N\big(\beta_j, \mathrm{Var}(\hat{\beta}_j)\big)$.
Note that if the 6th assumption is satisfied, the 4th and 5th assumptions are consequently satisfied as well.
Implications:
- Under 1-6 (the classical linear model assumptions), OLS is the minimum variance unbiased estimator: it is efficient not only among linear estimators but among all unbiased estimators, including ones that use some nonlinear function of the $x_j$.
- Under 1-5 (the Gauss-Markov assumptions), OLS is BLUE (best linear unbiased estimator), best in the sense of lowest variance among linear unbiased estimators.
- Under 1-4, OLS is unbiased and consistent.
3.2 Asymptotic Properties
3.2.1 Unbiasedness
OLS estimators are unbiased under assumptions 1-4 above. Proof:
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'(\mathbf{X}\boldsymbol{\beta} + \mathbf{u}) = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u}$$
Then
$$E[\hat{\boldsymbol{\beta}}] = \boldsymbol{\beta} + E\big[(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u}\big].$$
If we assume $\mathbf{X}$ is not random (fixed in repeated samples), then
$$E[\hat{\boldsymbol{\beta}}] = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E[\mathbf{u}],$$
so $E[\mathbf{u}] = \mathbf{0}$ can promise unbiasedness. Else, with random $\mathbf{X}$, condition first:
$$E[\hat{\boldsymbol{\beta}} \mid \mathbf{X}] = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E[\mathbf{u} \mid \mathbf{X}] = \boldsymbol{\beta},$$
and by the law of iterated expectations $E[\hat{\boldsymbol{\beta}}] = \boldsymbol{\beta}$. Note that we need $E[\mathbf{u} \mid \mathbf{X}] = \mathbf{0}$ satisfied anyway (since the observations are i.i.d., this reduces to $E[u_i \mid \mathbf{x}_i] = 0$ for each $i$).
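A small Monte Carlo sketch (synthetic design, arbitrary parameter values) illustrating unbiasedness: the average of $\hat{\beta}_1$ across many samples is close to the true value.

```python
import numpy as np

rng = np.random.default_rng(4)
beta_true = np.array([1.0, 2.0])
n, reps = 50, 5000

slopes = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    u = rng.normal(size=n)              # E[u | x] = 0 by construction
    y = beta_true[0] + beta_true[1] * x + u
    X = np.column_stack([np.ones(n), x])
    slopes[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(slopes.mean())  # close to 2.0
```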
3.2.2 Consistency
Definition: for every $\epsilon > 0$, if
$$\lim_{n \to \infty} P\big(|W_n - \theta| > \epsilon\big) = 0,$$
we say $W_n$ converges in probability to $\theta$, i.e. $W_n \xrightarrow{p} \theta$ or $\mathrm{plim}(W_n) = \theta$. Then $W_n$ is a consistent estimator for $\theta$. Consistency is usually the most important property of an estimator.
Note the relation with the Law of Large Numbers: for i.i.d. $X_i$ with $E[X_i] = \mu$, the sample mean satisfies $\bar{X}_n \xrightarrow{p} \mu$.
To prove the OLS estimators' consistency (simple regression case):
$$\hat{\beta}_1 = \beta_1 + \frac{\frac{1}{n}\sum_i (x_i - \bar{x}) u_i}{\frac{1}{n}\sum_i (x_i - \bar{x})^2} \xrightarrow{p} \beta_1 + \frac{\mathrm{Cov}(x, u)}{\mathrm{Var}(x)} = \beta_1,$$
provided $\mathrm{Cov}(x, u) = 0$.
Supplement: Variance of $\hat{\boldsymbol{\beta}}$
Typically, calculate the conditional variance of $\hat{\boldsymbol{\beta}}$ (under homoskedasticity):
$$\mathrm{Var}(\hat{\boldsymbol{\beta}} \mid \mathbf{X}) = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,\mathrm{Var}(\mathbf{u} \mid \mathbf{X})\,\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}.$$
For simple regression, we already know that
$$\hat{\beta}_1 = \beta_1 + \frac{\sum_i (x_i - \bar{x}) u_i}{\sum_i (x_i - \bar{x})^2}.$$
The variance of the slope coefficient is thus
$$\mathrm{Var}(\hat{\beta}_1 \mid \mathbf{x}) = \frac{\sigma^2}{\sum_i (x_i - \bar{x})^2} = \frac{\sigma^2}{\mathrm{SST}_x},$$
which shrinks to zero as $n$ grows, consistent with $\hat{\beta}_1 \xrightarrow{p} \beta_1$.
Another proof, using the WLLN and Slutsky's theorem:
$$\hat{\boldsymbol{\beta}} = \boldsymbol{\beta} + \Big(\tfrac{1}{n}\mathbf{X}'\mathbf{X}\Big)^{-1}\Big(\tfrac{1}{n}\mathbf{X}'\mathbf{u}\Big), \qquad \tfrac{1}{n}\mathbf{X}'\mathbf{X} \xrightarrow{p} E[\mathbf{x}_i \mathbf{x}_i'] = \mathbf{Q}, \qquad \tfrac{1}{n}\mathbf{X}'\mathbf{u} \xrightarrow{p} E[\mathbf{x}_i u_i] = \mathbf{0},$$
so $\hat{\boldsymbol{\beta}} \xrightarrow{p} \boldsymbol{\beta} + \mathbf{Q}^{-1}\mathbf{0} = \boldsymbol{\beta}$.
Supplement: Slutsky's theorem:
Suppose $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{p} c$, where $c$ is a constant. Then

- $X_n + Y_n \xrightarrow{d} X + c$
- $Y_n X_n \xrightarrow{d} cX$, and $X_n / Y_n \xrightarrow{d} X / c$ provided $c \neq 0$
- (continuous mapping) suppose $g$ is a continuous function; then $g(X_n) \xrightarrow{d} g(X)$
Note that we can relax the 4th assumption to a weaker one, $E[u] = 0$ and $\mathrm{Cov}(x_j, u) = 0$ for each $j$, which still promises consistency. In this circumstance, $\mathrm{plim}(\hat{\beta}_j) = \beta_j$. But we cannot derive that $E[\hat{\beta}_j] = \beta_j$, and thus unbiasedness is not guaranteed.
Actually, Zero Conditional Mean is a stronger assumption: it implies that $u$ is uncorrelated with any function of the $x_j$, while $\mathrm{Cov}(x_j, u) = 0$ only means $u$ is uncorrelated with $x_j$ itself (not independent, though).
Supplement: Independence and Uncorrelatedness
- If two random variables are independent, then they are uncorrelated
- Random variables X and Y are uncorrelated iff $\mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y] = 0$
- Random variables X and Y are independent iff for any (bounded, measurable) functions $f$ and $g$, the random variables $f(X)$ and $g(Y)$ are uncorrelated
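A classic counterexample, checked numerically: with $X \sim N(0, 1)$ and $Y = X^2$, the pair is uncorrelated but clearly dependent, since the function $f(X) = X^2$ is perfectly correlated with $Y$:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=1_000_000)
y = x ** 2

print(np.corrcoef(x, y)[0, 1])        # ~0: uncorrelated
print(np.corrcoef(x ** 2, y)[0, 1])   # 1: a function of X is correlated with Y
```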
3.2.3 Asymptotic Normality
$$\sqrt{n}\,(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) \xrightarrow{d} N\big(\mathbf{0}, \; \mathbf{Q}^{-1}\boldsymbol{\Omega}\,\mathbf{Q}^{-1}\big)$$
by the CLT, where $\mathbf{Q} = E[\mathbf{x}_i \mathbf{x}_i']$ and $\boldsymbol{\Omega} = E[u_i^2 \mathbf{x}_i \mathbf{x}_i']$.
The expression $\mathbf{Q}^{-1}\boldsymbol{\Omega}\,\mathbf{Q}^{-1}$ is called a sandwich form. $\mathbf{Q}$ can be estimated by $\frac{1}{n}\sum_i \mathbf{x}_i \mathbf{x}_i'$ and $\boldsymbol{\Omega}$ can be estimated by $\frac{1}{n}\sum_i \hat{u}_i^2 \mathbf{x}_i \mathbf{x}_i'$.
Supplement: Multivariate CLT. Let $\mathbf{z}_i$ be i.i.d. random vectors with finite mean vector $\boldsymbol{\mu}$ and finite positive definite covariance $\boldsymbol{\Sigma}$. Then
$$\sqrt{n}\,(\bar{\mathbf{z}}_n - \boldsymbol{\mu}) \xrightarrow{d} N(\mathbf{0}, \boldsymbol{\Sigma}).$$
Since $\hat{\boldsymbol{\beta}} = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u}$, we have
$$\sqrt{n}\,(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) = \Big(\tfrac{1}{n}\mathbf{X}'\mathbf{X}\Big)^{-1} \tfrac{1}{\sqrt{n}}\mathbf{X}'\mathbf{u}.$$
Note that $\tfrac{1}{\sqrt{n}}\mathbf{X}'\mathbf{u} = \tfrac{1}{\sqrt{n}}\sum_i \mathbf{x}_i u_i$. Since the $\mathbf{x}_i u_i$ are i.i.d. and have zero expectation, then by the CLT
$$\tfrac{1}{\sqrt{n}}\sum_i \mathbf{x}_i u_i \xrightarrow{d} N(\mathbf{0}, \boldsymbol{\Omega}), \quad \text{where } \boldsymbol{\Omega} = E[u_i^2 \mathbf{x}_i \mathbf{x}_i'].$$
By the WLLN,
$$\tfrac{1}{n}\mathbf{X}'\mathbf{X} = \tfrac{1}{n}\sum_i \mathbf{x}_i \mathbf{x}_i' \xrightarrow{p} \mathbf{Q}.$$
Thus by Slutsky's theorem,
$$\sqrt{n}\,(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) \xrightarrow{d} \mathbf{Q}^{-1} N(\mathbf{0}, \boldsymbol{\Omega}) = N\big(\mathbf{0}, \mathbf{Q}^{-1}\boldsymbol{\Omega}\,\mathbf{Q}^{-1}\big).$$
Supplement: Slutsky theorem
If $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{p} c$ as $n \to \infty$, then:

- $X_n + Y_n \xrightarrow{d} X + c$
- $Y_n X_n \xrightarrow{d} cX$
- $X_n / Y_n \xrightarrow{d} X / c$ if $c \neq 0$
As for how to estimate the variance $\mathbf{Q}^{-1}\boldsymbol{\Omega}\,\mathbf{Q}^{-1}$:

- Heteroskedasticity: use the robust (White) sandwich estimator, $\widehat{\mathrm{Avar}}(\hat{\boldsymbol{\beta}}) = (\mathbf{X}'\mathbf{X})^{-1}\Big(\sum_i \hat{u}_i^2 \mathbf{x}_i \mathbf{x}_i'\Big)(\mathbf{X}'\mathbf{X})^{-1}$
- Homoskedasticity (i.e. $E[u_i^2 \mid \mathbf{x}_i] = \sigma^2$): then $\boldsymbol{\Omega} = \sigma^2 \mathbf{Q}$, the sandwich collapses to $\sigma^2 \mathbf{Q}^{-1}$, and the usual estimator is $\widehat{\mathrm{Var}}(\hat{\boldsymbol{\beta}}) = \hat{\sigma}^2 (\mathbf{X}'\mathbf{X})^{-1}$ with $\hat{\sigma}^2 = \frac{1}{n - k - 1}\sum_i \hat{u}_i^2$
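A sketch (synthetic heteroskedastic data) computing both estimators; the robust variant shown is the simplest (HC0) flavor of the sandwich formula above:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
x = rng.normal(size=n)
u = rng.normal(size=n) * (1 + np.abs(x))   # heteroskedastic errors
y = 1.0 + 0.5 * x + u

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
uhat = y - X @ beta_hat

# Classical (homoskedastic) variance estimate.
sigma2 = uhat @ uhat / (n - X.shape[1])
V_classical = sigma2 * XtX_inv

# Robust (HC0) sandwich variance estimate.
meat = (X * uhat[:, None] ** 2).T @ X
V_robust = XtX_inv @ meat @ XtX_inv

print("classical SE:", np.sqrt(np.diag(V_classical)))
print("robust SE:   ", np.sqrt(np.diag(V_robust)))   # larger here
```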
Quiz - True or False
If the error term is correlated with one of the regressors in a multiple linear regression model, then the OLS estimators of the coefficients in the model are generally biased, even in large samples.
Answer: True. Using the non-matrix form, write
$$y_i = \beta_0 + \beta_1 x_i + u_i,$$
and assume that $E[u_i] = 0$ and the observations are i.i.d. Then
$$\hat{\beta}_1 = \beta_1 + \frac{\sum_i (x_i - \bar{x}) u_i}{\sum_i (x_i - \bar{x})^2},$$
since $\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x}) y_i}{\sum_i (x_i - \bar{x})^2}$ and $\sum_i (x_i - \bar{x})(\beta_0 + \beta_1 x_i) = \beta_1 \sum_i (x_i - \bar{x})^2$.

- If $u$ is uncorrelated with $x$, that is $\mathrm{Cov}(x, u) = 0$ (and zero conditional mean holds), then $\sum_i (x_i - \bar{x}) u_i$ has zero conditional expectation, and thus $\hat{\beta}_1$ is unbiased.
- Else, $u$ is correlated with $x$, and then $\mathrm{plim}(\hat{\beta}_1) = \beta_1 + \frac{\mathrm{Cov}(x, u)}{\mathrm{Var}(x)} \neq \beta_1$, which means $\hat{\beta}_1$ is biased (indeed inconsistent), no matter how large the sample.
4. WLS (GLS)
Refer to the previous Econometrics I notes.
5. Hypothesis Testing
5.1 T-test in regression
Suppose one is fitting the model
$$y_i = \alpha + \beta x_i + \varepsilon_i.$$
We want to test the null hypothesis that the slope $\beta$ is equal to some specified value $\beta_0$. Let $\hat{\alpha}$ and $\hat{\beta}$ be the least-squares estimators. Then
$$t = \frac{\hat{\beta} - \beta_0}{\mathrm{SE}(\hat{\beta})},$$
where the standard error of the slope coefficient is
$$\mathrm{SE}(\hat{\beta}) = \frac{\sqrt{\frac{1}{n-2}\sum_{i=1}^n \hat{\varepsilon}_i^2}}{\sqrt{\sum_{i=1}^n (x_i - \bar{x})^2}}.$$
Thus, under the null hypothesis, $t$ has a Student's t distribution with $n - 2$ degrees of freedom. Another way to determine the $t$ statistic (for $\beta_0 = 0$) is
$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}},$$
where $r$ is the Pearson correlation coefficient between $x$ and $y$.
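A short sketch (synthetic data) computing the $t$ statistic both ways for $H_0: \beta = 0$ and confirming they coincide:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 40
x = rng.normal(size=n)
y = 0.3 * x + rng.normal(size=n)

# Least-squares fit and standard-error-based t statistic.
X = np.column_stack([np.ones(n), x])
a_hat, b_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - a_hat - b_hat * x
se_b = np.sqrt(resid @ resid / (n - 2)) / np.sqrt(np.sum((x - x.mean()) ** 2))
t_se = b_hat / se_b

# Correlation-based t statistic.
r = np.corrcoef(x, y)[0, 1]
t_r = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)

print(np.allclose(t_se, t_r))  # True
```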
5.2 F-test in regression
Consider two models, 1 and 2, where model 1 is the restricted model and model 2 is the unrestricted one. That is, model 1 has $p_1$ parameters and model 2 has $p_2$ parameters, where $p_1 < p_2$, and for any choice of parameters in model 1, the same regression curve can be achieved by some choice of the parameters of model 2.
One common context is deciding whether a model with more explanatory variables (unrestricted model) fits the data significantly better than a naive model (restricted model). Another common context is deciding whether there is a structural break in the data: the restricted model uses all data in one regression while the unrestricted model uses separate regressions for two different subsets of the data. The second application context is known as the Chow test.
If there are $n$ data points to estimate the parameters of both models, then the F statistic is given by
$$F = \frac{(\mathrm{RSS}_1 - \mathrm{RSS}_2) / (p_2 - p_1)}{\mathrm{RSS}_2 / (n - p_2)},$$
where $\mathrm{RSS}_i$ is the residual sum of squares of model $i$. Under the null hypothesis that model 2 does not provide a significantly better fit than model 1 (that is, the extra explanatory variables' coefficients are all zero), $F$ has an $F(p_2 - p_1, \; n - p_2)$ distribution. The null hypothesis is rejected if the $F$ calculated from the data is greater than the critical value of the F distribution for some desired false-rejection probability (e.g. 0.05).
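A sketch (synthetic data; scipy is assumed to be available for the p-value) testing whether extra regressors jointly improve the fit:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n = 120
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 0.8 * x1 + 0.4 * x2 + rng.normal(size=n)  # x3 is irrelevant

def rss(X, y):
    """Residual sum of squares from regressing y on the columns of X."""
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return resid @ resid

X1 = np.column_stack([np.ones(n), x1])             # restricted: p1 = 2
X2 = np.column_stack([np.ones(n), x1, x2, x3])     # unrestricted: p2 = 4
p1, p2 = X1.shape[1], X2.shape[1]

F = ((rss(X1, y) - rss(X2, y)) / (p2 - p1)) / (rss(X2, y) / (n - p2))
p_value = stats.f.sf(F, p2 - p1, n - p2)
print(F, p_value)  # x2 matters, so the null should be rejected
```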