TOC
- Abstract
- 1. Regression Specification
- 2. Estimation
  - 2.1 General Forms of OLS Estimators
  - 2.2 Simple Regression Model
  - 2.3 Partitioned Regression
  - 2.4 Efficiency
- 3. Properties of OLS Estimators
  - 3.1 OLS Assumptions
  - 3.2 Asymptotic Properties
- 4. WLS (GLS)
- 5. Hypothesis Testing
  - 5.1 T-test in regression
  - 5.2 F-test in regression
Abstract
The article discusses OLS regression and its properties. It starts with the classical linear model assumptions, under which OLS is the best linear unbiased estimator, then discusses the properties of OLS estimators, including unbiasedness, consistency, and asymptotic normality. It also covers hypothesis testing in regression, including t-tests and F-tests. Overall, the article provides a comprehensive overview of OLS regression and its properties.
Warm Up
Lunch Program
Denote math10 as the percentage of 10th-grade students passing a math exam, and denote lnprgm as the percentage of students eligible for the lunch program. The observations include over 1,000 schools (entries). Regressing math10 on lnprgm, the coefficient is negative while the R-squared is around 0.17.

Is the R-squared low? Does it affect the correctness of the model?

- It does not affect the correctness of the estimator; R-squared only measures how much of the dependent variable can be explained by the regressor(s) (independent variable(s)).
- For some kinds of regression, like time-series regression, R-squared is generally high because of time-series dependence. In other situations, however, it can be relatively low. 17% is not a very low value, in fact.

Why is the coefficient negative, and what does that mean?

- Taken at face value, it means the lunch program is harmful to students' academic performance, which is counterintuitive.
- The problem with using OLS here is omitted variable bias: a confounder such as student poverty plausibly raises lunch-program eligibility and lowers exam performance, so the error term is correlated with lnprgm and the zero conditional mean assumption fails.
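To make the omitted-variable story concrete, here is a minimal simulation sketch (the coefficients and the poverty confounder are made up for illustration, not taken from any actual school data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical data-generating process: poverty raises lunch-program
# eligibility and lowers pass rates, while the program itself helps a little.
poverty = rng.normal(size=n)
lnprgm = 0.8 * poverty + rng.normal(scale=0.5, size=n)
math10 = 0.2 * lnprgm - 1.0 * poverty + rng.normal(size=n)

# Short regression (omits poverty): the slope is biased downward.
X_short = np.column_stack([np.ones(n), lnprgm])
b_short = np.linalg.lstsq(X_short, math10, rcond=None)[0]

# Long regression (controls for poverty): the slope is near the true 0.2.
X_long = np.column_stack([np.ones(n), lnprgm, poverty])
b_long = np.linalg.lstsq(X_long, math10, rcond=None)[0]

print("short-regression slope:", b_short[1])  # negative
print("long-regression slope: ", b_long[1])   # roughly 0.2
```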
1. Regression Specification
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + u_i,$$
where $y_i$ is the dependent variable, $x_{i1}, \dots, x_{ik}$ are the regressors, and $u_i$ is the error term.
Matrix Form
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{u},$$
where $\mathbf{y}$ is $n \times 1$, $\mathbf{X}$ is $n \times (k+1)$ (with a column of ones for the intercept), $\boldsymbol{\beta}$ is $(k+1) \times 1$, and $\mathbf{u}$ is $n \times 1$.
Note: In the following content, all bold characters like $\mathbf{X}$ or $\mathbf{u}$ denote vectors / matrices, while italic characters like $x_i$ denote scalars.
2. Estimation
2.1 General Forms of OLS Estimators
OLS Estimator:
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$$
Objective Function:
$$\min_{\boldsymbol{\beta}} \; S(\boldsymbol{\beta}) = (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})'(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})$$
First-order condition:
$$\frac{\partial S}{\partial \boldsymbol{\beta}} = -2\mathbf{X}'(\mathbf{y} - \mathbf{X}\boldsymbol{\beta}) = \mathbf{0}$$
Supplement: Definitions of the vector derivatives used above: $\frac{\partial \mathbf{a}'\mathbf{b}}{\partial \mathbf{b}} = \mathbf{a}$ and $\frac{\partial \mathbf{b}'\mathbf{A}\mathbf{b}}{\partial \mathbf{b}} = (\mathbf{A} + \mathbf{A}')\mathbf{b}$.
Thus
$$\mathbf{X}'\mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{X}'\mathbf{y}$$
and then
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}.$$
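As a sanity check, here is a short sketch (synthetic data; all names are illustrative) that computes $\hat{\boldsymbol{\beta}}$ from the normal equations and compares it with a library least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta_true = np.array([1.0, 2.0, -0.5, 0.3])
y = X @ beta_true + rng.normal(size=n)

# Normal equations: X'X beta_hat = X'y. Solving the linear system is
# numerically safer than forming the explicit inverse (X'X)^{-1}.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with the library least-squares solver.
beta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(beta_hat, beta_lstsq))  # True
```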
2.2 Simple Regression Model
Estimations are
$$\hat{\beta}_1 = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}.$$
In the matrix form above, $\mathbf{X} = [\mathbf{1} \;\; \mathbf{x}]$, so
$$\mathbf{X}'\mathbf{X} = \begin{pmatrix} n & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{pmatrix}, \qquad \mathbf{X}'\mathbf{y} = \begin{pmatrix} \sum_i y_i \\ \sum_i x_i y_i \end{pmatrix}.$$
Thus
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} = \frac{1}{n\sum_i x_i^2 - \left(\sum_i x_i\right)^2}\begin{pmatrix} \sum_i x_i^2 \sum_i y_i - \sum_i x_i \sum_i x_i y_i \\ n\sum_i x_i y_i - \sum_i x_i \sum_i y_i \end{pmatrix}.$$
Note that
$$n\sum_i x_i y_i - \sum_i x_i \sum_i y_i = n\sum_i (x_i - \bar{x})(y_i - \bar{y}), \qquad n\sum_i x_i^2 - \Big(\sum_i x_i\Big)^2 = n\sum_i (x_i - \bar{x})^2.$$
The results using the matrix form are consistent with those using the simple form.
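A quick numerical check (again with synthetic data) that the scalar formulas and the matrix formula agree:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 1.5 + 0.7 * x + rng.normal(size=100)

# Scalar (simple-form) estimates.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Matrix-form estimates.
X = np.column_stack([np.ones_like(x), x])
b_mat = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose([b0, b1], b_mat))  # True
```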
2.3 Partitioned Regression
Partition the regressors as $\mathbf{X} = [\mathbf{X}_1 \;\; \mathbf{X}_2]$. Then (the Frisch-Waugh-Lovell result; see the numerical sketch after this list):

- Reg $\mathbf{y}$ on $\mathbf{X}_1$, get residual $\tilde{\mathbf{y}}$
- Reg $\mathbf{X}_2$ on $\mathbf{X}_1$, get residual $\tilde{\mathbf{X}}_2$ (or denoted by $\mathbf{M}_1 \mathbf{X}_2$)
- Reg $\tilde{\mathbf{y}}$ (or $\mathbf{y}$) on $\tilde{\mathbf{X}}_2$, get $\hat{\boldsymbol{\beta}}_2$

$\hat{\boldsymbol{\beta}}_2$ is the effect of $\mathbf{X}_2$ on $\mathbf{y}$ after $\mathbf{X}_1$ has been partialled or netted out.
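A minimal sketch (synthetic data) verifying the partitioned-regression result numerically:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)          # correlated with x1
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

X1 = np.column_stack([np.ones(n), x1])
X = np.column_stack([X1, x2])

def resid(A, b):
    """Residuals from regressing b on the columns of A."""
    return b - A @ np.linalg.lstsq(A, b, rcond=None)[0]

# Partial x1 (and the constant) out of both y and x2,
# then regress residual on residual.
y_t = resid(X1, y)
x2_t = resid(X1, x2)
b2_fwl = (x2_t @ y_t) / (x2_t @ x2_t)

# Coefficient on x2 from the full regression: identical.
b_full = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(b2_fwl, b_full[2]))  # True
```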
2.4 Efficiency
Mean Squared Error (MSE):
$$\mathrm{MSE}(\hat{\theta}) = E\big[(\hat{\theta} - \theta)^2\big] = \mathrm{Var}(\hat{\theta}) + \big(\mathrm{Bias}(\hat{\theta})\big)^2$$
An estimator with a lower MSE is more efficient; for unbiased estimators, this reduces to having a lower variance.
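A tiny Monte Carlo illustration of the decomposition, using a deliberately biased shrinkage estimator $0.9\bar{X}$ of a normal mean (the 0.9 factor is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(9)
theta, n, reps = 2.0, 20, 200_000

samples = rng.normal(loc=theta, size=(reps, n))
est = 0.9 * samples.mean(axis=1)          # biased shrinkage estimator

mse = np.mean((est - theta) ** 2)
var = est.var()
bias2 = (est.mean() - theta) ** 2
print(mse, var + bias2)                   # the two agree
```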
3. Properties of OLS Estimators
3.1 OLS Assumptions
1. Linear in parameters, namely linear in the $\beta_j$
    - It's free to do whatever we want with the $x_j$ themselves: logs, squares, etc.
2. Random sampling (the observations are i.i.d.)
    - It's needed for inference and somewhat irrelevant for the pure mechanics of OLS.
3. No perfect multicollinearity
    - Makes sure $\mathbf{X}'\mathbf{X}$ is nonsingular ($\mathrm{rank}(\mathbf{X}) = k + 1$).
    - Supplement: How to test multicollinearity (e.g., with variance inflation factors).
4. Zero conditional mean: $E[u \mid x_1, \dots, x_k] = 0$
    - It means we have properly specified the model such that there are no omitted variables.
    - However, this is always the problematic assumption with OLS, since there is no way to ever know whether it is valid or not.
5. Homoskedasticity: $\mathrm{Var}(u \mid x_1, \dots, x_k) = \sigma^2$
    - It means nothing for the mechanics of OLS, but ensures that the usual standard errors are valid.
6. Normality: $u \sim N(0, \sigma^2)$, independent of the regressors
    - It is irrelevant for the mechanics of OLS, but ensures that the sampling distribution of the $\hat{\beta}_j$ is normal, i.e. $\hat{\beta}_j \sim N\big(\beta_j, \mathrm{Var}(\hat{\beta}_j)\big)$.
Note that if the 6th assumption is satisfied, the 4th and 5th assumptions are consequently satisfied as well.
Implications:
- Under 1-6 (the classical linear model assumptions), OLS is the minimum variance unbiased estimator: it is efficient not only among linear estimators but among all unbiased estimators, including ones that use some nonlinear function of the $x_j$.
- Under 1-5 (the Gauss-Markov assumptions), OLS is BLUE (best linear unbiased estimator), best in the sense of lowest variance among linear unbiased estimators.
- Under 1-4, OLS is unbiased and consistent.
3.2 Asymptotic Properties
3.2.1 Unbiasedness
OLS estimators are unbiased under assumptions 1-4 above. Proof:
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'(\mathbf{X}\boldsymbol{\beta} + \mathbf{u}) = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u}$$
Then
$$E[\hat{\boldsymbol{\beta}}] = \boldsymbol{\beta} + E\big[(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u}\big].$$
If we assume $\mathbf{X}$ is not random (fixed in repeated samples), then
$$E[\hat{\boldsymbol{\beta}}] = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E[\mathbf{u}],$$
so $E[\mathbf{u}] = \mathbf{0}$ can promise unbiasedness. Else, with random $\mathbf{X}$, condition first:
$$E[\hat{\boldsymbol{\beta}} \mid \mathbf{X}] = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E[\mathbf{u} \mid \mathbf{X}] = \boldsymbol{\beta},$$
and by the law of iterated expectations $E[\hat{\boldsymbol{\beta}}] = \boldsymbol{\beta}$. Note that we need $E[\mathbf{u} \mid \mathbf{X}] = \mathbf{0}$ satisfied anyway (since the observations are i.i.d., this reduces to $E[u_i \mid \mathbf{x}_i] = 0$ for each $i$).
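A small Monte Carlo sketch (synthetic design, arbitrary parameter values) illustrating unbiasedness: the average of $\hat{\beta}_1$ across many samples is close to the true value.

```python
import numpy as np

rng = np.random.default_rng(4)
beta_true = np.array([1.0, 2.0])
n, reps = 50, 5000

slopes = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    u = rng.normal(size=n)              # E[u | x] = 0 by construction
    y = beta_true[0] + beta_true[1] * x + u
    X = np.column_stack([np.ones(n), x])
    slopes[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(slopes.mean())  # close to 2.0
```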
3.2.2 Consistency
Definition: for every $\epsilon > 0$, if
$$\lim_{n \to \infty} P\big(|W_n - \theta| > \epsilon\big) = 0,$$
we say $W_n$ converges in probability to $\theta$, i.e. $W_n \xrightarrow{p} \theta$ or $\mathrm{plim}(W_n) = \theta$. Then $W_n$ is a consistent estimator for $\theta$. Consistency is usually the most important property of an estimator.
Note the relation with the Law of Large Numbers: for i.i.d. $X_i$ with $E[X_i] = \mu$, the sample mean satisfies $\bar{X}_n \xrightarrow{p} \mu$.
To prove the OLS estimators' consistency (simple regression case):
$$\hat{\beta}_1 = \beta_1 + \frac{\frac{1}{n}\sum_i (x_i - \bar{x}) u_i}{\frac{1}{n}\sum_i (x_i - \bar{x})^2} \xrightarrow{p} \beta_1 + \frac{\mathrm{Cov}(x, u)}{\mathrm{Var}(x)} = \beta_1,$$
provided $\mathrm{Cov}(x, u) = 0$.
Supplement: Variance of $\hat{\boldsymbol{\beta}}$
Typically, calculate the conditional variance of $\hat{\boldsymbol{\beta}}$ (under homoskedasticity):
$$\mathrm{Var}(\hat{\boldsymbol{\beta}} \mid \mathbf{X}) = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,\mathrm{Var}(\mathbf{u} \mid \mathbf{X})\,\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}.$$
For simple regression, we already know that
$$\hat{\beta}_1 = \beta_1 + \frac{\sum_i (x_i - \bar{x}) u_i}{\sum_i (x_i - \bar{x})^2}.$$
The variance of the slope coefficient is thus
$$\mathrm{Var}(\hat{\beta}_1 \mid \mathbf{x}) = \frac{\sigma^2}{\sum_i (x_i - \bar{x})^2} = \frac{\sigma^2}{\mathrm{SST}_x},$$
which shrinks to zero as $n$ grows, consistent with $\hat{\beta}_1 \xrightarrow{p} \beta_1$.
Another proof, using the WLLN and Slutsky's theorem:
$$\hat{\boldsymbol{\beta}} = \boldsymbol{\beta} + \Big(\tfrac{1}{n}\mathbf{X}'\mathbf{X}\Big)^{-1}\Big(\tfrac{1}{n}\mathbf{X}'\mathbf{u}\Big), \qquad \tfrac{1}{n}\mathbf{X}'\mathbf{X} \xrightarrow{p} E[\mathbf{x}_i \mathbf{x}_i'] = \mathbf{Q}, \qquad \tfrac{1}{n}\mathbf{X}'\mathbf{u} \xrightarrow{p} E[\mathbf{x}_i u_i] = \mathbf{0},$$
so $\hat{\boldsymbol{\beta}} \xrightarrow{p} \boldsymbol{\beta} + \mathbf{Q}^{-1}\mathbf{0} = \boldsymbol{\beta}$.
Supplement: Slutsky's theorem:
Suppose $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{p} c$, where $c$ is a constant. Then

- $X_n + Y_n \xrightarrow{d} X + c$
- $Y_n X_n \xrightarrow{d} cX$, and $X_n / Y_n \xrightarrow{d} X / c$ provided $c \neq 0$
- (continuous mapping) suppose $g$ is a continuous function; then $g(X_n) \xrightarrow{d} g(X)$
Note that we can relax the 4th assumption to a weaker one, $E[u] = 0$ and $\mathrm{Cov}(x_j, u) = 0$ for each $j$, which still promises consistency. In this circumstance, $\mathrm{plim}(\hat{\beta}_j) = \beta_j$. But we cannot derive that $E[\hat{\beta}_j] = \beta_j$, and thus unbiasedness is not guaranteed.
Actually, Zero Conditional Mean is a stronger assumption: it implies that $u$ is uncorrelated with any function of the $x_j$, while $\mathrm{Cov}(x_j, u) = 0$ only means $u$ is uncorrelated with $x_j$ itself (not independent, though).
Supplement: Independence and Uncorrelatedness
- If two random variables are independent, then they are uncorrelated
- Random variables X and Y are uncorrelated iff $\mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y] = 0$
- Random variables X and Y are independent iff for any (bounded, measurable) functions $f$ and $g$, the random variables $f(X)$ and $g(Y)$ are uncorrelated
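A classic counterexample, checked numerically: with $X \sim N(0, 1)$ and $Y = X^2$, the pair is uncorrelated but clearly dependent, since the function $f(X) = X^2$ is perfectly correlated with $Y$:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=1_000_000)
y = x ** 2

print(np.corrcoef(x, y)[0, 1])        # ~0: uncorrelated
print(np.corrcoef(x ** 2, y)[0, 1])   # 1: a function of X is correlated with Y
```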
3.2.3 Asymptotic Normality
$$\sqrt{n}\,(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) \xrightarrow{d} N\big(\mathbf{0}, \; \mathbf{Q}^{-1}\boldsymbol{\Omega}\,\mathbf{Q}^{-1}\big)$$
by the CLT, where $\mathbf{Q} = E[\mathbf{x}_i \mathbf{x}_i']$ and $\boldsymbol{\Omega} = E[u_i^2 \mathbf{x}_i \mathbf{x}_i']$.
The expression $\mathbf{Q}^{-1}\boldsymbol{\Omega}\,\mathbf{Q}^{-1}$ is called a sandwich form. $\mathbf{Q}$ can be estimated by $\frac{1}{n}\sum_i \mathbf{x}_i \mathbf{x}_i'$ and $\boldsymbol{\Omega}$ can be estimated by $\frac{1}{n}\sum_i \hat{u}_i^2 \mathbf{x}_i \mathbf{x}_i'$.
Supplement: Multivariate CLT. Let $\mathbf{z}_i$ be i.i.d. random vectors with finite mean vector $\boldsymbol{\mu}$ and finite positive definite covariance $\boldsymbol{\Sigma}$. Then
$$\sqrt{n}\,(\bar{\mathbf{z}}_n - \boldsymbol{\mu}) \xrightarrow{d} N(\mathbf{0}, \boldsymbol{\Sigma}).$$
Since $\hat{\boldsymbol{\beta}} = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u}$, we have
$$\sqrt{n}\,(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) = \Big(\tfrac{1}{n}\mathbf{X}'\mathbf{X}\Big)^{-1} \tfrac{1}{\sqrt{n}}\mathbf{X}'\mathbf{u}.$$
Note that $\tfrac{1}{\sqrt{n}}\mathbf{X}'\mathbf{u} = \tfrac{1}{\sqrt{n}}\sum_i \mathbf{x}_i u_i$. Since the $\mathbf{x}_i u_i$ are i.i.d. and have zero expectation, then by the CLT
$$\tfrac{1}{\sqrt{n}}\sum_i \mathbf{x}_i u_i \xrightarrow{d} N(\mathbf{0}, \boldsymbol{\Omega}), \quad \text{where } \boldsymbol{\Omega} = E[u_i^2 \mathbf{x}_i \mathbf{x}_i'].$$
By the WLLN,
$$\tfrac{1}{n}\mathbf{X}'\mathbf{X} = \tfrac{1}{n}\sum_i \mathbf{x}_i \mathbf{x}_i' \xrightarrow{p} \mathbf{Q}.$$
Thus by Slutsky's theorem,
$$\sqrt{n}\,(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) \xrightarrow{d} \mathbf{Q}^{-1} N(\mathbf{0}, \boldsymbol{\Omega}) = N\big(\mathbf{0}, \mathbf{Q}^{-1}\boldsymbol{\Omega}\,\mathbf{Q}^{-1}\big).$$
Supplement: Slutsky theorem
If $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{p} c$ as $n \to \infty$, then:

- $X_n + Y_n \xrightarrow{d} X + c$
- $Y_n X_n \xrightarrow{d} cX$
- $X_n / Y_n \xrightarrow{d} X / c$ if $c \neq 0$
As for how to estimate the variance $\mathbf{Q}^{-1}\boldsymbol{\Omega}\,\mathbf{Q}^{-1}$:

- Heteroskedasticity: use the robust (White) sandwich estimator, $\widehat{\mathrm{Avar}}(\hat{\boldsymbol{\beta}}) = (\mathbf{X}'\mathbf{X})^{-1}\Big(\sum_i \hat{u}_i^2 \mathbf{x}_i \mathbf{x}_i'\Big)(\mathbf{X}'\mathbf{X})^{-1}$
- Homoskedasticity (i.e. $E[u_i^2 \mid \mathbf{x}_i] = \sigma^2$): then $\boldsymbol{\Omega} = \sigma^2 \mathbf{Q}$, the sandwich collapses to $\sigma^2 \mathbf{Q}^{-1}$, and the usual estimator is $\widehat{\mathrm{Var}}(\hat{\boldsymbol{\beta}}) = \hat{\sigma}^2 (\mathbf{X}'\mathbf{X})^{-1}$ with $\hat{\sigma}^2 = \frac{1}{n - k - 1}\sum_i \hat{u}_i^2$
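A sketch (synthetic heteroskedastic data) computing both estimators; the robust variant shown is the simplest (HC0) flavor of the sandwich formula above:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
x = rng.normal(size=n)
u = rng.normal(size=n) * (1 + np.abs(x))   # heteroskedastic errors
y = 1.0 + 0.5 * x + u

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
uhat = y - X @ beta_hat

# Classical (homoskedastic) variance estimate.
sigma2 = uhat @ uhat / (n - X.shape[1])
V_classical = sigma2 * XtX_inv

# Robust (HC0) sandwich variance estimate.
meat = (X * uhat[:, None] ** 2).T @ X
V_robust = XtX_inv @ meat @ XtX_inv

print("classical SE:", np.sqrt(np.diag(V_classical)))
print("robust SE:   ", np.sqrt(np.diag(V_robust)))   # larger here
```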
Quiz - True or False
If the error term is correlated with one of the regressors in a multiple linear regression model, then the OLS estimators of the coefficients in the model are generally biased, even in large samples.
Answer: True. Using the non-matrix form, write
$$y_i = \beta_0 + \beta_1 x_i + u_i,$$
and assume that $E[u_i] = 0$ and the observations are i.i.d. Then
$$\hat{\beta}_1 = \beta_1 + \frac{\sum_i (x_i - \bar{x}) u_i}{\sum_i (x_i - \bar{x})^2},$$
since $\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x}) y_i}{\sum_i (x_i - \bar{x})^2}$ and $\sum_i (x_i - \bar{x})(\beta_0 + \beta_1 x_i) = \beta_1 \sum_i (x_i - \bar{x})^2$.

- If $u$ is uncorrelated with $x$, that is $\mathrm{Cov}(x, u) = 0$ (and zero conditional mean holds), then $\sum_i (x_i - \bar{x}) u_i$ has zero conditional expectation, and thus $\hat{\beta}_1$ is unbiased.
- Else, $u$ is correlated with $x$, and then $\mathrm{plim}(\hat{\beta}_1) = \beta_1 + \frac{\mathrm{Cov}(x, u)}{\mathrm{Var}(x)} \neq \beta_1$, which means $\hat{\beta}_1$ is biased (indeed inconsistent), no matter how large the sample.
4. WLS (GLS)
Refer to the previous Econometrics I notes.
5. Hypothesis Testing
5.1 T-test in regression
Suppose one is fitting the model
$$y_i = \alpha + \beta x_i + \varepsilon_i.$$
We want to test the null hypothesis that the slope $\beta$ is equal to some specified value $\beta_0$. Let $\hat{\alpha}$ and $\hat{\beta}$ be the least-squares estimators. Then
$$t = \frac{\hat{\beta} - \beta_0}{\mathrm{SE}(\hat{\beta})},$$
where the standard error of the slope coefficient is
$$\mathrm{SE}(\hat{\beta}) = \frac{\sqrt{\frac{1}{n-2}\sum_{i=1}^n \hat{\varepsilon}_i^2}}{\sqrt{\sum_{i=1}^n (x_i - \bar{x})^2}}.$$
Thus, under the null hypothesis, $t$ has a Student's t distribution with $n - 2$ degrees of freedom. Another way to determine the $t$ statistic (for $\beta_0 = 0$) is
$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}},$$
where $r$ is the Pearson correlation coefficient between $x$ and $y$.
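A short sketch (synthetic data) computing the $t$ statistic both ways for $H_0: \beta = 0$ and confirming they coincide:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 40
x = rng.normal(size=n)
y = 0.3 * x + rng.normal(size=n)

# Least-squares fit and standard-error-based t statistic.
X = np.column_stack([np.ones(n), x])
a_hat, b_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - a_hat - b_hat * x
se_b = np.sqrt(resid @ resid / (n - 2)) / np.sqrt(np.sum((x - x.mean()) ** 2))
t_se = b_hat / se_b

# Correlation-based t statistic.
r = np.corrcoef(x, y)[0, 1]
t_r = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)

print(np.allclose(t_se, t_r))  # True
```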
5.2 F-test in regression
Consider two models, 1 and 2, where model 1 is the restricted model and model 2 is the unrestricted one. That is, model 1 has $p_1$ parameters and model 2 has $p_2$ parameters, where $p_1 < p_2$, and for any choice of parameters in model 1, the same regression curve can be achieved by some choice of the parameters of model 2.
One common context is deciding whether a model with more explanatory variables (unrestricted model) fits the data significantly better than a naive model (restricted model). Another common context is deciding whether there is a structural break in the data: the restricted model uses all data in one regression while the unrestricted model uses separate regressions for two different subsets of the data. The second application context is known as the Chow test.
If there are $n$ data points to estimate the parameters of both models, then the F statistic is given by
$$F = \frac{(\mathrm{RSS}_1 - \mathrm{RSS}_2) / (p_2 - p_1)}{\mathrm{RSS}_2 / (n - p_2)},$$
where $\mathrm{RSS}_i$ is the residual sum of squares of model $i$. Under the null hypothesis that model 2 does not provide a significantly better fit than model 1 (that is, the extra explanatory variables' coefficients are all zero), $F$ has an $F(p_2 - p_1, \; n - p_2)$ distribution. The null hypothesis is rejected if the $F$ calculated from the data is greater than the critical value of the F distribution for some desired false-rejection probability (e.g. 0.05).
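A sketch (synthetic data; scipy is assumed to be available for the p-value) testing whether extra regressors jointly improve the fit:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n = 120
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 0.8 * x1 + 0.4 * x2 + rng.normal(size=n)  # x3 is irrelevant

def rss(X, y):
    """Residual sum of squares from regressing y on the columns of X."""
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return resid @ resid

X1 = np.column_stack([np.ones(n), x1])             # restricted: p1 = 2
X2 = np.column_stack([np.ones(n), x1, x2, x3])     # unrestricted: p2 = 4
p1, p2 = X1.shape[1], X2.shape[1]

F = ((rss(X1, y) - rss(X2, y)) / (p2 - p1)) / (rss(X2, y) / (n - p2))
p_value = stats.f.sf(F, p2 - p1, n - p2)
print(F, p_value)  # x2 matters, so the null should be rejected
```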