
Ch2: The Simple Regression Model


Interpretation and Estimation

Simple Regression Model
🎯
Derive the linear regression model in three ways. Though all three ways are mathematically identical (they yield the same coefficient formulas), they are conceptually different.

1. Descriptive Analysis

When such a model (relation) exists, we can describe it as

$$E(y \mid x) = \beta_0 + \beta_1 x$$

Since "for descriptive analysis, we can know the answer if we have enough data", we can derive the model using the sample analogue.
Define $u = y - \beta_0 - \beta_1 x$, thus $E(u \mid x) = 0$; therefore

$$E(u) = 0, \qquad E(xu) = 0$$

Assume that we observe a sample of size $n$, $\{(x_i, y_i): i = 1, \dots, n\}$, and carry out the sample analogue using the Method of Moments:

$$\frac{1}{n}\sum_{i=1}^{n}\hat{u}_i = 0, \qquad \frac{1}{n}\sum_{i=1}^{n} x_i \hat{u}_i = 0$$

where $\hat{u}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$; then we can derive the estimated coefficients of the model:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
โ˜๐Ÿป
In descriptive analysis, regression model depicts the expectation of conditional on . The model shows how the two variables are correlated. But it does not mean they have a causal relationship which needs more assumptions.
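The sample-analogue calculation can be sketched in a few lines of Python (a minimal illustration; the simulated data, coefficients, and seed below are all made up):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(size=200)  # hypothetical data

# Sample-analogue (Method of Moments) estimates
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# The two moment conditions hold (up to float error) for the fitted residuals
u_hat = y - b0 - b1 * x
assert abs(u_hat.mean()) < 1e-10
assert abs((x * u_hat).mean()) < 1e-10
```

The two asserts are exactly the sample versions of $E(u) = 0$ and $E(xu) = 0$.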

2. Causal Estimation

We want a model of how the data are generated. For descriptive analysis, we start with the data and ask which model could help summarize it (the coefficients). For causal analysis, we start with the model and then use it to imply what the data are supposed to look like. ("A causal relationship is always derived from a model with assumptions.")
Define the simple regression model (Causal Model):

$$y = \beta_0 + \beta_1 x + u$$

  • $\beta_0$ and $\beta_1$ are unknown numbers in nature that we want to uncover.
  • We choose $x$ while nature chose $u$ in a way that is unrelated to our choice of $x$.
Assumption: Zero Conditional Mean: $E(u \mid x) = E(u) = 0$
  1. $E(u \mid x)$ is constant, not varying with $x$.
  2. Its value is 0. If not, we can use normalization (absorb $E(u)$ into $\beta_0$) to make it so.
With this assumption, we can prove that

$$E(y \mid x) = \beta_0 + \beta_1 x$$

which means that $\beta_1$ reflects the causal effect of $x$ on $y$.
Besides, with this assumption, we can also get $E(u) = 0$ and $E(xu) = 0$.
As before, we can derive that

$$\beta_1 = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)}, \qquad \beta_0 = E(y) - \beta_1 E(x)$$
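The formula $\beta_1 = \mathrm{Cov}(x, y)/\mathrm{Var}(x)$ can be checked on simulated data where the causal model and the zero-conditional-mean error are constructed by hand (all numbers below are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.uniform(0, 10, size=n)
u = rng.normal(size=n)        # drawn independently of x, so E(u | x) = 0
y = 3.0 + 1.5 * x + u         # hypothetical causal model

# Population-analogue formulas
b1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
# b1 and b0 recover the causal coefficients 1.5 and 3.0 up to sampling noise
```

Because $u$ is generated independently of $x$, the covariance formula recovers the causal coefficient; if $u$ were correlated with $x$, it would not.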

3. Forecasting

Using the model, we make predictions. Define the predicted value as $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$.
For forecasting problems, we can "know the answer if the time is long enough and we have the data." In other words, we would know the actual data $y_i$. Assuming we know it now, the distance between the actual data and the predicted data is $\hat{u}_i = y_i - \hat{y}_i$, and the distance over all data points (which evaluates the quality of the prediction model using the data in hand) is

$$Q(\hat{\beta}_0, \hat{\beta}_1) = \sum_{i=1}^{n}(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$$

Intuitively, we should choose $\hat{\beta}_0$ and $\hat{\beta}_1$ to minimize the function $Q$.
The first order conditions:

$$\sum_{i=1}^{n}(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0, \qquad \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$

Again, we derive the same coefficients:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
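That the closed-form coefficients really minimize $Q$ can be sanity-checked numerically (a sketch with made-up data):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 0.5 - 1.2 * x + rng.normal(scale=0.3, size=50)  # made-up data

def Q(b0, b1):
    """Sum of squared prediction errors."""
    return np.sum((y - b0 - b1 * x) ** 2)

# Closed-form OLS solution from the first-order conditions
b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

# Any small perturbation of either coefficient increases Q
for eps in (0.01, -0.01):
    assert Q(b0_hat + eps, b1_hat) > Q(b0_hat, b1_hat)
    assert Q(b0_hat, b1_hat + eps) > Q(b0_hat, b1_hat)
```

Since $Q$ is a convex quadratic in the two coefficients, the first-order conditions characterize the global minimum, which is what the perturbation check confirms.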

Properties of Simple Regression Model

Regardless of the interpretation of the simple regression model, we derive the same coefficients.
Serving as the population regression model, it implies that $E(y \mid x) = \beta_0 + \beta_1 x$ (Population Regression Model).
Serving as the sample regression model, it minimizes the sum of squared residuals $\sum_{i=1}^{n}\hat{u}_i^2$ (Sample Regression Model).
Besides, the model (OLS estimators) has the following properties.

1. Goodness of Fit

Define:
  • Total sum of squares (SST): $\mathrm{SST} = \sum_{i=1}^{n}(y_i - \bar{y})^2$.
  • Explained sum of squares (SSE): $\mathrm{SSE} = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2$.
  • Residual sum of squares (SSR): $\mathrm{SSR} = \sum_{i=1}^{n}\hat{u}_i^2$.
For OLS estimators,

$$\mathrm{SST} = \mathrm{SSE} + \mathrm{SSR}$$

Proof: write $y_i - \bar{y} = (\hat{y}_i - \bar{y}) + \hat{u}_i$ and square; the cross term $\sum_i \hat{u}_i(\hat{y}_i - \bar{y})$ vanishes because $\sum_i \hat{u}_i = 0$ and $\sum_i x_i \hat{u}_i = 0$.
Define $R^2 = \mathrm{SSE}/\mathrm{SST} = 1 - \mathrm{SSR}/\mathrm{SST}$ to measure the variation in $y$ explained by the model.
  • $R^2$ measures how well the model fits the data (it equals the squared sample correlation between $y_i$ and $\hat{y}_i$). It does not say anything about the causal relationship between them.
  • $R^2$ is always between 0 and 1.
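The decomposition and both properties of $R^2$ can be verified numerically (made-up data):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=100)
y = 2.0 + 0.8 * x + rng.normal(size=100)  # made-up data

# OLS fit
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
u_hat = y - y_hat

sst = np.sum((y - y.mean()) ** 2)
sse = np.sum((y_hat - y.mean()) ** 2)
ssr = np.sum(u_hat ** 2)

assert np.isclose(sst, sse + ssr)                        # SST = SSE + SSR
r2 = sse / sst
assert np.isclose(r2, np.corrcoef(y, y_hat)[0, 1] ** 2)  # R^2 = corr(y, y_hat)^2
assert 0.0 <= r2 <= 1.0
```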

2. Expected Values and Variances

OLS estimators are determined by the sample, so different samples will produce different estimates. When we draw multiple samples and compute the estimates, they follow a probability distribution. Even without knowing the full distribution, we can still calculate their mean and variance.
More Assumptions
When we add more assumptions, the OLS estimators will have a desired feature ⇒ Unbiasedness
  1. SLR.1 Linear in Parameters: $y = \beta_0 + \beta_1 x + u$.
      • The model must be linear in the parameters, not necessarily in the variables: ✔️ $y = \beta_0 + \beta_1 x^2 + u$ while ❌ $y = \beta_0 + \beta_1^2 x + u$.
      • Note that it's okay if we regress a transformed variable such as $\log y$ on $\log x$, because we can change the model into one that is linear in parameters. When interpreting the partial effect, follow the functional form:
      | Model | Dependent | Independent | Interpretation of $\beta_1$ |
      | --- | --- | --- | --- |
      | Level-level | $y$ | $x$ | $\Delta y = \beta_1 \Delta x$ |
      | Level-log | $y$ | $\log(x)$ | $\Delta y = (\beta_1 / 100)\,\%\Delta x$ |
      | Log-level | $\log(y)$ | $x$ | $\%\Delta y = (100\beta_1)\,\Delta x$ |
      | Log-log | $\log(y)$ | $\log(x)$ | $\%\Delta y = \beta_1\,\%\Delta x$ |
  2. SLR.2 Random Sampling: we have a random sample $\{(x_i, y_i): i = 1, \dots, n\}$ following the population model.
      • Which means $y_i = \beta_0 + \beta_1 x_i + u_i$ for each $i$.
      • Time series data typically do not satisfy it.
  3. SLR.3 Sample Variation in the Explanatory Variable: $\sum_{i=1}^{n}(x_i - \bar{x})^2 > 0$.
      • i.e. the $x_i$ are not all the same; otherwise, the denominator of the OLS estimator will be 0.
  4. SLR.4 Zero Conditional Mean: $E(u \mid x) = 0$.
      • This is the condition using which we derive the OLS estimator.
      • This condition is critical to ensure a causal interpretation.
      • We should examine it case by case, sometimes using economic theories.
Expected Values of OLS estimators
With SLR.1–SLR.4, the unbiasedness of OLS can be proved. Writing

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})\,y_i}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \beta_1 + \frac{\sum_{i=1}^{n}(x_i - \bar{x})\,u_i}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

and taking expectations conditional on $\{x_i\}$, the second term vanishes by SLR.2 and SLR.4, so $E(\hat{\beta}_1) = \beta_1$; similarly $E(\hat{\beta}_0) = \beta_0$.
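Unbiasedness can be illustrated with a small Monte Carlo sketch (the design, sample size, and true coefficients below are arbitrary): averaging $\hat{\beta}_1$ over many samples drawn under SLR.1–SLR.4 recovers the true $\beta_1$.

```python
import numpy as np

rng = np.random.default_rng(4)
beta0, beta1, n = 1.0, 2.0, 30
x = rng.uniform(0, 5, size=n)   # fixed design across replications

estimates = []
for _ in range(20_000):
    u = rng.normal(size=n)      # E(u | x) = 0 (SLR.4)
    y = beta0 + beta1 * x + u
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    estimates.append(b1)

# The Monte Carlo average of b1 is close to the true beta1 = 2.0,
# even though any single estimate deviates from it
mean_b1 = np.mean(estimates)
```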
Variance of OLS estimators
Another assumption, SLR.5 Homoskedasticity ⇒ $\mathrm{Var}(u \mid x) = \sigma^2$.
Under SLR.1–SLR.5,

$$\mathrm{Var}(\hat{\beta}_1 \mid x) = \frac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \mathrm{Var}(\hat{\beta}_0 \mid x) = \frac{\sigma^2 \, n^{-1}\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

Proof: starting from $\hat{\beta}_1 = \beta_1 + \frac{\sum_i (x_i - \bar{x})\,u_i}{\sum_i (x_i - \bar{x})^2}$, conditional on $\{x_i\}$ the $u_i$ are independent with variance $\sigma^2$, so

$$\mathrm{Var}(\hat{\beta}_1 \mid x) = \frac{\sigma^2 \sum_i (x_i - \bar{x})^2}{\left(\sum_i (x_i - \bar{x})^2\right)^2} = \frac{\sigma^2}{\sum_i (x_i - \bar{x})^2}$$
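The variance formula for $\hat{\beta}_1$ can also be checked by simulation (all settings below are made up):

```python
import numpy as np

rng = np.random.default_rng(5)
sigma, n = 0.7, 40
x = rng.uniform(-1, 1, size=n)                   # fixed design
theory = sigma**2 / np.sum((x - x.mean()) ** 2)  # Var(b1 | x) under SLR.5

b1s = []
for _ in range(50_000):
    u = rng.normal(scale=sigma, size=n)  # homoskedastic errors (SLR.5)
    y = 1.0 + 2.0 * x + u
    b1s.append(np.sum((x - x.mean()) * (y - y.mean()))
               / np.sum((x - x.mean()) ** 2))

# The empirical variance of b1 across samples matches the formula
ratio = np.var(b1s) / theory
```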
Since $\sigma^2$ is unknown, we need to estimate it using the sample:

$$\hat{\sigma}^2 = \frac{1}{n-2}\sum_{i=1}^{n}\hat{u}_i^2 = \frac{\mathrm{SSR}}{n-2}$$

💡
When we calculate OLS, the residuals satisfy two conditions: $\sum_i \hat{u}_i = 0$ and $\sum_i x_i \hat{u}_i = 0$. So $\{\hat{u}_i\}$ has $n - 2$ degrees of freedom while $\{u_i\}$ has $n$ degrees of freedom. By adjusting the degrees of freedom, we can get an unbiased estimator of $\sigma^2$.
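The degrees-of-freedom adjustment can be illustrated with a small simulation (all settings made up): $\mathrm{SSR}/(n-2)$ averages to $\sigma^2$, while the naive $\mathrm{SSR}/n$ is biased downward.

```python
import numpy as np

rng = np.random.default_rng(6)
sigma, n = 1.0, 10                      # small n makes the bias visible
x = rng.uniform(0, 1, size=n)

unbiased, naive = [], []
for _ in range(50_000):
    u = rng.normal(scale=sigma, size=n)
    y = 0.5 + 1.0 * x + u
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    ssr = np.sum((y - b0 - b1 * x) ** 2)
    unbiased.append(ssr / (n - 2))      # degrees-of-freedom adjustment
    naive.append(ssr / n)               # no adjustment: biased downward

# Averages: the adjusted estimator centers on sigma^2 = 1.0,
# the naive one on (n-2)/n * sigma^2 = 0.8
mean_unbiased, mean_naive = np.mean(unbiased), np.mean(naive)
```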
