
Ch12: Serial Correlation


Serial Correlation

1. Time Series Data

Classical Assumptions about Time Series Data:
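  • TS.1 The stochastic process $\{(x_{t1}, \dots, x_{tk}, y_t): t = 1, \dots, n\}$:
    • follows the linear model: $y_t = \beta_0 + \beta_1 x_{t1} + \dots + \beta_k x_{tk} + u_t$
  • TS.2 No perfect collinearity
  • TS.3 Zero conditional mean: $E(u_t \mid X) = 0$, $t = 1, \dots, n$
    • where $X$ is the explanatory variables for all time periods.
    • It means both $E(u_t \mid x_t) = 0$ and $E(u_t \mid x_s) = 0$ for all $s \neq t$.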
  • Note that time-series data often violate the assumption of random sampling.
Under assumptions TS.1, TS.2 and TS.3, the OLS estimators are unbiased and consistent.

2. Variance of OLS estimator

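for model $y_t = \beta_0 + \beta_1 x_t + u_t$
the OLS estimator is $\hat\beta_1 = \beta_1 + \frac{\sum_{t=1}^n (x_t - \bar x) u_t}{\sum_{t=1}^n (x_t - \bar x)^2}$
thus, $Var(\hat\beta_1 \mid X) = \frac{Var\left(\sum_{t=1}^n (x_t - \bar x) u_t \mid X\right)}{SST_x^2}$, where $SST_x = \sum_{t=1}^n (x_t - \bar x)^2$.
In general, the numerator is $\sum_{t=1}^n (x_t - \bar x)^2 Var(u_t \mid X) + \sum_{t \neq s} (x_t - \bar x)(x_s - \bar x) Cov(u_t, u_s \mid X)$
  • Under random sampling, different observations are independent of each other, thus $Cov(u_t, u_s \mid X) = 0$ for $t \neq s$.
  • Then we can use the robust se formula: $se(\hat\beta_1) = \frac{\sqrt{\sum_{t=1}^n (x_t - \bar x)^2 \hat u_t^2}}{SST_x}$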

3. Serial Correlation

No serial correlation assumption: $Corr(u_t, u_s \mid X) = 0$ for all $t \neq s$
Or $Cov(u_t, u_s \mid X) = 0$ for all $t \neq s$
The assumption is often violated for time-series data.

4. AR(1) Serial Correlation

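For the simple regression model: $y_t = \beta_0 + \beta_1 x_t + u_t$
Assume that $u_t = \rho u_{t-1} + e_t$
where $|\rho| < 1$, and $e_t$ are i.i.d. with $E(e_t) = 0$ and $Var(e_t) = \sigma_e^2$. This is called an autoregressive process of order one (AR(1)).
Properties of the AR(1): $u_t = e_t + \rho e_{t-1} + \rho^2 e_{t-2} + \dots = \sum_{j=0}^{\infty} \rho^j e_{t-j}$
Thus, $Var(u_t) = \sum_{j=0}^{\infty} \rho^{2j} \sigma_e^2 = \frac{\sigma_e^2}{1 - \rho^2}$ (because $e_t$ are i.i.d.)
Besides, $Cov(u_t, u_{t+j}) = \rho^j Var(u_t)$
And $Corr(u_t, u_{t+j}) = \rho^j$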
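As a rough numerical check of the AR(1) properties, the sketch below simulates $u_t = \rho u_{t-1} + e_t$ and compares the sample autocorrelation at lag $j$ with $\rho^j$. The normal distribution for $e_t$, the burn-in length, and the function names are assumptions made for illustration only.

```python
import random

def simulate_ar1(n, rho, sigma_e=1.0, burn_in=500, seed=0):
    """Simulate u_t = rho * u_{t-1} + e_t with e_t ~ i.i.d. N(0, sigma_e^2) (assumed normal)."""
    rng = random.Random(seed)
    u_prev = 0.0
    path = []
    for t in range(n + burn_in):
        u_prev = rho * u_prev + rng.gauss(0.0, sigma_e)
        if t >= burn_in:  # drop the burn-in so the arbitrary start value does not matter
            path.append(u_prev)
    return path

def sample_var(u):
    """Sample variance of the series (divides by n)."""
    mean = sum(u) / len(u)
    return sum((v - mean) ** 2 for v in u) / len(u)

def sample_autocorr(u, j):
    """Sample correlation between u_t and u_{t+j}."""
    n = len(u)
    mean = sum(u) / n
    cov = sum((u[t] - mean) * (u[t + j] - mean) for t in range(n - j)) / n
    return cov / sample_var(u)

rho = 0.6
u = simulate_ar1(n=20000, rho=rho)
for j in (1, 2, 3):
    # second column: sample autocorrelation, third column: theoretical rho^j
    print(j, round(sample_autocorr(u, j), 3), round(rho ** j, 3))
```

With a long simulated series the two columns should be close, and the sample variance should be near $\sigma_e^2 / (1 - \rho^2)$.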
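Further assume that $\bar x = 0$ and homoskedasticity, that is $Var(u_t \mid X) = \sigma^2$ (we want to analyze the effect of serial correlation qualitatively, so we add assumptions that simplify the derivation), then $Var(\hat\beta_1) = \frac{\sigma^2}{SST_x} + 2 \frac{\sigma^2}{SST_x^2} \sum_{t=1}^{n-1} \sum_{j=1}^{n-t} \rho^j x_t x_{t+j}$, where $SST_x = \sum_{t=1}^n x_t^2$.
Consequences of ignoring or not fixing serial correlation:
  1. The OLS estimators are still unbiased and consistent.
  2. The conventional variance ($\sigma^2 / SST_x$) and se ($\hat\sigma / \sqrt{SST_x}$) formulas are no longer valid.
  3. For AR(1), in most applications $\rho > 0$ and $\sum_{t=1}^{n-1} \sum_{j=1}^{n-t} \rho^j x_t x_{t+j} > 0$. Then the conventional se formula underestimates the true se of $\hat\beta_1$, and in practice this bias can be sizable.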
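To get a feel for the size of the bias, the sketch below evaluates both variance expressions for a fixed regressor: the conventional $\sigma^2 / SST_x$ and the AR(1) variance $\sigma^2 / SST_x + 2 (\sigma^2 / SST_x^2) \sum_{t=1}^{n-1} \sum_{j=1}^{n-t} \rho^j x_t x_{t+j}$, assuming $\bar x = 0$ and homoskedasticity. The cosine regressor, $\sigma^2 = 1$ and $\rho = 0.5$ are hypothetical choices, picked only because the regressor has mean zero and is positively correlated over time.

```python
import math

def conventional_var(x, sigma2):
    """Conventional variance sigma^2 / SST_x (x is assumed to have mean zero)."""
    sst = sum(v * v for v in x)
    return sigma2 / sst

def ar1_var(x, sigma2, rho):
    """Variance of the OLS slope under AR(1) errors, x with mean zero, homoskedasticity."""
    n = len(x)
    sst = sum(v * v for v in x)
    cross = 0.0
    for t in range(n - 1):
        for j in range(1, n - t):
            cross += (rho ** j) * x[t] * x[t + j]
    return sigma2 / sst + 2.0 * sigma2 / sst ** 2 * cross

# Hypothetical regressor: one full cosine cycle, so the mean is zero
# and neighbouring observations are positively correlated.
n = 50
x = [math.cos(2.0 * math.pi * t / n) for t in range(n)]

se_conventional = math.sqrt(conventional_var(x, sigma2=1.0))
se_ar1 = math.sqrt(ar1_var(x, sigma2=1.0, rho=0.5))
print(se_conventional, se_ar1)
```

With $\rho = 0$ the two expressions coincide; with $\rho > 0$ and a persistent regressor the conventional value is the smaller one.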

5. Fix Serial Correlation

Two solutions:
  1. Use another estimator, feasible GLS (FGLS), whose infeasible counterpart GLS is BLUE, and find the correct standard errors for FGLS.
  2. Use OLS and correct the standard error formula.
Solution 1: FGLS
  • Model: $y_t = \beta_0 + \beta_1 x_t + u_t$
  • Serial Correlation - AR(1): $u_t = \rho u_{t-1} + e_t$, $|\rho| < 1$
    • where $e_t$ is i.i.d. and $E(e_t) = 0$
  • Assumptions: TS.1 - TS.3 and $Var(e_t \mid X) = \sigma_e^2$.
Transform the regression: $y_t - \rho y_{t-1} = (1 - \rho) \beta_0 + \beta_1 (x_t - \rho x_{t-1}) + e_t$, for $t \geq 2$
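FGLS steps:
  1. Estimate the model $y_t = \beta_0 + \beta_1 x_t + u_t$ using OLS and obtain the OLS residuals $\hat u_t$.
  2. Use OLS to estimate $\hat u_t = \rho \hat u_{t-1} + e_t$ and obtain $\hat\rho$.
  3. Calculate $\tilde y_t = y_t - \hat\rho y_{t-1}$ and $\tilde x_t = x_t - \hat\rho x_{t-1}$, then use OLS to regress $\tilde y_t$ on $\tilde x_t$.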
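The three FGLS steps can be sketched in plain Python for the simple regression case. This is a minimal illustration, not a full implementation: the helper names, the simulated data ($\beta_0 = 1$, $\beta_1 = 2$, $\rho = 0.7$) and the choice to regress the residual on its lag without an intercept are all assumptions. The first observation is dropped by the quasi-differencing.

```python
import random

def ols_simple(x, y):
    """OLS intercept and slope for y = b0 + b1 * x + error."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def fgls_ar1(x, y):
    n = len(x)
    # Step 1: OLS on the original model, keep the residuals.
    b0, b1 = ols_simple(x, y)
    u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Step 2: regress u_t on u_{t-1} (no intercept, an assumption) to get rho_hat.
    num = sum(u[t] * u[t - 1] for t in range(1, n))
    den = sum(u[t - 1] ** 2 for t in range(1, n))
    rho_hat = num / den
    # Step 3: quasi-difference the data and run OLS on the transformed variables.
    y_q = [y[t] - rho_hat * y[t - 1] for t in range(1, n)]
    x_q = [x[t] - rho_hat * x[t - 1] for t in range(1, n)]
    a0, b1_fgls = ols_simple(x_q, y_q)
    # The transformed intercept estimates (1 - rho) * beta0, so rescale it.
    return a0 / (1.0 - rho_hat), b1_fgls, rho_hat

# Simulated data (hypothetical values): beta0 = 1, beta1 = 2, rho = 0.7.
rng = random.Random(42)
n = 2000
x, u = [0.0], [0.0]
for _ in range(n - 1):
    x.append(0.5 * x[-1] + rng.gauss(0.0, 1.0))
    u.append(0.7 * u[-1] + rng.gauss(0.0, 1.0))
y = [1.0 + 2.0 * xt + ut for xt, ut in zip(x, u)]

b0_fgls, b1_fgls, rho_hat = fgls_ar1(x, y)
print(b0_fgls, b1_fgls, rho_hat)
```

The standard errors reported by the OLS run on the transformed data are the ones to use for inference, since the transformed error $e_t$ is serially uncorrelated under the AR(1) assumption.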
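For higher orders: $u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \dots + \rho_q u_{t-q} + e_t$
The transformation requires calculating $\tilde y_t = y_t - \rho_1 y_{t-1} - \dots - \rho_q y_{t-q}$ and $\tilde x_t = x_t - \rho_1 x_{t-1} - \dots - \rho_q x_{t-q}$
To estimate $\rho_1, \dots, \rho_q$, we use the OLS residuals by regressing $\hat u_t$ on $\hat u_{t-1}, \dots, \hat u_{t-q}$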
Solution 2: Serial Correlation-Robust Inference after OLS
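Directly estimate $se(\hat\beta_1)$ for the OLS estimator.
Consider the multiple regression model: $y_t = \beta_0 + \beta_1 x_{t1} + \dots + \beta_k x_{tk} + u_t$ (1)
It can be proved that $Avar(\hat\beta_1) = \left(\sum_{t=1}^n E(r_t^2)\right)^{-2} Var\left(\sum_{t=1}^n r_t u_t\right)$
where $r_t$ is the error term in $x_{t1} = \delta_0 + \delta_2 x_{t2} + \dots + \delta_k x_{tk} + r_t$ (2)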
HAC - Heteroskedasticity and auto-correlation consistent se
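Let $\hat r_t$ denote the residuals from regressing $x_{t1}$ on all other independent variables, and $\hat u_t$ the OLS residuals from regressing $y_t$ on all $x_{tj}$.
Define: $\hat a_t = \hat r_t \hat u_t$ and $\hat v = \sum_{t=1}^n \hat a_t^2 + 2 \sum_{h=1}^g \left[1 - \frac{h}{g+1}\right] \left(\sum_{t=h+1}^n \hat a_t \hat a_{t-h}\right)$ (3)
Then $se(\hat\beta_1) = \left[se_{\text{conv}}(\hat\beta_1) / \hat\sigma\right]^2 \sqrt{\hat v}$ (4)
where $se_{\text{conv}}(\hat\beta_1)$ is the conventional standard error of $\hat\beta_1$ and $\hat\sigma$ is the standard error of the regression, $\hat\sigma = \sqrt{SSR / (n - k - 1)}$, with $SSR$ the sum of squared OLS residuals.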
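  • We use $g$ to capture how much serial correlation we are allowing in computing the standard error.
  • For annual data, choose $g = 1$ or $g = 2$.
  • Use a larger $g$ for larger sample size.
  • When $g = 1$: $\hat v = \sum_{t=1}^n \hat a_t^2 + \sum_{t=2}^n \hat a_t \hat a_{t-1}$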
This formula is robust to arbitrary serial correlation and arbitrary heteroskedasticity. So people sometimes call this heteroskedasticity and auto-correlation consistent, or HAC, standard errors.
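HAC steps:
  1. Estimate (1) by OLS, get the conventional standard error $se_{\text{conv}}(\hat\beta_1)$, $\hat\sigma$, and the OLS residuals $\hat u_t$.
  2. Compute the residuals $\hat r_t$ from the regression of (2). Then, form $\hat a_t = \hat r_t \hat u_t$ for each $t$.
  3. For the choice of $g$, compute $\hat v$ as in (3).
  4. Compute $se(\hat\beta_1)$ from (4).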
STATA command: newey Y X, lag(g). With lag(0), it is equivalent to using heteroskedasticity-robust standard errors.
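The HAC steps can be sketched for the simple regression case, where the residual $\hat r_t$ from regressing $x_t$ on a constant is just $x_t - \bar x$ and the conventional standard error divided by $\hat\sigma$, squared, reduces to $1 / SST_x$. The function names and the small data set below are hypothetical, and no finite-sample adjustment is applied, so the numbers need not match a statistical package exactly.

```python
import math

def simple_ols(x, y):
    """Return slope, demeaned regressor r_t, residuals u_t, and SST_x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    r = [xi - xbar for xi in x]          # residuals from regressing x on a constant
    sst = sum(ri ** 2 for ri in r)
    b1 = sum(ri * yi for ri, yi in zip(r, y)) / sst
    b0 = ybar - b1 * xbar
    u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    return b1, r, u, sst

def newey_west_se(x, y, g):
    """HAC standard error of the slope with truncation lag g (weights 1 - h/(g+1))."""
    b1, r, u, sst = simple_ols(x, y)
    a = [ri * ui for ri, ui in zip(r, u)]
    v = sum(at ** 2 for at in a)
    for h in range(1, g + 1):
        weight = 1.0 - h / (g + 1.0)
        v += 2.0 * weight * sum(a[t] * a[t - h] for t in range(h, len(a)))
    # In simple regression (conventional se / sigma_hat)^2 equals 1 / SST_x.
    return math.sqrt(v) / sst

def white_se(x, y):
    """Heteroskedasticity-robust standard error (no degrees-of-freedom adjustment)."""
    b1, r, u, sst = simple_ols(x, y)
    return math.sqrt(sum((ri * ui) ** 2 for ri, ui in zip(r, u))) / sst

# Hypothetical data, for illustration only.
x = [0.5, 1.1, 1.8, 2.0, 2.9, 3.1, 4.2, 4.8, 5.5, 6.1]
y = [1.2, 2.9, 4.1, 4.6, 7.2, 7.0, 9.9, 10.1, 12.4, 13.0]
print(newey_west_se(x, y, g=2), white_se(x, y))
```

Setting g to 0 drops every cross-product term, which is why the result then coincides with the heteroskedasticity-robust standard error.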

Spatial Correlation

1. Definition

Spatial correlation arises in data with a group structure, for example students from different schools, workers from different firms, or individuals from different provinces.
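For model: $y_{ig} = \beta_0 + \beta_1 x_g + u_{ig}$
Assume that $E(u_{ig} \mid x_g) = 0$. However, the errors $u_{ig}$ for observations within the same group $g$ are not independent: $Cov(u_{ig}, u_{jg}) = \rho \sigma_u^2$ for $i \neq j$
where $\rho$ is called the intraclass correlation coefficient.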
With spatial correlation:
  1. The OLS estimator is unbiased and consistent.
  2. The conventional se formula is no longer valid.
  3. The conventional standard error formula may over- or underestimate the true standard deviation of the OLS estimator (depending on whether observations from the same group are positively or negatively correlated with each other).

2. Fix Spatial Correlation

Solution 1: Use OLS estimators and cluster standard errors
The general idea is to model correlation of error terms within a group, and assume no correlation across groups. The formula is consistent if the number of groups gets large.
Number of clusters
  • Empirically, about 20 clusters are needed.
  • If we have fewer clusters:
    • Collect more data from more clusters.
    • Use the t distribution with G - k - 1 degrees of freedom rather than the standard normal distribution, where G is the number of clusters.
STATA command: reg Y X, cluster(gid), where gid is the group id within which we assume intraclass correlation exists.
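The idea behind clustered standard errors can be sketched for a simple regression: sum the products $(x_i - \bar x) \hat u_i$ within each group, square the group totals, and add them up, which allows arbitrary correlation inside a group and none across groups. The function name and data below are hypothetical, and this sketch applies no finite-sample correction, whereas statistical software typically does, so its output would differ slightly from a package's.

```python
import math
from collections import defaultdict

def cluster_se(x, y, gid):
    """OLS slope and cluster-robust standard error for y = b0 + b1 * x + u."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    r = [xi - xbar for xi in x]
    sst = sum(ri ** 2 for ri in r)
    b1 = sum(ri * yi for ri, yi in zip(r, y)) / sst
    b0 = ybar - b1 * xbar
    u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Sum (x_i - xbar) * u_i within each group, then square the group totals.
    score = defaultdict(float)
    for ri, ui, g in zip(r, u, gid):
        score[g] += ri * ui
    meat = sum(s ** 2 for s in score.values())
    return b1, math.sqrt(meat) / sst

# Hypothetical data: nine observations in three groups.
gid = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 5.0, 5.5, 6.0]
y = [2.1, 3.4, 4.2, 5.0, 6.3, 7.1, 11.5, 12.2, 13.9]
b1_hat, se_cluster = cluster_se(x, y, gid)
print(b1_hat, se_cluster)
```

Because the formula relies on averaging across groups, it is only reliable when the number of groups is reasonably large.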
Solution 2: Use Group Mean
Estimate: $\bar y_g = \beta_0 + \beta_1 x_g + \bar u_g$
by WLS using the group size as weights.
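A minimal sketch of the group-mean approach, assuming the regressor varies only at the group level: collapse the data to one row per group (group mean of $y$, the group's $x$, and the group size), then run WLS with the group sizes as weights. All names and data are hypothetical. In this setting the WLS slope on group means equals the OLS slope on the individual-level data, which the last lines check.

```python
def wls_simple(x, y, w):
    """Weighted least squares intercept and slope for y = b0 + b1 * x + error."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def group_mean_wls(x, y, gid):
    """Collapse to group means and run WLS weighted by group size."""
    groups = {}
    for xi, yi, g in zip(x, y, gid):
        entry = groups.setdefault(g, [xi, 0.0, 0])  # [group-level x, sum of y, size]
        entry[1] += yi
        entry[2] += 1
    xs = [e[0] for e in groups.values()]
    ybars = [e[1] / e[2] for e in groups.values()]
    sizes = [e[2] for e in groups.values()]
    return wls_simple(xs, ybars, sizes)

# Hypothetical data: x is constant within each group.
gid = [0, 0, 0, 1, 1, 2, 2, 2, 2]
x = [1.0, 1.0, 1.0, 2.0, 2.0, 4.0, 4.0, 4.0, 4.0]
y = [1.8, 2.4, 2.1, 4.2, 3.7, 8.3, 7.6, 8.1, 7.9]

b0_group, b1_group = group_mean_wls(x, y, gid)
# OLS on the individual-level data is WLS with all weights equal to one.
b0_micro, b1_micro = wls_simple(x, y, [1.0] * len(x))
print(b1_group, b1_micro)
```

The point of collapsing is inference rather than the point estimate: the regression on group means has one observation per group, so within-group correlation no longer distorts the standard errors.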
Generalize the method to models with microcovariates:
[Image: group-mean method extended to models with microcovariates]
