FDSI-5 Model Selection

Intro

The idea of maximum likelihood estimation (MLE) is to choose the parameter values that maximize the likelihood of the observed data. When we compare different models, we can also compare their maximized likelihoods, as long as the data being fit remain exactly the same.

However, if we select among candidate models only by maximizing the likelihood, we will be led to overfit the sample, since a more complex model (more parameters, more flexibility) will fit the data at least as well as any simpler model nested within it.

Instead, we should also weigh other factors, such as how well the model captures the tails of the distribution, and criteria such as the AIC.

Alternatives to Lognormal Pricing Models

Let $P_t$ denote the closing price of some equity on day $t$.

Some theoretical results in finance assume the lognormal pricing model, in which $P_t = P_{t-1} R_t$, where $R_t$ has the lognormal($\mu, \sigma^2$) distribution, i.e. $\log R_t \sim N(\mu, \sigma^2)$. It is further assumed that $R_1, R_2, \ldots$ are independent.

This model is, however, known not to fit real data well. In particular, log returns have heavier tails than the normal distribution.
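As a minimal sketch of what the lognormal pricing model implies, the snippet below simulates prices with independent normal log returns and then recovers those log returns from the price series. The parameter values (`mu`, `sigma`, the starting price, the number of days) and the helper names are illustrative assumptions, not values from these notes. Under the model the excess kurtosis of the log returns should be close to zero; real log returns typically show clearly positive excess kurtosis, which is the heavy-tail symptom described above.

```python
import numpy as np
from scipy import stats


def simulate_lognormal_prices(p0, mu, sigma, n_days, rng):
    """Simulate prices whose daily log returns are i.i.d. N(mu, sigma^2)."""
    log_r = rng.normal(mu, sigma, size=n_days)
    prices = p0 * np.exp(np.cumsum(log_r))
    return np.concatenate(([p0], prices))


def log_returns(prices):
    """Log returns log(P_t / P_{t-1}) computed from a price series."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))


rng = np.random.default_rng(0)
# Hypothetical parameter values chosen only for illustration.
prices = simulate_lognormal_prices(p0=100.0, mu=0.0005, sigma=0.01,
                                   n_days=5000, rng=rng)
r = log_returns(prices)

# Fisher (excess) kurtosis: 0 for a normal distribution.
excess_kurt = stats.kurtosis(r)
print(f"mean={r.mean():.5f}  sd={r.std(ddof=1):.5f}  excess kurtosis={excess_kurt:.3f}")
```

Applying `log_returns` and `stats.kurtosis` to an actual closing-price series is the quickest way to see how far the data depart from this model.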

The Generalized Error Distribution (GED)

(Also called the generalized normal distribution.)

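Random variable $X$ has the GED with mean $\mu$, scale $\alpha$, and shape $\beta$ if its density is

$$f(x) = \frac{\beta}{2\alpha\,\Gamma(1/\beta)} \exp\left(-\left(\frac{|x-\mu|}{\alpha}\right)^{\beta}\right),$$

where $\alpha > 0$ and $\beta > 0$.

Remarks:

When $\beta = 2$, this is $N(\mu, \alpha^2/2)$. When $\beta = 1$, this is the double exponential or Laplace distribution.

The distribution is symmetric around $\mu$.

The package scipy.stats includes this distribution as gennorm.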
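The special cases can be checked numerically with `scipy.stats.gennorm`, which takes the shape as its first argument plus `loc` and `scale`. The sketch below assumes illustrative values for the location and scale (named `mu` and `alpha` here). In scipy's parameterization, shape 2 coincides with a normal distribution whose standard deviation is `scale / sqrt(2)`, and shape 1 coincides with a Laplace distribution with the same scale.

```python
import numpy as np
from scipy import stats

# Illustrative (assumed) location and scale values.
mu, alpha = 0.5, 2.0
x = np.linspace(-6.0, 7.0, 201)

# Shape 2: normal with standard deviation alpha / sqrt(2).
ged_normal = stats.gennorm(2.0, loc=mu, scale=alpha)
normal = stats.norm(loc=mu, scale=alpha / np.sqrt(2.0))

# Shape 1: double exponential (Laplace) with the same scale.
ged_laplace = stats.gennorm(1.0, loc=mu, scale=alpha)
laplace = stats.laplace(loc=mu, scale=alpha)

max_gap_normal = np.max(np.abs(ged_normal.pdf(x) - normal.pdf(x)))
max_gap_laplace = np.max(np.abs(ged_laplace.pdf(x) - laplace.pdf(x)))
print(max_gap_normal, max_gap_laplace)

# A smaller shape value puts more probability in the tails.
cutoff = mu + 3.0 * alpha
print(ged_laplace.sf(cutoff), ged_normal.sf(cutoff))
```

The last line shows why the GED is a candidate for log returns: lowering the shape below 2 thickens the tails relative to the normal case.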

The Nonstandard t-distribution

Assume $T$ has the (standard) t-distribution with $\nu$ degrees of freedom. Then define

$$X = \mu + \sigma T,$$

where $\sigma > 0$.

This distribution has the “shape” of the t-distribution, but has mean $\mu$ (when $\nu > 1$) and scale $\sigma$.
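A location-scale t-distribution of this kind is available through the `loc` and `scale` arguments of `scipy.stats.t`. The sketch below uses assumed values for the degrees of freedom, location, and scale, and checks two facts: shifting and scaling standard t draws reproduces the same distribution, and the variance is the squared scale times `df / (df - 2)` when `df > 2`, so the scale is not the standard deviation.

```python
import numpy as np
from scipy import stats

# Illustrative (assumed) parameter values.
df, mu, sigma = 5.0, 0.001, 0.02

# Nonstandard t: standard t shifted by mu and scaled by sigma.
dist = stats.t(df, loc=mu, scale=sigma)

# The same distribution built by hand from standard t draws.
rng = np.random.default_rng(1)
t_draws = stats.t.rvs(df, size=200_000, random_state=rng)
x_draws = mu + sigma * t_draws

# Variance of the nonstandard t (finite only when df > 2).
theoretical_var = sigma**2 * df / (df - 2.0)

print(f"mean: exact={dist.mean():.5f}  simulated={x_draws.mean():.5f}")
print(f"var:  exact={dist.var():.6f}  formula={theoretical_var:.6f}")

# Density check: f_X(x) = f_T((x - mu) / sigma) / sigma.
grid = np.linspace(-0.1, 0.1, 41)
by_hand = stats.t.pdf((grid - mu) / sigma, df) / sigma
```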

Remarks:

Skewed Generalized t Distribution

(From Theodossiou, “Financial Data and the Skewed Generalized T Distribution”.)

This distribution always has its mode (pdf peak) at zero, but can be skewed (asymmetric).

There are four parameters:

$\sigma^2$, the variance,

$k$, controls the height at the peak,

$n$, controls the probability in the tails,

$\lambda$, controls the amount of skew.

When $k = 2$ and $\lambda = 0$, this matches the nonstandard t distribution with $n$ degrees of freedom, variance $\sigma^2$, and $\mu = 0$.
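To illustrate the roles of the peak, tail, and skew parameters, here is a rough sketch of a skewed generalized t density with its mode at zero. It is not the exact form from the paper: the function `sgt_pdf` is hypothetical, and its `scale` argument is a plain scale parameter rather than the standard deviation, so the variance normalization of the original is omitted. Under that assumption, setting the peak parameter to 2 and the skew to 0 gives a t density with `n` degrees of freedom and scale `scale / sqrt(2)`, which the code checks numerically.

```python
import numpy as np
from scipy import integrate, stats
from scipy.special import beta as beta_fn


def sgt_pdf(x, scale, k, n, lam):
    """Sketch of a skewed generalized t density with mode at zero.

    ASSUMPTION: `scale` is a scale parameter, not the standard deviation,
    so this is a simplified parameterization, not the paper's exact form.
    k > 0 controls the peak, n > 0 the tails, -1 < lam < 1 the skew.
    """
    x = np.asarray(x, dtype=float)
    q = n / k
    # Different effective scale on each side of zero creates the skew.
    side_scale = scale * (1.0 + lam * np.sign(x))
    norm_const = k / (2.0 * scale * q ** (1.0 / k) * beta_fn(1.0 / k, q))
    core = 1.0 + np.abs(x) ** k / (q * side_scale ** k)
    return norm_const * core ** (-(n + 1.0) / k)


# Illustrative (assumed) parameter values.
scale, k, n, lam = 1.0, 2.0, 5.0, 0.3


def f(u):
    return float(sgt_pdf(u, scale, k, n, lam))


# The density should integrate to one, with unequal mass on each side.
left_mass, _ = integrate.quad(f, -np.inf, 0.0)
right_mass, _ = integrate.quad(f, 0.0, np.inf)
print(f"left mass={left_mass:.4f}  right mass={right_mass:.4f}")

# Symmetric case with k = 2: a t density with n degrees of freedom.
grid = np.linspace(-4.0, 4.0, 81)
sym = sgt_pdf(grid, scale, k=2.0, n=n, lam=0.0)
t_ref = stats.t.pdf(grid, df=n, scale=scale / np.sqrt(2.0))
skewed = sgt_pdf(grid, scale, k, n, lam)
```

In this sketch a positive skew parameter stretches the right side, so more than half of the probability lies above zero even though the peak stays at zero.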

AIC

Models with more parameters have greater flexibility to fit genuine features of the observed data, but also its artifacts. Hence, the risk of overfitting increases greatly as we consider models with a larger number of parameters (and more complexity).

Akaike (1973) proposed a simple way to balance likelihood and model complexity, and this has proven to be a very valuable tool for model selection. Define

$$\mathrm{AIC} = -2 \log L(\hat{\theta}) + 2p,$$

where $L(\hat{\theta})$ is the maximized likelihood and $p$ is the number of parameters in the model. We seek the model that has the smallest value of AIC.
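A minimal sketch of AIC-based comparison, using the standard form of the criterion (twice the number of parameters minus twice the maximized log-likelihood). The data are simulated from a t-distribution with 3 degrees of freedom purely for illustration, and the helper names `aic` and `fit_and_score` are hypothetical. Each candidate is fit by maximum likelihood with scipy's generic `fit` method on exactly the same data, and the number of fitted parameters is taken from the length of the returned tuple.

```python
import numpy as np
from scipy import stats


def aic(log_likelihood, n_params):
    """AIC = 2 * (number of parameters) - 2 * (maximized log-likelihood)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def fit_and_score(dist, data):
    """Fit a scipy.stats distribution by MLE; return (params, loglik, AIC)."""
    params = dist.fit(data)
    loglik = float(np.sum(dist.logpdf(data, *params)))
    return params, loglik, aic(loglik, len(params))


# Simulated heavy-tailed "returns" (an assumption, not real market data).
rng = np.random.default_rng(42)
data = stats.t.rvs(3, size=2000, random_state=rng)

candidates = {"normal": stats.norm, "GED": stats.gennorm, "t": stats.t}
results = {name: fit_and_score(dist, data) for name, dist in candidates.items()}

# Report the candidates from smallest (preferred) to largest AIC.
for name in sorted(results, key=lambda key: results[key][2]):
    params, loglik, score = results[name]
    print(f"{name:>6}: p={len(params)}  loglik={loglik:.1f}  AIC={score:.1f}")

best = min(results, key=lambda key: results[key][2])
```

Because the penalty is only two units per parameter, a heavier-tailed model is preferred here whenever its extra shape parameter raises the log-likelihood by more than one.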