Techniques
Basic Idea
The basic idea of importance sampling is as follows: suppose we want to estimate $\theta = E_f[h(X)]$, where $X$ has pdf $f$. Then
$$\theta = \int h(x) f(x)\,dx = \int h(x)\frac{f(x)}{g(x)}\,g(x)\,dx = E_g\!\left[h(Y)\frac{f(Y)}{g(Y)}\right],$$
where $Y$ has pdf $g$ (with $g(x) > 0$ whenever $h(x)f(x) \neq 0$).
Here is an intuitive example: suppose $\theta = P(X > 3)$ and $f$ is the density of $N(0,1)$. A standard normal random variable has about a 99.9% probability of being smaller than 3, i.e. 99.9% of the observations generated contribute nothing to the Monte Carlo estimator. We can thus apply the importance sampling technique:
- take $g$ to be the density of $N(\mu, 1)$ for some $\mu > 0$, so that $1\{Y > 3\}$ is non-zero with a much higher chance
- then $\theta = E_g\!\left[1\{Y > 3\}\,\frac{f(Y)}{g(Y)}\right] = E_g\!\left[1\{Y > 3\}\, e^{-\mu Y + \mu^2/2}\right]$
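The example above can be sketched in a few lines of Python. This is an illustrative implementation, not from the original text: it estimates $P(X > 3)$ for $X \sim N(0,1)$ both by plain Monte Carlo and by importance sampling from $N(\mu, 1)$ with $\mu = 3$, using the likelihood ratio $f(y)/g(y) = e^{-\mu y + \mu^2/2}$ for the two normal densities.

```python
import math
import random

random.seed(0)

def mc_tail_prob(n: int) -> float:
    """Plain Monte Carlo estimate of P(X > 3) for X ~ N(0, 1).
    Almost every sample contributes 0, so the estimate is very noisy."""
    return sum(random.gauss(0.0, 1.0) > 3.0 for _ in range(n)) / n

def is_tail_prob(n: int, mu: float = 3.0) -> float:
    """Importance sampling estimate using Y ~ N(mu, 1).
    The likelihood ratio f(y)/g(y) between N(0,1) and N(mu,1)
    is exp(-mu*y + mu**2/2)."""
    total = 0.0
    for _ in range(n):
        y = random.gauss(mu, 1.0)
        if y > 3.0:
            total += math.exp(-mu * y + 0.5 * mu * mu)
    return total / n

mc_est = mc_tail_prob(1_000)     # typically 0.000-0.003: most samples wasted
est = is_tail_prob(100_000)      # close to the true value 1 - Phi(3) ~ 1.35e-3
```

With the proposal centered at the threshold, roughly half of the samples now land in the rare region, and the likelihood ratio reweights them back to the correct probability.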
Importance sampling is motivated by similar contexts, for example,
- in several contexts, standard simulation is very inefficient because most paths generated offer little information about the quantity of interest
- when a payoff function's threshold is very large or small
- in risk management, VaR and CVaR
Optimal Choice
The best choice for $g$ would be the one that minimizes the variance of $h(Y)f(Y)/g(Y)$.
- If $h$ is a non-negative function (e.g. a payoff), taking $g(x) = h(x)f(x)/\theta$, we get $h(Y)f(Y)/g(Y) = \theta$, a zero-variance estimator
- If $h$ is not non-negative, the optimal density is $g(x) = |h(x)|f(x) \big/ \int |h(y)|f(y)\,dy$

For the first situation, however, we do not know $\theta$ (it is the very quantity we are trying to estimate). But the above analysis gives us an indication of what type of densities we should be looking for: we should use densities that are approximately proportional to $h(x)f(x)$.

For example, if we want to estimate $\theta = P(X > 3) = E[1\{X > 3\}]$, then we should use an importance sampling density $g$ that is zero (or near zero) whenever $x \le 3$.
The standard Monte Carlo estimator is $\hat\theta = \frac{1}{n}\sum_{i=1}^n h(X_i)$, with $X_i \sim f$. The importance sampling estimator is
$$\hat\theta_g = \frac{1}{n}\sum_{i=1}^n h(Y_i)\frac{f(Y_i)}{g(Y_i)}, \qquad Y_i \sim g.$$
The difference in the variance (per sample) is
$$\mathrm{Var}_f\big(h(X)\big) - \mathrm{Var}_g\!\left(h(Y)\frac{f(Y)}{g(Y)}\right) = \int h^2(x) f(x)\left[1 - \frac{f(x)}{g(x)}\right] dx.$$
Variance is reduced if this integral is positive, and for that we need
- $g(x) < f(x)$ where $h^2(x)f(x)$ is small
- $g(x) > f(x)$ where $h^2(x)f(x)$ is large

i.e. $g$ should put more mass in the regions where $h^2(x)f(x)$ is large.
Since the best possible $g$ is not actionable in practice, we can restrict ourselves to a parameterized family of densities $\{f_t\}$ and find the best choice within that family, i.e. search over $t$ for a decent variance reduction.
Importance Sampling with Tilted Densities
Suppose $X$ has a light-tailed distribution with density $f$, so that it has a finite moment generating function (MGF), i.e. $M(t) = E[e^{tX}] < \infty$ for some $t \neq 0$.

For such $t$, an (exponentially) tilted density of $f$ is given by
$$f_t(x) = \frac{e^{tx} f(x)}{M(t)}.$$
This is a density because $f_t(x) \ge 0$ and
$$\int f_t(x)\,dx = \frac{1}{M(t)}\int e^{tx} f(x)\,dx = \frac{M(t)}{M(t)} = 1.$$
Characteristics:
- if we want to sample more often from the region where $x$ tends to be large (and positive), then we can use $f_t$ with $t > 0$ as the importance sampling density
- if we want to sample more often from the region where $x$ tends to be large (and negative), then we can use $f_t$ with $t < 0$ as the importance sampling density
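As a concrete (illustrative, not from the text) instance of the tilting formula: for $f$ the Exp(1) density, $M(t) = 1/(1-t)$ for $t < 1$, and the tilted density simplifies to an Exp($1-t$) density, so tilting with $t > 0$ shifts the mean from $1$ up to $1/(1-t)$. The sketch below checks both facts numerically on a grid.

```python
import math

def exp_tilted_pdf(x: float, t: float) -> float:
    """Exponentially tilted density of Exp(1):
    f_t(x) = e^{t x} f(x) / M(t) with f(x) = e^{-x}, M(t) = 1/(1-t), t < 1.
    Algebraically this equals (1-t) * e^{-(1-t) x}, an Exp(1-t) density."""
    return (1.0 - t) * math.exp((t - 1.0) * x)

# Numerical check on a left-endpoint Riemann grid: f_t integrates to ~1
# and its mean is ~1/(1-t) (here t = 0.5, so the mean should be ~2).
t = 0.5
dx = 1e-3
xs = [i * dx for i in range(200_000)]           # grid over [0, 200)
mass = sum(exp_tilted_pdf(x, t) for x in xs) * dx
mean = sum(x * exp_tilted_pdf(x, t) for x in xs) * dx
```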
There are several ways to find the best possible $t$ within a parameterized family; here are two of them: the Maximum Principle and the Weighted Least Squares Principle.
The Maximum Principle
Although we cannot choose $g(x) = h(x)f(x)/\theta$ because we don't know $\theta$, we can choose $t$ such that $f_t$ has a shape similar to $h(x)f(x)$. In particular, we can choose $t$ such that $f_t(x)$ and $h(x)f(x)$ both take on their maximum values at the same point $x^*$, that is,
$$\arg\max_x f_t(x) = \arg\max_x h(x)f(x) = x^*.$$
For example, for $f$ the density of $N(0,1)$ and $h(x) = 1\{x > 3\}$, we have
$$f_t(x) = \frac{e^{tx}\,e^{-x^2/2}/\sqrt{2\pi}}{e^{t^2/2}} = \frac{1}{\sqrt{2\pi}}\,e^{-(x-t)^2/2},$$
that is, $f_t$ is the density function of $N(t,1)$, therefore $\arg\max_x f_t(x) = t$. Given $h(x) = 1\{x > 3\}$, $h(x)f(x)$ attains its maximum at $x^* = 3$. Then the maximum principle suggests $t = 3$.
Another example: Standard Call Option
Suppose we want to estimate $\theta = E[h(X)]$, where $h(x) = \big(S_0 e^{(r-\sigma^2/2)T + \sigma\sqrt{T}\,x} - K\big)^+$ and $X \sim N(0,1)$. The standard normal random variable has MGF $M(t) = e^{t^2/2}$, hence the tilted density is
$$f_t(x) = \frac{1}{\sqrt{2\pi}}\,e^{-(x-t)^2/2},$$
which is the density of $N(t,1)$, and $\arg\max_x f_t(x) = t$. Therefore, according to the maximum principle,
$$t^* = \arg\max_x h(x)f(x).$$
With $f$ the standard normal density, $t^*$ is the solution to the first-order condition $h'(x) = x\,h(x)$ on the region where $h(x) > 0$. The equation for the standard call option can be solved using the bisection method.
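The bisection step can be sketched as follows. The option parameters ($S_0$, $K$, $r$, $\sigma$, $T$) are illustrative assumptions, not values from the text; the function `G` is the derivative of $\log\big(h(x)\,e^{-x^2/2}\big)$, whose unique root on the region where $h > 0$ is the maximum-principle tilt $t^*$.

```python
import math

# Hypothetical option parameters (purely illustrative).
S0, K, r, sigma, T = 100.0, 120.0, 0.05, 0.2, 1.0
a = (r - 0.5 * sigma ** 2) * T      # drift of the log-price
b = sigma * math.sqrt(T)            # volatility of the log-price

def payoff(x: float) -> float:
    """Call payoff as a function of the standard normal driver x."""
    return max(S0 * math.exp(a + b * x) - K, 0.0)

def G(x: float) -> float:
    """Stationarity condition for maximizing h(x)*phi(x):
    d/dx [log h(x) - x^2/2] = b*S/(S - K) - x, with S = S0*e^{a+bx}."""
    s = S0 * math.exp(a + b * x)
    return b * s / (s - K) - x

# h(x) > 0 requires x > x0, where S0 * e^{a + b*x0} = K.
x0 = (math.log(K / S0) - a) / b

# G -> +inf as x -> x0+ and G < 0 for large x, and G is strictly
# decreasing on (x0, inf), so bisection brackets the unique root.
lo, hi = x0 + 1e-9, x0 + 10.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if G(mid) > 0.0:
        lo = mid
    else:
        hi = mid
t_star = 0.5 * (lo + hi)   # tilt: mean of the N(t*, 1) sampling density
```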
Remarks on the maximum principle
- This principle can be applied to any family of parameterized densities
- Sometimes applying the maximum principle can be difficult. For example, it may be the case that there are multiple or even infinitely many solutions to $\arg\max_x h(x)f(x)$
- The maximum principle is not guaranteed to yield a variance reduction
Weighted Least Squares Principle
Consider the importance sampling estimator
$$\hat\theta_t = \frac{1}{n}\sum_{i=1}^n h(Y_i)\frac{f(Y_i)}{f_t(Y_i)}, \qquad Y_i \sim f_t.$$
The basic idea of the WLS principle is to estimate the variance of $\hat\theta_t$ and find a $t$ that minimizes this estimated variance. The variance is
$$\mathrm{Var}(\hat\theta_t) = \frac{1}{n}\left(E_t\!\left[h^2(Y)\frac{f^2(Y)}{f_t^2(Y)}\right] - \theta^2\right),$$
therefore, since $\theta$ does not depend on $t$, it suffices to minimize the estimated second moment
$$\hat v(t) = \frac{1}{n}\sum_{i=1}^n h^2(Y_i)\frac{f^2(Y_i)}{f_t^2(Y_i)}.$$
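A minimal sketch of this principle for the earlier $P(X > 3)$ example, with an assumed grid search over the tilt (the grid and sample sizes are illustrative choices): for each candidate $t$, sample from $N(t,1)$ and estimate the second moment of the weighted indicator, then keep the $t$ with the smallest estimate.

```python
import math
import random

random.seed(1)

def second_moment_estimate(t: float, n: int = 20_000) -> float:
    """Estimate E_t[(h(Y) f(Y)/f_t(Y))^2] for h(x) = 1{x > 3},
    f = N(0,1) and f_t = N(t,1).  The squared likelihood ratio
    is exp(-2*t*y + t**2)."""
    acc = 0.0
    for _ in range(n):
        y = random.gauss(t, 1.0)
        if y > 3.0:
            acc += math.exp(-2.0 * t * y + t * t)
    return acc / n

# Minimizing the second moment minimizes the variance, since every tilt
# has the same mean theta.  Grid-search t over {0.5, 1.0, ..., 6.0}.
grid = [0.5 * k for k in range(1, 13)]
best_t = min(grid, key=second_moment_estimate)
```

For this example the estimated second moment is minimized near $t = 3$, agreeing with the maximum principle.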
Examples
Example (1): Ruin Probabilities
Consider an insurance firm earning premiums at a constant rate $p$ per unit of time and paying claims that arrive at the jumps of a Poisson process with rate $\lambda$. Let $N(t)$ denote the number of claims arriving in $[0, t]$ and $Y_i$ the size of the $i$-th claim; then the net payout of the firm over $[0, t]$ is given by $Z(t) = \sum_{i=1}^{N(t)} Y_i - pt$.

Suppose the firm has a reserve of $x > 0$; then ruin occurs if the net payout ever exceeds $x$. If ruin ever occurs, it must occur at the arrival of a claim, and thus it suffices to consider the discrete-time process embedded at claim arrivals.

Let $\xi_i$ denote the interarrival times of the Poisson process (exponential with mean $1/\lambda$). The net payout between the $(i-1)$-th and the $i$-th claim is $X_i = Y_i - p\xi_i$. The net payout after $n$ claims is
$$S_n = \sum_{i=1}^n X_i.$$
In this notation, ruin occurs at
$$\tau = \inf\{n \ge 1 : S_n > x\}.$$
The probability of eventual ruin is $\theta = P(\tau < \infty)$. Standard Monte Carlo requires numerous samples to estimate this probability because $\theta$ is typically very small (and a simulated path may never cross $x$).
We want to use importance sampling with a density $g$ for the $X_i$ such that $E_g[X_i] > 0$, so that ruin occurs with probability 1 under $g$. Then
$$\theta = E_g\!\left[\prod_{i=1}^{\tau}\frac{f(X_i)}{g(X_i)}\right],$$
where $f$ is the density of $X_i = Y_i - p\xi_i$. Taking $g = f_t$, the tilted density, the likelihood ratio is
$$\prod_{i=1}^{\tau}\frac{f(X_i)}{f_t(X_i)} = \prod_{i=1}^{\tau} e^{-tX_i} M(t) = e^{-tS_\tau} M(t)^{\tau},$$
and thus
$$\theta = E_t\!\left[e^{-tS_\tau} M(t)^{\tau}\right].$$
Therefore, choosing $t = t^* > 0$ such that $M(t^*) = 1$ (such a root exists since $M$ is convex with $M(0) = 1$ and $M'(0) = E[X] < 0$), the $\tau$-dependent factor disappears:
$$\theta = E_{t^*}\!\left[e^{-t^* S_\tau}\right].$$
Since $S_\tau > x$, then $e^{-t^* S_\tau} < e^{-t^* x}$ and thus
$$\theta = e^{-t^* x}\, E_{t^*}\!\left[e^{-t^* O}\right] \le e^{-t^* x},$$
where $O = S_\tau - x$ is the overshoot at the ruin time.
After choosing the optimal $t^*$ based on this principle (the root of $M(t) = 1$), the implementation of this importance sampling estimator is:
- For the $k$-th replication, generate $X_1, X_2, \ldots$ i.i.d. from $f_{t^*}$ until $S_n > x$. Let $\tau_k$ be the first $n$ at which $S_n > x$.
- Return the estimator
$$\hat\theta = \frac{1}{m}\sum_{k=1}^m e^{-t^* S_{\tau_k}}.$$
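The two steps above can be sketched for a concrete special case. All the numerical choices here are assumptions for illustration: claim sizes Exp($\beta$), interarrival times Exp($\lambda$), premium rate $p$, and reserve $x$. For exponential claims the root of $M_X(t) = 1$ has the classical closed form $t^* = \beta - \lambda/p$, and the exact ruin probability $\theta = (\lambda/(p\beta))\,e^{-t^* x}$ is available to check against.

```python
import math
import random

random.seed(2)

# Hypothetical parameters: Y ~ Exp(beta), xi ~ Exp(lam),
# premium rate p, initial reserve x.
beta, lam, p, x = 1.0, 1.0, 1.5, 10.0

# Lundberg root of M_X(t) = 1 for X = Y - p*xi; closed form for
# exponential claims and interarrival times.
t_star = beta - lam / p

def one_replication() -> float:
    """Simulate S_n = sum(Y_i - p*xi_i) under the tilted measure
    (Y ~ Exp(beta - t*), xi ~ Exp(lam + p*t*)) until S_n > x,
    then return the likelihood ratio e^{-t* S_tau}."""
    s = 0.0
    while s <= x:
        y = random.expovariate(beta - t_star)
        xi = random.expovariate(lam + p * t_star)
        s += y - p * xi
    return math.exp(-t_star * s)

n = 20_000
ruin_est = sum(one_replication() for _ in range(n)) / n

# Exact ruin probability for exponential claims, for comparison.
exact = (lam / (p * beta)) * math.exp(-t_star * x)
```

Note that under the tilted measure the drift of $X$ is positive, so each replication terminates with probability 1, whereas a plain simulation of this rare event might never cross $x$.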
Example (2): Knock-in option
The option is a digital knock-in option with payoff
$$h = 1\{\tau_B \le m\}\,1\{S_m > K\},$$
where $S$ is the underlying asset price process, $K$ is the strike price, and $B$ is the barrier.

If $B$ is much smaller than $S_0$ or $K$ is much larger than $S_0$, most paths would result in a payoff of zero. Importance sampling can help.
Suppose the underlying asset process can be modeled as
$$S_i = S_{i-1}\, e^{\mu + \sigma Z_i}, \qquad i = 1, \ldots, m,$$
with $Z_i$ i.i.d. $N(0,1)$.

The payoff can thus be written in terms of a stopping time:
$$h = 1\{\tau_B \le m\}\,1\{S_m > K\},$$
where $\tau_B$ is the first $i$ such that $S_i \le B$.
If we make a single change of measure as in the usual importance sampling method, we could get $S_m > K$ with higher probability, but this reduces the probability of crossing the barrier $B$. On the other hand, if we make a change to cross $B$ with high probability, then this reduces the probability of $S_m > K$.

We can then use two different tilts of the density of the $Z_i$, one for $i \le \tau_B$ and one for $i > \tau_B$ (Boyle et al. 1997):
$$Z_i \sim N(t_1, 1) \ \text{for } i \le \tau_B, \qquad Z_i \sim N(t_2, 1) \ \text{for } i > \tau_B,$$
for some $t_1 < 0 < t_2$.
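A minimal sketch of the two-tilt scheme, under assumed parameters (barrier below $S_0$, strike above; the values of $S_0$, $K$, $B$, $\mu$, $\sigma$, $m$ and the tilts are illustrative, not from the text): the normals are drawn with negative drift until the barrier is hit and with positive drift afterwards, and each tilted draw contributes the factor $e^{-tZ + t^2/2}$ to the likelihood ratio.

```python
import math
import random

random.seed(3)

# Hypothetical contract/model parameters for a down-and-in digital option:
# pays 1 if the path touches barrier B and the terminal price exceeds K.
S0, K, B = 100.0, 110.0, 90.0
mu, sigma, m = 0.0, 0.02, 50        # per-step log-return drift/vol, m steps

def one_path(t1: float, t2: float) -> float:
    """One importance-sampling replication with two tilts: drift the
    normals down (t1 < 0) until the barrier is crossed, then up (t2 > 0).
    Each tilted draw Z ~ N(t, 1) multiplies the likelihood ratio by
    exp(-t*Z + t**2/2)."""
    s, lr, hit = S0, 1.0, False
    for _ in range(m):
        t = t2 if hit else t1
        z = random.gauss(t, 1.0)
        lr *= math.exp(-t * z + 0.5 * t * t)
        s *= math.exp(mu + sigma * z)
        if s <= B:
            hit = True
    return lr if (hit and s > K) else 0.0

n = 10_000
est = sum(one_path(-0.5, 0.8) for _ in range(n)) / n   # knock-in probability
```

Under the tilted measure almost every path first hits the barrier and then recovers above the strike, so nearly all replications carry information; the likelihood ratio corrects the resulting bias.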
Example (3): CVaR
The VaR at level $p$ is defined as
$$x_p = \inf\{x : P(L \le x) \ge p\},$$
where $L$ is the loss. Typically, $p$ is large (0.95 or 0.99). Standard Monte Carlo samples are wasteful, as only a fraction $1 - p$ of the samples are used.

Note that
$$\mathrm{CVaR}_p = E[L \mid L > x_p] = x_p + \frac{E[(L - x_p)^+]}{1 - p}.$$
Estimation of $x_p$ and $E[(L - x_p)^+]$ can be done using importance sampling.

Estimation of standard errors in this case is complicated. It is easier to just compute the estimator on 10 different replications and compute the standard error across them.
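The decomposition above can be sketched for a standard normal loss. For simplicity this sketch treats $x_p$ as known ($x_{0.99} \approx 2.3263$ for $N(0,1)$); in practice $x_p$ would itself be estimated, e.g. by importance sampling as discussed. The tail expectation is estimated by sampling from $N(t, 1)$ with the tilt $t = x_p$, reweighting by $e^{-t\ell + t^2/2}$.

```python
import math
import random

random.seed(4)

# CVaR of a standard normal loss at level p, via
# CVaR_p = x_p + E[(L - x_p)^+] / (1 - p).
p = 0.99
x_p = 2.3263          # 0.99-quantile of N(0,1), taken as known here

def is_tail_expectation(n: int, t: float = x_p) -> float:
    """Importance-sampling estimate of E[(L - x_p)^+], L ~ N(0,1),
    sampling from N(t, 1) with likelihood ratio exp(-t*l + t**2/2)."""
    acc = 0.0
    for _ in range(n):
        l = random.gauss(t, 1.0)
        if l > x_p:
            acc += (l - x_p) * math.exp(-t * l + 0.5 * t * t)
    return acc / n

cvar_est = x_p + is_tail_expectation(100_000) / (1.0 - p)
# For N(0,1) the exact value is phi(x_p)/(1-p) ~ 2.665, a useful check.
```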