
OSP-1 Continuous Random Variables

Simulation of Uniform Random Variables

Almost all simulation methods start with access to uniform random variables. A deterministic computer cannot generate truly random numbers, only pseudo-random numbers.
Linear Congruential Generators (LCGs)
Start with $X_0$ (called the seed) and set
$X_{n+1} = (a X_n + c) \bmod m$
for some constants $a$, $c$, and $m$. Return $U_n = X_n / m$.
This sequence starts repeating itself after some number of steps $T \le m$, which is called the period. We want the period to be as large as possible. For example, the Super-Duper generator from the 70's uses $a = 69069$, $c = 0$, $m = 2^{32}$ and has a period of $2^{30}$.
If $c = 0$, the algorithm is called a multiplicative congruential generator (MCG).
If $m = 2^k$ with $k \ge 3$, $a \bmod 8$ is 3 or 5, and the seed $X_0$ is odd, then the MCG has a period of $2^{k-2}$.
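A minimal sketch of such a generator in Python (the default constants are the Super-Duper values quoted above; the function name lcg is illustrative, not from any library):

import numpy as np

def lcg(n, seed=12345, a=69069, c=0, m=2**32):
    # Generate n pseudo-random uniforms on (0, 1) via X_{k+1} = (a X_k + c) mod m.
    x = seed
    u = np.zeros(n)
    for k in range(n):
        x = (a * x + c) % m
        u[k] = x / m
    return u

u = lcg(5)  # five values in (0, 1)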

Inverse CDF Method

The method is based on the fact that if $F$ is a CDF (continuous or not), and $U \sim \mathrm{Unif}(0,1)$, then $F^{-1}(U) \sim F$; a brief proof is given below.
If $F$ is continuous and strictly increasing, the inverse $F^{-1}$ is well-defined. For a general non-decreasing function, the (generalized) inverse is defined as $F^{-1}(u) = \inf\{x : F(x) \ge u\}$.
With this definition, $F^{-1}(u) \le x$ if and only if $u \le F(x)$, so $P(F^{-1}(U) \le x) = P(U \le F(x)) = F(x)$, i.e., $F^{-1}(U)$ has CDF $F$.
Pseudo-code to generate $X \sim F$ given $F^{-1}$ (see the sketch below):
  • Generate $U \sim \mathrm{Unif}(0,1)$
  • Return $X = F^{-1}(U)$
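In Python this is essentially a one-liner (a generic sketch; F_inv stands for any vectorized implementation of $F^{-1}$):

def inverse_cdf_sample(F_inv, n):
    # Draw n samples from F by applying F^{-1} to i.i.d. uniforms.
    u = np.random.rand(n)
    return F_inv(u)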
Examples
(1) Exponential Random Variables
If $X$ has an exponential distribution with mean $1/\lambda$, then
$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0;$
this is used to model waiting times.
The CDF is $F(x) = 1 - e^{-\lambda x}$ for $x \ge 0$.
Inverse CDF method to generate $X$: $X = F^{-1}(U) = -\ln(1-U)/\lambda$, or equivalently $X = -\ln(U)/\lambda$, since $1 - U \sim \mathrm{Unif}(0,1)$.
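For instance, with the inverse_cdf_sample sketch above (lam is an illustrative rate parameter):

lam = 2.0  # illustrative rate, so the mean is 1 / lam = 0.5
exp_samples = inverse_cdf_sample(lambda u: -np.log(1 - u) / lam, 10000)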
(2) Cauchy Random Variables
If $X$ has a standard Cauchy distribution, then
$f(x) = \dfrac{1}{\pi(1 + x^2)}, \quad x \in \mathbb{R}.$
This is a heavy-tailed random variable with non-existent mean.
The CDF is $F(x) = \dfrac{1}{2} + \dfrac{\arctan(x)}{\pi}$.
Inverse CDF method to generate $X$: $X = F^{-1}(U) = \tan\big(\pi(U - 1/2)\big)$.
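Again with the same hypothetical helper:

cauchy_samples = inverse_cdf_sample(lambda u: np.tan(np.pi * (u - 0.5)), 10000)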
(3) Pareto Distribution
A random variable $X$ is said to have a Pareto distribution with scale parameter $\beta > 0$ and shape parameter $\alpha > 0$ if
$P(X > x) = (\beta/x)^{\alpha}, \quad x \ge \beta.$
$X$ is a heavy-tailed non-negative random variable with $E[X^p] = \infty$ for $p \ge \alpha$.
The CDF is $F(x) = 1 - (\beta/x)^{\alpha}$ for $x \ge \beta$.
Inverse CDF method to generate $X$: $X = F^{-1}(U) = \beta(1-U)^{-1/\alpha}$, or equivalently $X = \beta U^{-1/\alpha}$.
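And for the Pareto case (beta_ and alpha are illustrative parameter choices):

beta_, alpha = 1.0, 2.5  # illustrative scale and shape
pareto_samples = inverse_cdf_sample(lambda u: beta_ * (1 - u) ** (-1 / alpha), 10000)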
(4) Bernoulli Distribution
A random variable $X$ has a $\mathrm{Ber}(p)$ distribution if $P(X = 1) = p$ and $P(X = 0) = 1 - p$.
The CDF is $F(x) = 0$ for $x < 0$, $F(x) = 1 - p$ for $0 \le x < 1$, and $F(x) = 1$ for $x \ge 1$.
Inverse CDF method to generate $X$: $X = \mathbf{1}\{U > 1 - p\}$, which has the same distribution as $\mathbf{1}\{U \le p\}$.
We can almost surely write $X = \lfloor U + p \rfloor$ (i.e., the largest integer below $U + p$).
For the special case of $p = 1/2$, we have $X = \lfloor 2U \rfloor$. Since $U \in (0, 1)$, $X$ is the first bit in the binary expansion of $U$. The binary expansion of $U$ is
$U = \sum_{k=1}^{\infty} B_k 2^{-k},$
where $B_k \in \{0, 1\}$.
Note that $B_1, B_2, \ldots$ are i.i.d. $\mathrm{Ber}(1/2)$ random variables. Therefore, we can generate infinitely many independent $\mathrm{Ber}(1/2)$ random variables from a single uniform random variable $U$. This is much more efficient than the slower way of first generating $n$ different uniform random variables $U_1, \ldots, U_n$ and then calculating $X_i = \lfloor 2U_i \rfloor$.
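A sketch of this bit-extraction trick (in practice a double-precision uniform only carries about 53 usable bits, so "infinitely many" is an idealization):

def bernoulli_bits(u, n):
    # Extract the first n binary digits of u in (0, 1); each is Ber(1/2).
    bits = []
    for _ in range(n):
        u *= 2
        b = int(u)      # current leading bit
        bits.append(b)
        u -= b          # shift the consumed bit out
    return bits

bits = bernoulli_bits(np.random.rand(), 20)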
(5) Binomial Distribution
A random variable $X$ has a $\mathrm{Bin}(n, p)$ distribution if
$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \ldots, n.$
For large $n$, the inverse CDF method is computationally intensive, but we can use the sum of $n$ independent $\mathrm{Ber}(p)$ random variables, $X = \sum_{i=1}^{n} X_i$.
And if $p = 1/2$, we don't even need to generate $n$ different uniforms $U_i$: the bits of a single uniform suffice.
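A quick illustration of both routes (reusing the bernoulli_bits sketch above; the values of n and p are arbitrary):

n, p = 50, 0.3  # arbitrary illustrative values
x_binom = int(np.sum(np.floor(np.random.rand(n) + p)))  # sum of n Ber(p) via the floor trick

# For p = 1/2, the bits of one uniform suffice:
x_half = sum(bernoulli_bits(np.random.rand(), n))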
(6) Generalized Lambda Distribution
The generalized lambda distribution is defined via the inverse CDF as
$F^{-1}(u) = \lambda_1 + \dfrac{u^{\lambda_3} - (1-u)^{\lambda_4}}{\lambda_2},$
where $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ are parameters representing:
  • $\lambda_1$ is the center
  • $\lambda_2$ is the scale parameter
  • $\lambda_3$ and $\lambda_4$ are used to fit the skewness and kurtosis
The distribution is symmetric if $\lambda_3 = \lambda_4$.
This distribution can approximate different continuous distributions well:

Distribution    ($\lambda_1$, $\lambda_2$, $\lambda_3$, $\lambda_4$)   max pdf error   max CDF error
Normal          (0, 0.1975, 0.1349, 0.1349)        0.0028          0.0011
Uniform         (0.5, 2, 1, 1)                     0               0
Student's t     (0, -0.2481, -0.1359, -0.1359)     0.0358          0.0148
Chi-square      (2.604, 0.0176, 0.0095, 0.0542)    0.0211          0.0136
Gamma           (10.762, 0.0144, 0.0252, 0.0939)   0.005           0.012
Weibull         (0.9935, 1.0488, 0.2121, 0.1016)   0.0354          0.0027
Lognormal       (0.8451, 0.1085, 0.0102, 0.0342)   0.0953          0.0123
Beta            (0.5, 1.9693, 0.4495, 0.4495)      0.009           0.0004
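Since the GLD is specified directly through $F^{-1}$, sampling is immediate; here is a sketch using the normal-approximation row of the table:

def gld_sample(l1, l2, l3, l4, n):
    # Inverse-CDF sampling: F^{-1}(u) = l1 + (u**l3 - (1 - u)**l4) / l2
    u = np.random.rand(n)
    return l1 + (u ** l3 - (1 - u) ** l4) / l2

approx_normal = gld_sample(0, 0.1975, 0.1349, 0.1349, 10000)  # approximate N(0, 1) draws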

Accept-Reject Method

Sometimes the inverse CDF is expensive to compute or has no closed form. The basic idea of the accept-reject method is to simulate from an easier distribution and keep the generated value only if it satisfies a certain criterion. The method is applicable to all distributions; here we only consider continuous ones.
We aim to generate samples of the random variable $X$ from a distribution with pdf $f$. Let $g$ be another density function and $c$ a constant such that
  1. $cg$ majorizes $f$, i.e., $f(x) \le c\,g(x)$ for all $x$
  2. there is an (easy) way to generate samples for a random variable with pdf $g$
The first condition implies that
  • $g(x) > 0$ whenever $f(x) > 0$, i.e., the support of $f$ is contained in the support of $g$.
  • $c \ge 1$: integrating both sides of $f(x) \le c\,g(x)$ gives $1 = \int f(x)\,dx \le c \int g(x)\,dx = c$.
To generate $X$ with pdf $f$:
  • Generate $Y$ from the pdf $g$
  • Generate $U \sim \mathrm{Unif}(0,1)$, independent of $Y$
  • Return $X = Y$ if $U \le \dfrac{f(Y)}{c\,g(Y)}$. Otherwise, return to step 1
import numpy as np

def reject_samp_generate_f(f, g, c, generate_g, n):
    # Draw n samples from density f by accept-reject with proposal density g,
    # where f(x) <= c * g(x) for all x and generate_g() returns one draw from g.
    x = np.zeros(n)
    for i in range(n):
        accepted = False
        while not accepted:
            y = generate_g()              # step 1: Y ~ g
            u = np.random.rand()          # step 2: U ~ Unif(0, 1)
            if u * c * g(y) <= f(y):      # accept if U <= f(Y) / (c g(Y))
                x[i] = y
                accepted = True
    return x
Success Probability
The probability of accepting $Y$ as $X$ is given by
$P\!\left(U \le \frac{f(Y)}{c\,g(Y)}\right) = \int \frac{f(y)}{c\,g(y)}\, g(y)\,dy = \frac{1}{c}.$
Hence, the number of samples that need to be generated from $g$ to get one acceptance is a Geometric random variable with success probability $1/c$, and thus the expected number of samples that need to be generated from $g$ to get one acceptance is $c$.
Therefore, the constant $c$ determines the efficiency of the accept-reject algorithm: the smaller $c$ is, the more efficient the sampling strategy.
Proof of Correctness
For any $x$,
$P(Y \le x, \text{accept}) = \int_{-\infty}^{x} \frac{f(y)}{c\,g(y)}\, g(y)\,dy = \frac{F(x)}{c},$
so
$P(X \le x) = P(Y \le x \mid \text{accept}) = \frac{F(x)/c}{1/c} = F(x).$
Hence $X \sim F$.
Examples
(1) Beta Distribution
Suppose we want to generate samples from the $\mathrm{Beta}(2, 4)$ distribution, whose inverse CDF has no convenient closed form. We can use the $\mathrm{Unif}(0,1)$ distribution as the proposal, i.e., $g(x) = 1$ on $[0, 1]$.
We need $c$ such that $f(x) \le c$ for all $x \in [0, 1]$. This is the same as
$c = \max_{x \in [0,1]} 20x(1-x)^3 = \frac{135}{64} \approx 2.1094,$ attained at $x = 1/4$.
Hence, we need to generate about 21094 samples to obtain 10000 samples from the $\mathrm{Beta}(2, 4)$ distribution.
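Plugging this into the reject_samp_generate_f function above (the Beta(2, 4) pdf is written out explicitly rather than imported):

f_beta = lambda x: 20 * x * (1 - x) ** 3   # Beta(2, 4) pdf on [0, 1]
g_unif = lambda x: 1.0                     # Unif(0, 1) pdf
c = 135 / 64                               # the bound computed above
beta_samples = reject_samp_generate_f(f_beta, g_unif, c, np.random.rand, 10000)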
(2) Normal Distribution
Suppose we want to generate samples from $N(0, 1)$. Since the normal CDF does not have a closed-form expression, the inverse CDF method is difficult to apply. We can use the Cauchy distribution as the proposal.
We need $c$ such that $f(x) \le c\,g(x)$ for all $x$. This is the same as
$c = \max_x \frac{f(x)}{g(x)} = \max_x \sqrt{\frac{\pi}{2}}\,(1 + x^2)\,e^{-x^2/2} = \sqrt{\frac{2\pi}{e}} \approx 1.5203,$ attained at $x = \pm 1$.
Hence, we need to generate about 15203 samples from the Cauchy distribution to get 10000 samples from the normal distribution.
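The same function works here, with the Cauchy proposal drawn by its own inverse CDF:

f_norm = lambda x: np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)    # N(0, 1) pdf
g_cauchy = lambda x: 1 / (np.pi * (1 + x ** 2))                # standard Cauchy pdf
gen_cauchy = lambda: np.tan(np.pi * (np.random.rand() - 0.5))  # Cauchy via inverse CDF
c = np.sqrt(2 * np.pi / np.e)                                  # about 1.5203
normal_samples = reject_samp_generate_f(f_norm, g_cauchy, c, gen_cauchy, 10000)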

The Composition Method

The composition method can sometimes be useful when the density itself is not known in closed form.
A density $f$ is said to be a discrete composition density if there exist probabilities $p_1, \ldots, p_k$ (with $\sum_i p_i = 1$) and densities $f_1, \ldots, f_k$ such that
$f(x) = \sum_{i=1}^{k} p_i f_i(x).$
For example, a two-component normal mixture $f(x) = 0.3\,\phi(x + 1) + 0.7\,\phi(x - 1)$, where $\phi$ is the standard normal pdf, is a discrete composition density.
A density $f$ is said to be a continuous composition density if there exists a density $g$ and a family of densities $\{f_y\}$ such that
$f(x) = \int f_y(x)\, g(y)\,dy.$
For example, the Student's $t$ density is a continuous mixture of normal densities whose variance follows an inverse-gamma distribution.
We can rewrite the density function as $f(x) = \int f_{X \mid Y}(x \mid y)\, f_Y(y)\,dy$, i.e., $f_y$ is the conditional density of $X$ given $Y = y$ and $g$ is the marginal density of $Y$.
If we have a discrete composition density $f$, then we can generate an observation from $f$ as below (see the sketch after this list):
  • Generate a discrete random variable $I$ such that $P(I = i) = p_i$
  • Given $I = i$, generate a random variable $X$ from density $f_i$
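A minimal sketch, reusing the two-component normal mixture mentioned above as the target (the weights and component means are the illustrative ones chosen there):

def discrete_composition_sample(probs, samplers, n):
    # probs: mixture weights p_i; samplers[i]() draws one value from f_i.
    out = np.zeros(n)
    for j in range(n):
        i = np.random.choice(len(probs), p=probs)   # step 1: draw the component index I
        out[j] = samplers[i]()                      # step 2: draw X from f_I
    return out

mix = discrete_composition_sample(
    [0.3, 0.7],
    [lambda: np.random.randn() - 1, lambda: np.random.randn() + 1],
    10000,
)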
