
OSP-2 Discrete Random Variables

Inverse CDF for Discrete Distributions

Suppose we want to generate from a discrete distribution
$$P(X = x_k) = p_k, \quad k = 0, 1, 2, \ldots,$$
where $x_0 < x_1 < x_2 < \cdots$ and $\sum_k p_k = 1$.
The inverse CDF method for this distribution is to generate $U \sim \text{Uniform}(0, 1)$ and return $x_k$ if
$$\sum_{j=0}^{k-1} p_j < U \le \sum_{j=0}^{k} p_j.$$
Finding the $k$ such that these inequalities hold can be computationally inefficient. A while-loop implementation is as below.
```python
import numpy as np

def inv_cdf_seq_search(x, p):
    """
    x: array of values taken by X
    p: array of corresponding probabilities
    """
    U = np.random.rand()
    k = 0
    S = p[0]
    # accumulate the CDF sequentially until it exceeds U
    while U > S:
        k = k + 1
        S = S + p[k]
    return x[k]
```
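As a quick sanity check (the three-point distribution below is made up for illustration), the empirical frequencies should be close to $p$:

```python
import numpy as np
import pandas as pd

# made-up three-point distribution, for illustration only
x = np.array([0, 1, 2])
p = np.array([0.25, 0.35, 0.40])

samples = [inv_cdf_seq_search(x, p) for _ in range(10000)]
# empirical frequencies should be close to p
print(pd.Series(samples).value_counts(normalize=True).sort_index())
```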
Example: Poisson distribution
If $X \sim \text{Poisson}(\lambda)$, then
$$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots,$$
and the probabilities satisfy the recursion $p_{x+1} = \frac{\lambda}{x+1}\, p_x$, which the implementation below uses to update the partial sums.
```python
import numpy as np

def inv_cdf_gen_poisson(_lambda):
    """
    _lambda: rate parameter of the Poisson distribution
    """
    U = np.random.rand()
    x = 0
    P = np.exp(-_lambda)       # P(X = 0)
    S = P                      # running CDF
    while U > S:
        x += 1
        P = _lambda * P / x    # p_x from p_{x-1} via the recursion
        S = S + P
    return x
```
For the inverse CDF method, the expected number of comparisons made in the while loop equals $\mathbb{E}[K] + 1$, where $K$ is the (random) index of the returned value; for a distribution on $\{0, 1, 2, \ldots\}$ this is $\mathbb{E}[X] + 1$, which can be large or even infinite for heavy-tailed random variables. To generate $n$ observations, the inverse CDF method therefore requires $n(\mathbb{E}[X] + 1)$ comparisons on average.
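As an empirical illustration of this count (a sketch: the counting wrapper below is ours, not part of the method), one can instrument the Poisson generator above and compare the average number of comparisons with $\lambda + 1$:

```python
import numpy as np

def inv_cdf_gen_poisson_counted(_lambda):
    """Same sequential search as above, but also returns the
    number of while-loop comparisons (U > S tests) made."""
    U = np.random.rand()
    x = 0
    P = np.exp(-_lambda)
    S = P
    n_comp = 1                 # the first U > S test
    while U > S:
        x += 1
        P = _lambda * P / x
        S = S + P
        n_comp += 1
    return x, n_comp

_lambda = 5.0
counts = [inv_cdf_gen_poisson_counted(_lambda)[1] for _ in range(10000)]
print(np.mean(counts))         # should be close to _lambda + 1 = 6
```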
By some clever pre-processing, one can reduce the number of comparisons significantly.

Guide Values

Suppose we have a discrete distribution on the values $\{0, 1, \ldots, K\}$,
$$P(X = k) = p_k, \quad k = 0, 1, \ldots, K.$$
Let the CDF be given as a vector $q = (q_0, q_1, \ldots, q_{K+1})$ by
$$q_0 = 0, \qquad q_j = \sum_{i=0}^{j-1} p_i = F(j-1), \quad j = 1, \ldots, K+1.$$
The inverse CDF method returns $X = k$ if $q_k < U \le q_{k+1}$. Equivalently, it returns $X = F^{-1}(U)$, where $F^{-1}(u) = \min\{k : F(k) \ge u\}$.
First, pre-compute the guide values: fix
$$g_i = F^{-1}\!\left(\frac{i}{K+1}\right) = \min\left\{k : F(k) \ge \frac{i}{K+1}\right\}, \quad i = 0, 1, \ldots, K+1,$$
with the convention $g_0 = -1$.
This non-decreasing sequence of $(g_i)$ can be computed in $O(K)$ time, as sketched below.
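Here is a minimal sketch of that pre-computation (the function name compute_guide_values is ours): one pass over the CDF builds the whole table in $O(K)$ time.

```python
import numpy as np

def compute_guide_values(p):
    """
    p: probabilities of the values 0, 1, ..., K
    Returns (q, g): the CDF vector q_0, ..., q_{K+1} and the
    guide values g_0, ..., g_{K+1}, built in one O(K) pass.
    """
    K = len(p) - 1
    q = np.concatenate(([0.0], np.cumsum(p)))   # q_j = F(j - 1)
    g = np.empty(K + 2, dtype=int)
    g[0] = -1                                   # convention: g_0 = -1
    k = 0
    for i in range(1, K + 2):
        # advance k until F(k) >= i/(K+1); k never exceeds K
        while k < K and q[k + 1] < i / (K + 1):
            k += 1
        g[i] = k
    return q, g

q, g = compute_guide_values([0.25, 0.35, 0.40])
print(q, g)   # expected: [0, 0.25, 0.60, 1.00] and [-1, 1, 2, 2]
```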
The guide values help with computing $F^{-1}(U)$ because
$$\frac{\lfloor (K+1)U \rfloor}{K+1} \le U \le \frac{\lceil (K+1)U \rceil}{K+1},$$
and hence, since $F^{-1}$ is non-decreasing,
$$g_{\lfloor (K+1)U \rfloor} \le F^{-1}(U) \le g_{\lceil (K+1)U \rceil},$$
where $\lfloor y \rfloor$ is the largest integer not exceeding $y$ and $\lceil y \rceil$ is the smallest integer not less than $y$.
So, setting $I = \lfloor (K+1)U \rfloor + 1$, we only need to go over the values $k$ satisfying
$$g_{I-1} \le k \le g_I$$
to find the right $k$ such that $q_k < U \le q_{k+1}$.
It can be proved that the expected number of comparisons made here is bounded above by 2, no matter what the distribution is and no matter how large $K$ is.
Example
```python
import numpy as np
import pandas as pd

def inv_cdf_guide_table(q, g):
    """
    q: CDF of the discrete distribution
    g: pre-calculated guide values
    """
    K_plus_1 = len(g) - 1
    U = np.random.rand()
    I = int(np.floor(K_plus_1 * U) + 1)
    X = g[I]
    # step down from the guide value until q[X] <= U
    while q[X] > U:
        X = X - 1
    return X

p = [0.25, 0.35, 0.40]      # V = 0, 1, 2 (K = 2)
q = [0, 0.25, 0.60, 1.00]   # CDF, len = 1 + len(p) = 1 + K+1 = K + 2
g = [-1, 1, 2, 2]           # F^{-1}(i/(K+1)) for i = 0, 1, 2, and g_3 = 2

X = np.zeros(10000)
for i in range(10000):
    X[i] = inv_cdf_guide_table(q, g)

data = pd.Series(X)
print(data.value_counts().sort_index())
```
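To see this bound in action, one can instrument the search with a comparison counter (a sketch for illustration only, reusing the q and g from the example above):

```python
import numpy as np

def inv_cdf_guide_table_counted(q, g):
    """Same search as above, but also returns the number of
    comparisons (q[X] > U tests) performed."""
    K_plus_1 = len(g) - 1
    U = np.random.rand()
    I = int(np.floor(K_plus_1 * U) + 1)
    X = g[I]
    n_comp = 1
    while q[X] > U:
        X = X - 1
        n_comp += 1
    return X, n_comp

n = [inv_cdf_guide_table_counted(q, g)[1] for _ in range(10000)]
print(np.mean(n))   # average number of comparisons, bounded above by 2
```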

Alias Table

Theorem: Every probability vector can be expressed as an equiprobable mixture of two-point distributions (each two-point distribution gets the same mixture weight). That is, for every probability vector $p = (p_1, \ldots, p_K)$ there exist probability vectors $q^{(1)}, \ldots, q^{(K-1)}$ such that
$$p = \frac{1}{K-1} \sum_{i=1}^{K-1} q^{(i)}$$
and each $q^{(i)}$ has at most two non-zero entries.
Brief proof of the theorem:
Pick $i$ and $j$ such that
$$p_i \le \frac{1}{K-1} \le p_i + p_j$$
(for instance, take $p_i$ to be the smallest and $p_j$ the largest probability).
Note that such an $i$ exists, since otherwise all $p_k$'s are larger than $\frac{1}{K-1}$, which implies
$$\sum_{k=1}^{K} p_k > \frac{K}{K-1} > 1,$$
a contradiction.
Define $q^{(1)}$, a probability distribution on $\{i, j\}$, such that
$$q^{(1)}_i = (K-1)\, p_i, \qquad q^{(1)}_j = 1 - (K-1)\, p_i.$$
Then
$$p = \frac{1}{K-1}\, q^{(1)} + \frac{K-2}{K-1}\, r,$$
where $r$ is another probability vector satisfying $r_i = 0$. Furthermore, $r$ is a discrete distribution supported on the set $\{1, \ldots, K\} \setminus \{i\}$, which has $K-1$ possible values.
We thus reduce the problem from a probability vector on $K$ values to one on $K-1$ values. Repeat this process on $r$ to obtain $q^{(2)}$, and so on.
Construction of Alias Table
Operationally, we can start with $p$ and keep decomposing it by looking at the largest and smallest probabilities.
Example: consider
$$p = \left(\frac{3}{24}, \frac{1}{24}, \frac{9}{24}, \frac{6}{24}, \frac{5}{24}\right)$$
on the values $\{0, 1, 2, 3, 4\}$, so $K = 5$.
Decomposition:
$$p = \frac{1}{4}\left(q^{(1)} + q^{(2)} + q^{(3)} + q^{(4)}\right), \qquad
q^{(1)} = \tfrac{4}{24}\delta_1 + \tfrac{20}{24}\delta_2, \quad
q^{(2)} = \tfrac{12}{24}\delta_0 + \tfrac{12}{24}\delta_3, \quad
q^{(3)} = \tfrac{12}{24}\delta_3 + \tfrac{12}{24}\delta_4, \quad
q^{(4)} = \tfrac{16}{24}\delta_2 + \tfrac{8}{24}\delta_4,$$
where $\delta_v$ denotes the point mass at value $v$.
Each step above yields one row of the Alias table for $p$ below.
| (Bernoulli) Prob | Value (position) | Alias (position) |
|------------------|------------------|------------------|
| 4/24             | 1                | 2                |
| 12/24            | 0                | 3                |
| 12/24            | 3                | 4                |
| 16/24            | 2                | 4                |
As a check, note that, for example,
$$P(X = 2) = \frac{1}{4}\left(\left(1 - \frac{4}{24}\right) + \frac{16}{24}\right) = \frac{1}{4} \cdot \frac{36}{24} = \frac{9}{24} = p_2.$$
Generation from the Alias Table
The generation from the alias table proceeds as follows:
  • Generate $U_1 \sim \text{Uniform}(0, 1)$ and set $I = \lceil (K-1) U_1 \rceil$ (simulation of the equal probabilities, i.e. the uniform mixture over the rows)
  • Generate $U_2 \sim \text{Uniform}(0, 1)$. If $U_2 \le$ the (Bernoulli) Prob of row $I$, return the Value of row $I$; else, return the Alias of row $I$
```python
import numpy as np
import numpy.random as npr
import pandas as pd

def alias_setup(probs):
    K = len(probs)
    q = np.zeros(K)
    J = np.zeros(K, dtype=int)

    # divide probabilities into larger and smaller than 1/K
    smaller = []
    larger = []
    for kk, prob in enumerate(probs):
        q[kk] = K * prob
        if q[kk] < 1.0:
            smaller.append(kk)
        else:
            larger.append(kk)

    # loop through and create binary mixtures
    while len(smaller) > 0 and len(larger) > 0:
        small = smaller.pop()
        large = larger.pop()
        J[small] = large
        q[large] = q[large] - (1.0 - q[small])
        if q[large] < 1.0:
            smaller.append(large)
        else:
            larger.append(large)

    return J, q

def alias_draw(J, q):
    # can be drawn in O(1) time
    K = len(J)
    # draw from the overall uniform mixture
    kk = int(np.floor(npr.rand() * K))
    # draw from the Bernoulli distribution, either keeping the small one
    # or choosing the associated larger one
    if npr.rand() < q[kk]:
        return kk
    else:
        return J[kk]

K, N = 5, 1000
probs = [3/24, 1/24, 9/24, 6/24, 5/24]
J, q = alias_setup(probs)

X = np.zeros(N)
for nn in range(N):
    X[nn] = alias_draw(J, q)

data = pd.Series(X)
print(data.value_counts().sort_index())
```
Note that in the Python implementation, when setting up the Alias table, we do not select the global maximum and minimum at each step. But we can still ensure that each cell has a unique "smaller" Value with a corresponding Alias (aliases might be duplicated), and it only takes $O(K)$ time to construct the table: we save the time of sorting, or of finding the max and min at each step.
For the example above, the Alias table is as follows (we use $K$ rows here, rather than $K-1$, for easier implementation):
| (Bernoulli) Prob | Value (position) | Alias (position) |
|------------------|------------------|------------------|
| 0.625            | 0                | 2                |
| 0.208            | 1                | 4                |
| 1.000            | 2                | 0                |
| 0.500            | 3                | 2                |
| 0.250            | 4                | 3                |
As a check, note that, for example,
$$P(X = 2) = \frac{1}{5}\left(1.000 + (1 - 0.625) + (1 - 0.500)\right) = \frac{1.875}{5} = 0.375 = \frac{9}{24} = p_2.$$

Semi-Markov Process

The process is denoted by $\{X(t) : t \ge 0\}$, with a finite state space $\mathcal{S} = \{1, 2, \ldots, m\}$; $X(t)$ represents the state of the random process at any particular time $t$.
The process evolves as follows:
(1) there is a starting state, say $X(0) = i$
(2) the process stays at state $i$ for a random time which follows a distribution $F_i$
(3) it then jumps from state $i$ to another state $j$ with probability $p_{ij}$ for $j \ne i$ (it cannot jump back to the same state)
(4) keep repeating (2)–(3)
Hence, there are two fundamental quantities to describe the continuous time semi-Markov process:
  • holding time distributions $F_1, \ldots, F_m$, one for each state
  • $P = (p_{ij})$, an $m \times m$ matrix where $p_{ij}$ gives the probability that, if the process is in state $i$, the process next moves to state $j$. The matrix $P$ is called the instantaneous probability matrix.
Note that $p_{ii} = 0$ and $\sum_{j=1}^{m} p_{ij} = 1$ for every $i$.
One of the major tasks is to find $P_{ij}(t)$, where
$$P_{ij}(t) = P\left(X(t) = j \mid X(0) = i\right).$$
That is, the probability that we end up in state $j$ at time $t$ if we start in state $i$ at time 0.
If the holding time distributions are exponential, then Chapman–Kolmogorov theory implies that
$$P(t) = e^{Qt}$$
for a matrix $Q$ known as the infinitesimal generator. This matrix exponential can be computed using scipy.linalg.expm(Q*t).
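For instance (a sketch with a made-up 3-state generator; the rates in Q are arbitrary):

```python
import numpy as np
from scipy.linalg import expm

# made-up infinitesimal generator for a 3-state chain:
# off-diagonal entries are jump rates, rows sum to zero
Q = np.array([[-2.0,  1.0,  1.0],
              [ 1.5, -3.0,  1.5],
              [ 0.5,  0.5, -1.0]])

t = 0.75
P_t = expm(Q * t)          # P_t[i, j] = P(X(t) = j | X(0) = i)
print(P_t)
print(P_t.sum(axis=1))     # each row sums to 1
```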
Simulation of a Semi-Markov Process
Fix a state $j$ and suppose $X(0) = i$. To estimate $P_{ij}(t)$, set count = 0 and loop idx from 1 to 1000 (simulation rounds),
  • generate the holding time $T_1 \sim F_i$ (using the inverse CDF or some other method, given the distribution)
  • generate a discrete random variable $X_1$ (the next state) from the probabilities $(p_{i\ell})_{\ell \ne i}$ (using guide values or an alias table)
  • generate $T_2 \sim F_{X_1}$, and
  • generate a discrete random variable $X_2$ (the next state) from the probabilities $(p_{X_1 \ell})_{\ell \ne X_1}$ (using guide values or an alias table)
  • repeat the above process to get $T_3, X_3, T_4, X_4, \ldots$
  • Stop as soon as $T_1 + T_2 + \cdots + T_n > t$
    • set count += 1 if we are at state $j$ when we stop
    • else, do not change count
The estimate of $P_{ij}(t)$ is $\hat{P} = \text{count}/1000$, and the standard error of the estimate is $\sqrt{\hat{P}(1 - \hat{P})/1000}$.
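Putting the steps together, here is a minimal sketch of the estimator; for concreteness it assumes exponential holding times with made-up rates, and the function name and example numbers are ours:

```python
import numpy as np

def semi_markov_estimate(i, j, t, rates, P, n_rounds=1000):
    """
    Estimate P_ij(t) by simulation, assuming exponential holding
    times with the given rates (one per state) and jump probability
    matrix P (zero diagonal, rows summing to 1).
    """
    count = 0
    for _ in range(n_rounds):
        state, clock = i, 0.0
        while True:
            clock += np.random.exponential(1.0 / rates[state])
            if clock > t:      # time t falls inside this holding period
                break
            # next state; np.random.choice stands in for the
            # guide-value / alias-table generators above
            state = np.random.choice(len(rates), p=P[state])
        if state == j:
            count += 1
    p_hat = count / n_rounds
    se = np.sqrt(p_hat * (1 - p_hat) / n_rounds)
    return p_hat, se

# made-up 3-state example
rates = np.array([2.0, 3.0, 1.0])
P = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
print(semi_markov_estimate(0, 2, 0.75, rates, P))
```

Since the holding times here are exponential, the result can be cross-checked against the matrix-exponential formula above, using the generator $Q_{ij} = \lambda_i p_{ij}$ for $j \ne i$ and $Q_{ii} = -\lambda_i$.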
