Random Number Generator
Reference book: The Art of Computer Systems Performance Analysis [pdf]
1. Generators
1.1. Linear congruential generators (LCG)
An LCG uses the recurrence relation:
\begin{align*} x_n = (a x_{n-1} + b) \mod m \end{align*}
- Mixed LCGs: \(b \neq 0\)
- Multiplicative LCGs: \(b = 0\)
- Faster
If the modulus \(m\) is a power of two, the modulo operation is just bit truncation and is very fast.
However, the period is shorter, and the low-order bits are not random.
The major implementation challenge is integer overflow during the multiplication \(a x_{n-1}\). Schrage's method is a technique to perform this calculation without overflow.
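A minimal sketch of a multiplicative LCG using Schrage's method; the constants are the well-known "minimal standard" generator (\(a = 16807\), \(m = 2^{31} - 1\)), chosen here purely for illustration. (Python integers never overflow, so this only demonstrates the technique.)

```python
A = 16807
M = 2**31 - 1          # Mersenne prime, the "minimal standard" modulus
Q = M // A             # 127773; Schrage writes M = A*Q + R
R = M % A              # 2836; the method requires R < Q, which holds here

def schrage_next(x):
    """Compute (A * x) % M without the intermediate product A*x exceeding M."""
    result = A * (x % Q) - R * (x // Q)
    return result if result > 0 else result + M

def lcg_uniforms(seed, n):
    """Return n uniforms in (0, 1) from the multiplicative LCG."""
    x = seed
    out = []
    for _ in range(n):
        x = schrage_next(x)
        out.append(x / M)
    return out
```

A standard sanity check for these constants: starting from seed 1, the 10,000th value of the raw sequence is 1043618065.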
1.2. Tausworthe Generators
These operate on binary digits (bits) using exclusive-OR (XOR) operations. They are useful for cryptographic applications and hardware implementations (Linear-Feedback Shift Registers). While they have good statistical properties generally, they may fail specific tests like "runs up and down".
1.3. Extended Fibonacci Generators
These use a recursive sum of two previous numbers, e.g., \(x_n = (x_{n-1} + x_{n-2}) \mod m\). While this basic version has poor randomness, combining older values (e.g., the 5th and 17th most recent) can result in very long periods and good statistical properties.
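A sketch of an additive lagged Fibonacci generator with lags (5, 17), i.e. \(x_n = (x_{n-5} + x_{n-17}) \mod m\), matching the example above. Seeding the initial state from Python's `random` module is an arbitrary choice for illustration.

```python
import random
from collections import deque

def lagged_fib(seed, n, m=2**32, short=5, long=17):
    """Return n uniforms from x_n = (x_{n-short} + x_{n-long}) mod m."""
    init = random.Random(seed)
    # Keep exactly the last `long` values; deque drops the oldest on append.
    state = deque((init.randrange(m) for _ in range(long)), maxlen=long)
    out = []
    for _ in range(n):
        x = (state[-short] + state[-long]) % m
        state.append(x)
        out.append(x / m)
    return out
```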
1.4. Combined Generators
Because single generators often have weaknesses, analysts combine them (by adding or XOR-ing their results) to increase the period and randomness.
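One simple combination scheme is XOR-ing the outputs of two different generators; the sketch below combines two LCGs. The two parameter sets are illustrative choices (glibc-style and VAX-style constants), not a recommended production pairing.

```python
def lcg(a, c, m, x):
    """Infinite stream of raw LCG values x_n = (a*x_{n-1} + c) mod m."""
    while True:
        x = (a * x + c) % m
        yield x

def combined(seed1, seed2, n, m=2**31):
    """Combine two LCG streams by XOR-ing their raw outputs."""
    g1 = lcg(1103515245, 12345, m, seed1)   # glibc-style constants
    g2 = lcg(69069, 1, m, seed2)            # VAX MTH$RANDOM-style constants
    return [((next(g1) ^ next(g2)) % m) / m for _ in range(n)]
```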
2. Seed selection
- Don't use zero (a multiplicative LCG seeded with zero stays stuck at zero)
- Don't subdivide one stream if you need multiple streams, because this causes correlation.
E.g., if you need one stream for arrival times and another for service times, don't alternate \(u_1, u_2, \dots\) from the same stream; instead choose seeds that are far apart (e.g., a million values apart).
- Don't use random seed (eg system clock) because then you can't reproduce your simulation.
3. Testing Random Number Generators
Passing tests is necessary but not sufficient: a generator that fails is flawed, but one that passes may still fail other tests.
3.1. Test of Uniformity
- Chi-square test
- Group numbers into cells
- Compare observed frequency vs expected frequency
- Cons: The result depends on the chosen cell size
- Kolmogorov-Smirnov (K-S) Test
- Plot observed CDF vs expected CDF
- Compute the maximum vertical deviations \(K^+\) (observed above expected) and \(K^-\) (observed below expected)
- Pros: No grouping like in Chi-square test
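Both tests can be sketched in a few lines of Python. Critical-value lookup is omitted, and the cell count \(k = 10\) is an arbitrary illustrative choice.

```python
def chi_square_statistic(samples, k=10):
    """Chi-square statistic for uniformity over k equal-width cells.

    Compared against a chi-square quantile with k-1 degrees of freedom
    (e.g., for k=10, df=9, the 5% critical value is about 16.9).
    """
    n = len(samples)
    counts = [0] * k
    for u in samples:
        counts[min(int(u * k), k - 1)] += 1   # clamp u == 1.0 into last cell
    expected = n / k
    return sum((c - expected) ** 2 / expected for c in counts)

def ks_statistics(samples):
    """K+ and K-: largest vertical deviations of the ECDF from F(x) = x."""
    xs = sorted(samples)
    n = len(xs)
    k_plus = max((i + 1) / n - x for i, x in enumerate(xs))   # ECDF above F
    k_minus = max(x - i / n for i, x in enumerate(xs))        # ECDF below F
    return k_plus, k_minus
```

As a quick check, a perfectly even sample like `[i/100 for i in range(100)]` gives a chi-square statistic of exactly zero.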
3.2. Test for independence
- Serial-correlation test
- Check autocovariance at lag \(k\)
- It should be near zero for all lags
- Two-level test
- Apply a uniformity test to many short segments of the sequence, then test whether the resulting statistics follow their expected distribution; this catches local nonrandomness that a single global test misses.
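A sketch of the serial-correlation idea: estimate the autocovariance of the sequence at lag \(k\), which should be close to zero for independent uniforms.

```python
def autocovariance(samples, k):
    """Sample autocovariance at lag k; near zero for independent samples."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((samples[i] - mean) * (samples[i + k] - mean)
               for i in range(n - k)) / (n - k)
```

For example, the maximally anti-correlated sequence 0, 1, 0, 1, … has a lag-1 autocovariance of exactly \(-0.25\), while a good generator's output should give values near zero at every lag.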
3.3. k-Dimensional Uniformity
Uniformity in one dimension does not imply uniformity of pairs in 2D, triples in 3D, and so on.
- Visual Check
- Plot and see
- Serial test
- Extends Chi-square test to higher dimension.
- The groups are now hypercubes
- Spectral test
- Checks the maximum distance between adjacent hyperplanes on which successive \(k\)-tuples from an LCG fall; larger distances mean worse \(k\)-dimensional uniformity
4. Random Variate Generation
We can take uniform random numbers and transform them into specific distributions.
4.1. Inverse Transformation Method
If we can generate samples from the uniform distribution, then to sample from a density \(f\) we just transform those samples by the inverse of its CDF \(F\).
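For example, the exponential distribution with rate \(\lambda\) has \(F(x) = 1 - e^{-\lambda x}\), so \(F^{-1}(u) = -\ln(1-u)/\lambda\). A minimal sketch:

```python
import math
import random

def exponential_variate(rng, lam):
    """Exponential(lam) variate by inverse transform: X = F^{-1}(U)."""
    u = rng.random()                  # U ~ Uniform(0, 1)
    return -math.log(1.0 - u) / lam   # invert F(x) = 1 - exp(-lam * x)

rng = random.Random(42)
samples = [exponential_variate(rng, lam=2.0) for _ in range(100000)]
# The sample mean should be close to 1/lam = 0.5.
```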
\begin{align*} X = F^{-1}(U) \end{align*}
4.2. Rejection Sampling
To sample from \(f\), first sample \(x\) from a proposal density \(g\), then accept it with probability \(f(x)/(c\,g(x))\). To make this probability lie between 0 and 1, \(c\) is chosen so that \(f(x)/g(x) \leq c\) for all \(x\).
To visualize this, take \(g(x)\) to be a uniform distribution: candidate points are scattered uniformly over a box enclosing \(f\), and only those falling under the curve are kept.
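A sketch with a uniform proposal \(g\) on \([0, 1]\) and the target density \(f(x) = 2x\), chosen only for illustration. Here \(f(x)/g(x) \leq 2\), so \(c = 2\) and the acceptance probability \(f(x)/(c\,g(x))\) simplifies to \(x\).

```python
import random

def rejection_sample(rng):
    """Draw from f(x) = 2x on [0, 1] by rejection from g = Uniform(0, 1)."""
    while True:
        x = rng.random()           # candidate from the proposal g
        if rng.random() <= x:      # accept with probability f(x) / (c g(x)) = x
            return x

rng = random.Random(7)
samples = [rejection_sample(rng) for _ in range(50000)]
# E[X] under f(x) = 2x is 2/3, so the sample mean should be near 0.667.
```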
4.3. Composition
For mixture distributions
- Generate a random number to select one of the simpler distributions based on their weights (probabilities).
- Generate the final variate using the selected distribution
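The two steps above can be sketched for a two-component mixture of exponentials (a hyperexponential); the weights (0.3, 0.7) and rates (1, 5) are arbitrary illustrative choices.

```python
import math
import random

def mixture_variate(rng, weights=(0.3, 0.7), rates=(1.0, 5.0)):
    """Hyperexponential variate by composition."""
    # Step 1: pick a component according to the weights.
    lam = rates[0] if rng.random() < weights[0] else rates[1]
    # Step 2: sample from the chosen exponential (inverse transform).
    return -math.log(1.0 - rng.random()) / lam

rng = random.Random(11)
samples = [mixture_variate(rng) for _ in range(100000)]
# Mean should be near 0.3 * (1/1) + 0.7 * (1/5) = 0.44.
```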
4.4. Convolution
If the desired distribution can be expressed as the sum of multiple independent random variables, generate those RVs and sum them.
- Erlang-k: Sum of \(k\) exponentials
- Binomial: Sum of \(n\) Bernoulli trials
- Normal: Sum of a large number of uniform random numbers (by the central limit theorem)
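The Erlang-\(k\) case can be sketched directly: sum \(k\) independent Exp(\(\lambda\)) variates, each drawn by inverse transform.

```python
import math
import random

def erlang_variate(rng, k, lam):
    """Erlang-k variate as the sum of k independent Exp(lam) variates."""
    return sum(-math.log(1.0 - rng.random()) / lam for _ in range(k))

rng = random.Random(3)
samples = [erlang_variate(rng, k=3, lam=2.0) for _ in range(100000)]
# Mean should be near k / lam = 1.5.
```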