Random Number Generator
Reference book: The Art of Computer Systems Performance Analysis [pdf]
1. Generators
1.1. Linear congruential generators (LCG)
An LCG uses the recurrence relation:
\begin{align*} x_n = (a x_{n-1} + b) \mod m \end{align*}
- Mixed LCGs: \(b \neq 0\)
- Multiplicative LCGs: \(b = 0\)
- Faster
If the modulus \(m\) is a power of two, the modulo operation is just bit truncation and is very fast.
However, the period is shorter, and the low-order bits are not random.
The major implementation challenge is integer overflow during the multiplication \(a x_{n-1}\). Schrage's method is a technique to perform this calculation without overflow.
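A minimal sketch of a multiplicative LCG using Schrage's method; the constants are the well-known "minimal standard" generator (\(a = 16807\), \(m = 2^{31} - 1\)), chosen here purely for illustration. (Python integers never overflow, so this only demonstrates the technique.)

```python
A = 16807
M = 2**31 - 1          # Mersenne prime, the "minimal standard" modulus
Q = M // A             # 127773; Schrage writes M = A*Q + R
R = M % A              # 2836; the method requires R < Q, which holds here

def schrage_next(x):
    """Compute (A * x) % M without the intermediate product A*x exceeding M."""
    result = A * (x % Q) - R * (x // Q)
    return result if result > 0 else result + M

def lcg_uniforms(seed, n):
    """Return n uniforms in (0, 1) from the multiplicative LCG."""
    x = seed
    out = []
    for _ in range(n):
        x = schrage_next(x)
        out.append(x / M)
    return out
```

A standard sanity check for these constants: starting from seed 1, the 10,000th value of the raw sequence is 1043618065.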
1.2. Tausworthe Generators
These operate on binary digits (bits) using exclusive-OR (XOR) operations. They are useful for cryptographic applications and hardware implementations (Linear-Feedback Shift Registers). While they have good statistical properties generally, they may fail specific tests like "runs up and down".
1.3. Extended Fibonacci Generators
These use a recursive sum of two previous numbers, e.g., \(x_n = (x_{n-1} + x_{n-2}) \mod m\). While this basic version has poor randomness, combining older values (e.g., the 5th and 17th most recent) can result in very long periods and good statistical properties.
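A sketch of an additive lagged Fibonacci generator with lags (5, 17), i.e. \(x_n = (x_{n-5} + x_{n-17}) \mod m\), matching the example above. Seeding the initial state from Python's `random` module is an arbitrary choice for illustration.

```python
import random
from collections import deque

def lagged_fib(seed, n, m=2**32, short=5, long=17):
    """Return n uniforms from x_n = (x_{n-short} + x_{n-long}) mod m."""
    init = random.Random(seed)
    # Keep exactly the last `long` values; deque drops the oldest on append.
    state = deque((init.randrange(m) for _ in range(long)), maxlen=long)
    out = []
    for _ in range(n):
        x = (state[-short] + state[-long]) % m
        state.append(x)
        out.append(x / m)
    return out
```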
1.4. Combined Generators
Because single generators often have weaknesses, analysts combine them (by adding or XOR-ing their results) to increase the period and randomness.
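One simple combination scheme is XOR-ing the outputs of two different generators; the sketch below combines two LCGs. The two parameter sets are illustrative choices (glibc-style and VAX-style constants), not a recommended production pairing.

```python
def lcg(a, c, m, x):
    """Infinite stream of raw LCG values x_n = (a*x_{n-1} + c) mod m."""
    while True:
        x = (a * x + c) % m
        yield x

def combined(seed1, seed2, n, m=2**31):
    """Combine two LCG streams by XOR-ing their raw outputs."""
    g1 = lcg(1103515245, 12345, m, seed1)   # glibc-style constants
    g2 = lcg(69069, 1, m, seed2)            # VAX MTH$RANDOM-style constants
    return [((next(g1) ^ next(g2)) % m) / m for _ in range(n)]
```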
2. Seed selection
- Don't use zero (a multiplicative LCG seeded with zero stays stuck at zero)
- Don't subdivide one stream if you need multiple streams, because this causes correlation.
E.g., if you need one stream for arrival times and another for service times, don't alternate \(u_1, u_2, \dots\) from the same stream; instead choose seeds that are far apart (e.g., a million values apart).
- Don't use random seed (eg system clock) because then you can't reproduce your simulation.
3. Testing Random Number Generators
Passing tests is necessary but not sufficient: a generator that fails is flawed, but one that passes may still fail other tests.
3.1. Test of Uniformity
- Chi-square test
- Group numbers into cells
- Compare observed frequency vs expected frequency
- Cons: The result depends on the chosen cell size
- Kolmogorov-Smirnov (K-S) Test
- Plot observed CDF vs expected CDF
- Compute the maximum vertical deviations \(K^+\) (observed above expected) and \(K^-\) (observed below expected)
- Pros: No grouping like in Chi-square test
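Both tests can be sketched in a few lines of Python. Critical-value lookup is omitted, and the cell count \(k = 10\) is an arbitrary illustrative choice.

```python
def chi_square_statistic(samples, k=10):
    """Chi-square statistic for uniformity over k equal-width cells.

    Compared against a chi-square quantile with k-1 degrees of freedom
    (e.g., for k=10, df=9, the 5% critical value is about 16.9).
    """
    n = len(samples)
    counts = [0] * k
    for u in samples:
        counts[min(int(u * k), k - 1)] += 1   # clamp u == 1.0 into last cell
    expected = n / k
    return sum((c - expected) ** 2 / expected for c in counts)

def ks_statistics(samples):
    """K+ and K-: largest vertical deviations of the ECDF from F(x) = x."""
    xs = sorted(samples)
    n = len(xs)
    k_plus = max((i + 1) / n - x for i, x in enumerate(xs))   # ECDF above F
    k_minus = max(x - i / n for i, x in enumerate(xs))        # ECDF below F
    return k_plus, k_minus
```

As a quick check, a perfectly even sample like `[i/100 for i in range(100)]` gives a chi-square statistic of exactly zero.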
3.2. Test for independence
- Serial-correlation test
- Check autocovariance at lag \(k\)
- It should be near zero for all lags
- Two-level test
- Apply a uniformity test to many short segments of the sequence, then test whether the resulting statistics follow their expected distribution; this catches local nonrandomness that a single global test misses.
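A sketch of the serial-correlation idea: estimate the autocovariance of the sequence at lag \(k\), which should be close to zero for independent uniforms.

```python
def autocovariance(samples, k):
    """Sample autocovariance at lag k; near zero for independent samples."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((samples[i] - mean) * (samples[i + k] - mean)
               for i in range(n - k)) / (n - k)
```

For example, the maximally anti-correlated sequence 0, 1, 0, 1, … has a lag-1 autocovariance of exactly \(-0.25\), while a good generator's output should give values near zero at every lag.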
3.3. k-Dimensional Uniformity
Uniformity in one dimension does not imply uniformity of pairs in 2D, triples in 3D, and so on.
- Visual Check
- Plot and see
- Serial test
- Extends Chi-square test to higher dimension.
- The groups are now hypercubes
- Spectral test
- Checks the maximum distance between adjacent hyperplanes on which successive \(k\)-tuples from an LCG fall; larger distances mean worse \(k\)-dimensional uniformity
4. Random Variate Generation
We can take uniform random numbers and transform them into specific distributions.
4.1. Inverse Transformation Method
If we can generate samples from the uniform distribution, then to sample from a density \(f\) we just transform those samples by the inverse of its CDF \(F\).
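For example, the exponential distribution with rate \(\lambda\) has \(F(x) = 1 - e^{-\lambda x}\), so \(F^{-1}(u) = -\ln(1-u)/\lambda\). A minimal sketch:

```python
import math
import random

def exponential_variate(rng, lam):
    """Exponential(lam) variate by inverse transform: X = F^{-1}(U)."""
    u = rng.random()                  # U ~ Uniform(0, 1)
    return -math.log(1.0 - u) / lam   # invert F(x) = 1 - exp(-lam * x)

rng = random.Random(42)
samples = [exponential_variate(rng, lam=2.0) for _ in range(100000)]
# The sample mean should be close to 1/lam = 0.5.
```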
\begin{align*} X = F^{-1}(U) \end{align*}
4.2. Rejection Sampling
To sample from \(f\), first sample \(x\) from a proposal density \(g\), then accept it with probability \(f(x)/(c\,g(x))\). To make this probability lie between 0 and 1, \(c\) is chosen so that \(f(x)/g(x) \leq c\) for all \(x\).
To visualize this, take \(g(x)\) to be a uniform distribution: candidate points are scattered uniformly over a box enclosing \(f\), and only those falling under the curve are kept.
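A sketch with a uniform proposal \(g\) on \([0, 1]\) and the target density \(f(x) = 2x\), chosen only for illustration. Here \(f(x)/g(x) \leq 2\), so \(c = 2\) and the acceptance probability \(f(x)/(c\,g(x))\) simplifies to \(x\).

```python
import random

def rejection_sample(rng):
    """Draw from f(x) = 2x on [0, 1] by rejection from g = Uniform(0, 1)."""
    while True:
        x = rng.random()           # candidate from the proposal g
        if rng.random() <= x:      # accept with probability f(x) / (c g(x)) = x
            return x

rng = random.Random(7)
samples = [rejection_sample(rng) for _ in range(50000)]
# E[X] under f(x) = 2x is 2/3, so the sample mean should be near 0.667.
```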
4.3. Composition
For mixture distributions
- Generate a random number to select one of the simpler distributions based on their weights (probabilities).
- Generate the final variate using the selected distribution
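The two steps above can be sketched for a two-component mixture of exponentials (a hyperexponential); the weights (0.3, 0.7) and rates (1, 5) are arbitrary illustrative choices.

```python
import math
import random

def mixture_variate(rng, weights=(0.3, 0.7), rates=(1.0, 5.0)):
    """Hyperexponential variate by composition."""
    # Step 1: pick a component according to the weights.
    lam = rates[0] if rng.random() < weights[0] else rates[1]
    # Step 2: sample from the chosen exponential (inverse transform).
    return -math.log(1.0 - rng.random()) / lam

rng = random.Random(11)
samples = [mixture_variate(rng) for _ in range(100000)]
# Mean should be near 0.3 * (1/1) + 0.7 * (1/5) = 0.44.
```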
4.4. Convolution
If the desired distribution can be expressed as the sum of multiple independent random variables, generate those RVs and sum them.
- Erlang-k: Sum of \(k\) exponentials
- Binomial: Sum of \(n\) Bernoulli trials
- Normal: Sum of a large number of uniform random numbers (by the central limit theorem)
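The Erlang-\(k\) case can be sketched directly: sum \(k\) independent Exp(\(\lambda\)) variates, each drawn by inverse transform.

```python
import math
import random

def erlang_variate(rng, k, lam):
    """Erlang-k variate as the sum of k independent Exp(lam) variates."""
    return sum(-math.log(1.0 - rng.random()) / lam for _ in range(k))

rng = random.Random(3)
samples = [erlang_variate(rng, k=3, lam=2.0) for _ in range(100000)]
# Mean should be near k / lam = 1.5.
```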