Central Limit Theorem

Level 1 - Math II (Physics) topic page in Probability.

Principle

The central limit theorem says that the sum or average of a large number of independent, identically distributed random variables with finite mean and variance is approximately normally distributed. The individual observations do not need to be normally distributed.

Notation

\(X_1,X_2,\ldots,X_n\)

independent identically distributed random variables

\(\mu\)

common mean \(E[X_j]\)

\(\sigma^2\)

common variance \(\operatorname{Var}(X_j)\)

\(\overline{X}\)

sample mean random variable

\(n\)

number of observations

\(Z\)

standard normal variable used for approximation

Method

Step 1: Define the average

The sample mean is the random variable formed by adding the observations and dividing by the sample size.

Sample mean

\[\overline{X}=\frac{1}{n}\sum_{j=1}^{n}X_j\]

Normal approximation

\[\overline{X}\approx N\left(\mu,\frac{\sigma^2}{n}\right)\]
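The approximation can be checked numerically. The following is a minimal sketch, not part of the course material: it assumes Exponential(1) observations (so \(\mu=1\) and \(\sigma^2=1\)), and the seed, the sample size n = 40, and the number of trials are illustrative choices.

```python
import random
import statistics

def sample_mean(n, rng):
    """Average of n independent Exponential(1) draws (mu = 1, sigma^2 = 1)."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n

rng = random.Random(0)   # fixed seed, chosen only for reproducibility
n = 40                   # observations per average (illustrative choice)
trials = 20_000          # number of simulated averages (illustrative choice)

# Each entry is one realisation of the sample mean random variable.
means = [sample_mean(n, rng) for _ in range(trials)]

mean_of_means = statistics.fmean(means)
var_of_means = statistics.pvariance(means)

print(f"mean of sample means:     {mean_of_means:.4f} (theory: mu = 1)")
print(f"variance of sample means: {var_of_means:.4f} (theory: sigma^2/n = {1 / n:.4f})")
```

The simulated mean and variance should land close to \(\mu\) and \(\sigma^2/n\), even though a single exponential observation is strongly skewed.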

Step 2: Standardise the sample mean

The standard deviation of \(\overline{X}\) is \(\sigma/\sqrt n\). Subtract the mean and divide by this standard deviation.

Exact mean

\[E[\overline{X}]=\mu\]

Exact variance

\[\operatorname{Var}(\overline{X})=\frac{\sigma^2}{n}\]

Exact standard deviation

\[\operatorname{SD}(\overline{X})=\frac{\sigma}{\sqrt n}\]
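These exact results follow in one line each. The mean uses linearity of expectation; the variance uses independence, which lets the variance of the sum split into a sum of variances:

\[E[\overline{X}]=\frac{1}{n}\sum_{j=1}^{n}E[X_j]=\frac{1}{n}\cdot n\mu=\mu\]

\[\operatorname{Var}(\overline{X})=\frac{1}{n^2}\sum_{j=1}^{n}\operatorname{Var}(X_j)=\frac{n\sigma^2}{n^2}=\frac{\sigma^2}{n}\]

Taking the square root of the variance gives \(\operatorname{SD}(\overline{X})=\sigma/\sqrt n\).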

Standardised average

\[\frac{\overline{X}-\mu}{\sigma/\sqrt n}\approx Z\]
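As a sketch of how the standardised average is used, the code below approximates \(P(\overline{X}\le x)\) by the standard normal distribution function \(\Phi\). The helper names and the numbers (\(\mu=10\), \(\sigma=2\), \(n=100\), \(x=10.3\)) are hypothetical and chosen only for illustration.

```python
import math

def standard_normal_cdf(z):
    """Phi(z), the standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def clt_prob_mean_at_most(x, mu, sigma, n):
    """Approximate P(sample mean <= x) by standardising and applying Phi."""
    z = (x - mu) / (sigma / math.sqrt(n))   # subtract the mean, divide by sigma/sqrt(n)
    return standard_normal_cdf(z)

# Hypothetical numbers: sigma/sqrt(n) = 2/10 = 0.2, so z = 0.3/0.2 = 1.5.
p = clt_prob_mean_at_most(10.3, mu=10.0, sigma=2.0, n=100)
print(f"P(sample mean <= 10.3) is approximately {p:.4f}")
```

Here the standardised value is 1.5, and \(\Phi(1.5)\approx 0.933\).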

Rules

Sample mean

\[\overline{X}=\frac{1}{n}\sum_{j=1}^{n}X_j\]

CLT for average

\[\overline{X}\approx N\left(\mu,\frac{\sigma^2}{n}\right)\]

CLT standardisation

\[\frac{\overline{X}-\mu}{\sigma/\sqrt n}\approx N(0,1)\]

CLT for sum

\[\sum_{j=1}^{n}X_j\approx N(n\mu,n\sigma^2)\]

Examples

Question

A simulation adds 50 independent uniform random numbers. What shape should the sum have approximately?

Answer

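The individual terms are uniform, but the sum of many independent identically distributed terms is approximately normal by the central limit theorem. Here the sum is approximately \(N(50\mu,50\sigma^2)\), where \(\mu\) and \(\sigma^2\) are the mean and variance of a single uniform term.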
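The question does not state the interval of the uniform numbers. As a sketch, assume they are uniform on \([0,1)\), so that each term has mean \(1/2\) and variance \(1/12\); the seed and the number of trials are also illustrative choices.

```python
import random
import statistics

rng = random.Random(1)   # fixed seed, chosen only for reproducibility
n_terms = 50             # number of uniform terms in each sum
trials = 20_000          # number of simulated sums (illustrative choice)

# Assumption: each term is uniform on [0, 1).
sums = [sum(rng.random() for _ in range(n_terms)) for _ in range(trials)]

theory_mean = n_terms * 0.5     # n * mu, with mu = 1/2
theory_var = n_terms / 12.0     # n * sigma^2, with sigma^2 = 1/12
theory_sd = theory_var ** 0.5

# A normal variable lies within one standard deviation of its mean
# with probability about 0.683.
within_one_sd = sum(abs(s - theory_mean) <= theory_sd for s in sums) / trials

print(f"mean of sums:     {statistics.fmean(sums):.3f} (theory: {theory_mean:.3f})")
print(f"variance of sums: {statistics.pvariance(sums):.3f} (theory: {theory_var:.3f})")
print(f"fraction within one SD: {within_one_sd:.3f} (normal: about 0.683)")
```

If the sum is close to normal, the fraction within one standard deviation should be near 0.683, which a single uniform variable does not satisfy (its value is about 0.577).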

Checks

The central limit theorem concerns sums and averages, not each individual observation becoming normal.
In this course, the theorem is stated for variables that are independent and identically distributed.
The average has variance \(\sigma^2/n\), so its spread shrinks as \(n\) grows.
Standardise \(\overline{X}\) using \(\sigma/\sqrt n\).
Larger \(n\) usually improves the normal approximation.
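The last check can be illustrated by simulation. The sketch below assumes Exponential(1) observations and measures the quality of the approximation by the largest gap between the empirical distribution function of the standardised average and \(\Phi\); this choice of measure, the values of \(n\), the seed, and the number of trials are all assumptions made for illustration.

```python
import math
import random

def standard_normal_cdf(z):
    """Phi(z), the standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def max_cdf_gap(n, trials, rng):
    """Largest gap between the empirical CDF of standardised means and Phi."""
    # Exponential(1) has mu = 1 and sigma = 1, so the standardised
    # average is (mean - 1) / (1 / sqrt(n)) = (mean - 1) * sqrt(n).
    zs = sorted(
        (sum(rng.expovariate(1.0) for _ in range(n)) / n - 1.0) * math.sqrt(n)
        for _ in range(trials)
    )
    gap = 0.0
    for i, z in enumerate(zs):
        phi = standard_normal_cdf(z)
        # The empirical CDF jumps from i/trials to (i + 1)/trials at z.
        gap = max(gap, abs(phi - i / trials), abs((i + 1) / trials - phi))
    return gap

rng = random.Random(2)   # fixed seed, chosen only for reproducibility
gaps = {n: max_cdf_gap(n, trials=5_000, rng=rng) for n in (1, 5, 30)}
for n, gap in gaps.items():
    print(f"n = {n:>2}: max CDF gap approx {gap:.3f}")
```

For a skewed distribution such as the exponential, the gap is large at \(n=1\) and should shrink noticeably by \(n=30\), apart from a small amount of simulation noise.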
