Beta distribution


Story

Say you wait for two multistep Poisson processes to arrive. The individual steps of each process happen at the same rate, but the first multistep process requires \(\alpha\) steps and the second requires \(\beta\) steps. The fraction of the total waiting time (the sum of the two waiting times) taken by the first process is Beta distributed.
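The story can be checked by simulation. The following is a minimal sketch, not part of the original text: it assumes integer step counts, draws each step's waiting time from an Exponential distribution (the waiting time between arrivals of a Poisson process), and compares the simulated fraction to SciPy's Beta distribution. The function name `simulate_waiting_fraction` and the values of `rate`, `n_samples`, and `seed` are illustrative choices.

```python
import numpy as np
import scipy.stats


def simulate_waiting_fraction(alpha, beta, rate=1.0, n_samples=100_000, seed=0):
    """Fraction of the total waiting time taken by the first multistep process.

    alpha and beta must be positive integers here, since they count steps.
    """
    rng = np.random.default_rng(seed)

    # Each step of a Poisson process takes an Exponential(rate) waiting time;
    # a multistep process is the sum of its individual step times.
    t1 = rng.exponential(scale=1.0 / rate, size=(n_samples, alpha)).sum(axis=1)
    t2 = rng.exponential(scale=1.0 / rate, size=(n_samples, beta)).sum(axis=1)

    return t1 / (t1 + t2)


alpha, beta = 3, 5
theta = simulate_waiting_fraction(alpha, beta)

# Compare the simulated mean with the mean of the Beta distribution
print(theta.mean(), scipy.stats.beta(alpha, beta).mean())
```

With enough samples, the two printed values should agree closely, and a histogram of `theta` should follow the Beta PDF.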


Parameters

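There are two parameters, both strictly positive: \(\alpha\) and \(\beta\). In the above story they are the step counts of the two processes, but the distribution is defined for any positive real values.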


Support

The Beta distribution has support on the interval [0, 1].


Probability density function

\[\begin{align} f(\theta; \alpha, \beta) = \frac{\theta^{\alpha-1}(1-\theta)^{\beta-1}}{B(\alpha, \beta)}, \end{align}\]

where

\[\begin{align} B(\alpha, \beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha + \beta)} \end{align}\]

is the Beta function.
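As a sanity check on the formula, the PDF can be evaluated directly from its definition and compared with SciPy. This is a sketch under a few assumptions: the helper name `beta_pdf` is hypothetical, the computation is done on a log scale with `scipy.special.gammaln` for numerical stability, and only interior points of the support are evaluated to avoid taking the logarithm of zero.

```python
import numpy as np
import scipy.special
import scipy.stats


def beta_pdf(theta, alpha, beta):
    """Beta PDF computed from its definition, for 0 < theta < 1."""
    # log B(alpha, beta) from the log-gamma function
    log_B = (
        scipy.special.gammaln(alpha)
        + scipy.special.gammaln(beta)
        - scipy.special.gammaln(alpha + beta)
    )

    # log of theta^(alpha-1) * (1-theta)^(beta-1) / B(alpha, beta)
    log_pdf = (alpha - 1) * np.log(theta) + (beta - 1) * np.log1p(-theta) - log_B

    return np.exp(log_pdf)


theta = np.linspace(0.01, 0.99, 99)
pdf_by_hand = beta_pdf(theta, 2.5, 4.0)
pdf_scipy = scipy.stats.beta.pdf(theta, 2.5, 4.0)

print(np.allclose(pdf_by_hand, pdf_scipy))
```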


Moments

Mean: \(\displaystyle{\frac{\alpha}{\alpha + \beta}}\)

Variance: \(\displaystyle{\frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}}\)
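The moment formulas can be compared against the values SciPy reports. This is a small sketch; the parameter values are arbitrary illustrative choices.

```python
import scipy.stats

alpha, beta = 2.0, 6.0

# Moments from the closed-form expressions
mean = alpha / (alpha + beta)
variance = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))

# Moments as computed by SciPy
dist = scipy.stats.beta(alpha, beta)

print(mean, dist.mean())
print(variance, dist.var())
```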


Usage

Package    Syntax
NumPy      rg.beta(alpha, beta)
SciPy      scipy.stats.beta(alpha, beta)
Stan       beta(alpha, beta)
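The NumPy and SciPy syntax above might be used as in the following sketch. It assumes that `rg` is a NumPy random number generator created with `np.random.default_rng()`; the seed, sample sizes, and parameter values are illustrative only.

```python
import numpy as np
import scipy.stats

alpha, beta = 2.0, 5.0

# Assumption: rg is a NumPy Generator instance
rg = np.random.default_rng(seed=3252)

# Draw samples with NumPy
samples_np = rg.beta(alpha, beta, size=1000)

# A "frozen" SciPy distribution gives access to the PDF, CDF, and sampling
dist = scipy.stats.beta(alpha, beta)
samples_sp = dist.rvs(size=1000, random_state=rg)
pdf_vals = dist.pdf([0.1, 0.5, 0.9])
cdf_half = dist.cdf(0.5)

print(samples_np[:5])
print(pdf_vals, cdf_half)
```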



Notes

  • The story of the Beta distribution is difficult to parse. Most importantly, the Beta distribution allows us to put probabilities on unknown probabilities. It is only defined on \(0 \le \theta \le 1\), and \(\theta\) here can be interpreted as a probability, say of success in a Bernoulli trial.

  • The case where \(\alpha = \beta = 0\) is not technically a probability distribution because the PDF cannot be normalized. Nonetheless, it is often used as an improper prior, and this prior is known as a Haldane prior, named after biologist J. B. S. Haldane. The case where \(\alpha = \beta = 1/2\) is sometimes called a Jeffreys prior.

  • The Beta distribution may also be parametrized in terms of the location parameter \(\phi\) and concentration \(\kappa\), which are related to \(\alpha\) and \(\beta\) as

\[\begin{split}\begin{align} &\phi = \frac{\alpha}{\alpha + \beta}, \\ &\kappa = \alpha + \beta. \end{align}\end{split}\]

The location parameter \(\phi\) is the mean of the distribution, and \(\kappa\) measures how concentrated the distribution is about that mean: the larger \(\kappa\), the narrower the distribution. To convert back to an \((\alpha, \beta)\) parametrization from a \((\phi, \kappa)\) parametrization, use

\[\begin{split}\begin{align} &\alpha = \phi \kappa, \\ &\beta = (1-\phi)\kappa. \end{align}\end{split}\]
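The two conversions above can be written as a pair of small helper functions. This is a sketch only; the function names are hypothetical and do not come from any library.

```python
def phi_kappa_to_alpha_beta(phi, kappa):
    """Convert (location, concentration) to (alpha, beta)."""
    return phi * kappa, (1 - phi) * kappa


def alpha_beta_to_phi_kappa(alpha, beta):
    """Convert (alpha, beta) to (location, concentration)."""
    return alpha / (alpha + beta), alpha + beta


# Round trip: start from (phi, kappa), convert, and convert back
alpha, beta = phi_kappa_to_alpha_beta(phi=0.25, kappa=8.0)
phi, kappa = alpha_beta_to_phi_kappa(alpha, beta)

print(alpha, beta)
print(phi, kappa)
```

The resulting `alpha` and `beta` can then be passed to any of the functions that expect the \((\alpha, \beta)\) parametrization.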

PDF and CDF plots