Multinomial distribution¶
Story¶
This is a generalization of the Binomial distribution. Instead of a Bernoulli trial with two outcomes, each trial has \(K\) possible outcomes. The counts of each outcome, \(y_1\) of outcome 1, \(y_2\) of outcome 2, …, and \(y_K\) of outcome \(K\), out of a total of \(N\) independent trials are Multinomially distributed.
Example¶
There are two alleles in a population, A and a. Each individual may have genotype AA, Aa, or aa. The counts of each genotype, \(y_1\) AA individuals, \(y_2\) Aa individuals, and \(y_3\) aa individuals, in a population of \(N\) total individuals are Multinomially distributed.
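As a minimal sketch of this example, the genotype counts can be simulated with NumPy. The genotype probabilities and the population size below are illustrative assumptions, not values from any real population.

```python
import numpy as np

# Hypothetical genotype probabilities for AA, Aa, aa (illustrative values only).
theta = np.array([0.25, 0.5, 0.25])
N = 100  # assumed number of individuals

rng = np.random.default_rng(seed=0)
counts = rng.multinomial(N, theta)  # one draw: array [y_1, y_2, y_3]

y_AA, y_Aa, y_aa = counts
print(f"AA: {y_AA}, Aa: {y_Aa}, aa: {y_aa}, total: {counts.sum()}")
```

Each draw returns one count per genotype, and the counts always add up to \(N\).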
Parameters¶
\(N\), the total number of trials, and \(\boldsymbol{\theta} = \left\{\theta_1, \theta_2, \ldots,\theta_K\right\}\), the probabilities of each outcome. Note that \(\sum_{i=1}^K \theta_i = 1\) and there is the further restriction that \(N = \sum_{i=1}^K y_i\).
Support¶
The \(K\)-nomial distribution is supported on \(\mathbb{N}^K\), subject to the restriction \(\sum_{i=1}^K y_i = N\).
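Probability mass function¶
\[f(\mathbf{y};\boldsymbol{\theta}, N) = \frac{N!}{y_1!\,y_2!\cdots y_K!}\,\theta_1^{y_1}\,\theta_2^{y_2}\cdots\theta_K^{y_K}\]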
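The mass function can be evaluated directly from factorials and powers. The sketch below does so and compares the result with `scipy.stats.multinomial.pmf`. The helper name `multinomial_pmf` and the example values of `y` and `theta` are assumptions made for illustration.

```python
from math import factorial, prod

from scipy import stats


def multinomial_pmf(y, theta):
    """Evaluate N! / (y_1! ... y_K!) * prod_i theta_i**y_i with N = sum(y)."""
    N = sum(y)
    coeff = factorial(N)
    for y_i in y:
        coeff //= factorial(y_i)  # the quotient is an exact integer at every step
    return coeff * prod(t**y_i for t, y_i in zip(theta, y))


y = [2, 1, 1]            # hypothetical counts, so N = 4
theta = [0.5, 0.3, 0.2]  # hypothetical outcome probabilities

p_direct = multinomial_pmf(y, theta)
p_scipy = stats.multinomial.pmf(y, n=sum(y), p=theta)
print(p_direct, p_scipy)
```

For these values the multinomial coefficient is 12 and the product of powers is 0.015, so both numbers should be close to 0.18.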
Moments¶
Mean of \(y_i\): \(N\theta_i\)
Variance of \(y_i\): \(N\theta_i(1-\theta_i)\)
Covariance of \(y_i, y_j\) with \(j\ne i\): \(-N\theta_i\theta_j\)
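The moments can be checked against simulated draws. In the sketch below, the values of `N` and `theta`, the seed, and the number of draws are arbitrary assumptions. Note that the covariance between distinct counts is negative, because the counts must sum to \(N\).

```python
import numpy as np

N = 50
theta = np.array([0.2, 0.3, 0.5])  # hypothetical probabilities

# Analytical moments.
mean = N * theta
cov = -N * np.outer(theta, theta)                # off-diagonal: -N theta_i theta_j
np.fill_diagonal(cov, N * theta * (1 - theta))   # diagonal: N theta_i (1 - theta_i)

# Empirical moments from many draws.
rng = np.random.default_rng(seed=0)
samples = rng.multinomial(N, theta, size=200_000)
sample_mean = samples.mean(axis=0)
sample_cov = np.cov(samples, rowvar=False)

print(np.round(mean, 2), np.round(sample_mean, 2))
print(np.round(cov, 2))
print(np.round(sample_cov, 2))
```

The empirical mean and covariance matrix should agree with the analytical ones up to sampling noise.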
Usage¶
The usage below assumes theta is a length \(K\) array.
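Package          Syntax
NumPy            rng.multinomial(N, theta), with rng = np.random.default_rng()
SciPy            scipy.stats.multinomial(N, theta)
Stan sampling    multinomial(theta)
Stan rng         multinomial_rng(theta, N)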
Notes¶
For a sampling statement in Stan, the value of \(N\) is not passed explicitly; it is implied by the sum of the observed counts.
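A short sketch of the NumPy and SciPy usage follows. The values of `N` and `theta`, the seeds, and the number of draws are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

N = 10
theta = np.array([0.1, 0.6, 0.3])  # length-K array of probabilities summing to 1

# NumPy: draw from a Generator instance.
rng = np.random.default_rng(seed=42)
numpy_draws = rng.multinomial(N, theta, size=5)  # shape (5, K)

# SciPy: a frozen distribution that can both sample and evaluate the pmf.
dist = stats.multinomial(N, theta)
scipy_draws = dist.rvs(size=5, random_state=42)  # shape (5, K)
log_p = dist.logpmf(scipy_draws)                 # one log-probability per draw

print(numpy_draws)
print(scipy_draws)
print(log_p)
```

Every row of either array of draws sums to `N`. The frozen SciPy object is convenient when the same parameters are reused for both sampling and evaluating probabilities.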