Multivariate Normal distribution


Story

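The Multivariate Normal distribution generalizes the univariate Normal distribution to vector-valued quantities. Each entry of the vector is marginally Normally distributed, and the entries may covary with one another.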


Example

Finch beaks are measured for beak depth and beak length. The resulting joint distribution of depths and lengths is Normal. In this case, the Normal is bivariate, with mean vector \(\boldsymbol{\mu} = (\mu_\mathrm{d}, \mu_\mathrm{l})\), and the covariance matrix is

\[\begin{align} \mathsf{\Sigma} = \begin{pmatrix}\sigma_\mathrm{d}^2 & \sigma_\mathrm{dl} \\ \sigma_\mathrm{dl} & \sigma_\mathrm{l}^2\end{pmatrix}. \end{align}\]

Parameters

There is one vector-valued parameter, \(\boldsymbol{\mu}\), and a matrix-valued parameter, \(\mathsf{\Sigma}\), which are location and scale parameters respectively. The matrix scale parameter is referred to as a covariance matrix. The covariance matrix is symmetric and strictly positive definite.


Support

The \(K\)-variate Normal distribution is supported on \(\mathbb{R}^K\).


Probability density function

\[\begin{align} f(\mathbf{y};\boldsymbol{\mu}, \mathsf{\Sigma}) = \frac{1}{\sqrt{(2\pi)^K \mathrm{det}\mathsf{\Sigma}}}\,\exp\left[-\frac{1}{2}(\mathbf{y} - \boldsymbol{\mu})^\mathsf{T} \cdot \mathsf{\Sigma}^{-1} \cdot (\mathbf{y} - \boldsymbol{\mu})\right]. \end{align}\]
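The formula above can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name `mvn_pdf` and the parameter values are hypothetical, chosen only for illustration. It evaluates the density term by term and compares the result against `scipy.stats.multivariate_normal.pdf`.

```python
import numpy as np
import scipy.stats


def mvn_pdf(y, mu, Sigma):
    """Evaluate the K-variate Normal PDF at a single point y."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    K = len(mu)
    resid = y - mu

    # Quadratic form (y - mu)^T Sigma^{-1} (y - mu), computed with a
    # linear solve instead of forming the inverse explicitly.
    quad = resid @ np.linalg.solve(Sigma, resid)

    # Normalization constant sqrt((2 pi)^K det(Sigma))
    norm = np.sqrt((2 * np.pi) ** K * np.linalg.det(Sigma))

    return np.exp(-quad / 2) / norm


# Hypothetical parameter values, for illustration only
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.6], [0.6, 1.0]])
y = np.array([0.5, 2.5])

pdf_by_hand = mvn_pdf(y, mu, Sigma)
pdf_scipy = scipy.stats.multivariate_normal.pdf(y, mean=mu, cov=Sigma)
```

For large \(K\), working with the log density and a Cholesky factor is usually more stable numerically than computing the determinant directly; the direct form is used here only because it mirrors the formula.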

Moments

Mean of \(y_i\): \(\mu_i\)

Variance of \(y_i\): \(\Sigma_{ii}\)

Covariance of \(y_i, y_j\) with \(j\ne i\): \(\Sigma_{ij}\)
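These moments can be illustrated by drawing many samples and comparing the sample mean and sample covariance with \(\boldsymbol{\mu}\) and \(\mathsf{\Sigma}\). This is a rough sketch with hypothetical parameter values and an arbitrary seed; agreement is only approximate and improves with the number of samples.

```python
import numpy as np

# Arbitrary seed, used only to make the illustration reproducible
rg = np.random.default_rng(seed=3252)

# Hypothetical parameter values, for illustration only
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.6], [0.6, 1.0]])

# Each row of samples is one draw of the vector y
samples = rg.multivariate_normal(mu, Sigma, size=200_000)

# Sample mean of each entry; should be close to mu
sample_mean = samples.mean(axis=0)

# Sample covariance matrix; diagonal entries estimate the variances
# Sigma_ii and off-diagonal entries estimate the covariances Sigma_ij
sample_cov = np.cov(samples, rowvar=False)
```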


Usage

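The usage below assumes that mu is a length \(K\) array, Sigma is a \(K\times K\) symmetric positive definite matrix, and L is a \(K\times K\) lower-triangular matrix with strictly positive values on the diagonal that is a Cholesky factor of Sigma, so that Sigma equals np.dot(L, L.T). For the NumPy entries, rg is assumed to be a NumPy random generator, for example one created with np.random.default_rng().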

Package          Syntax
NumPy            rg.multivariate_normal(mu, Sigma)
NumPy Cholesky   rg.multivariate_normal(mu, np.dot(L, L.T))
SciPy            scipy.stats.multivariate_normal(mu, Sigma)
SciPy Cholesky   scipy.stats.multivariate_normal(mu, np.dot(L, L.T))
Stan             multi_normal(mu, Sigma)
Stan Cholesky    multi_normal_cholesky(mu, L)
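As a hedged illustration of the NumPy and SciPy calls listed above, the sketch below starts from a Cholesky factor, builds the covariance matrix, and draws samples. The values of `mu` and `L` are hypothetical, and `rg` is assumed to be a generator created with `np.random.default_rng()`.

```python
import numpy as np
import scipy.stats

# rg is assumed to be a NumPy Generator; the seed is arbitrary
rg = np.random.default_rng(seed=0)

# Hypothetical parameter values, for illustration only
mu = np.array([1.0, 2.0])
L = np.array([[1.5, 0.0],
              [0.4, 0.9]])  # lower triangular, positive diagonal

# NumPy and SciPy take the covariance matrix, so rebuild it from L
Sigma = np.dot(L, L.T)

# NumPy: draw five samples; each row is one draw
draws_np = rg.multivariate_normal(mu, Sigma, size=5)

# SciPy: build a frozen distribution, then sample and evaluate it
dist = scipy.stats.multivariate_normal(mu, Sigma)
draws_sp = dist.rvs(size=5, random_state=rg)
logp = dist.logpdf(draws_sp)
```

The frozen SciPy object is convenient when the same distribution is both sampled and evaluated, since the parameters are supplied only once.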



Notes

  • The covariance matrix may also be written as \(\mathsf{\Sigma} = \mathsf{S} \cdot \mathsf{C} \cdot \mathsf{S}\), where \(\mathsf{S} = \sqrt{\mathrm{diag}(\mathsf{\Sigma})}\) is the diagonal matrix of standard deviations \(\sigma_i = \sqrt{\Sigma_{ii}}\), and entry \(i, j\) in the correlation matrix \(\mathsf{C}\) is \(C_{ij} = \Sigma_{ij}/(\sigma_i\sigma_j)\).

  • Because \(\mathsf{\Sigma}\) is symmetric and strictly positive definite, it is uniquely determined by its Cholesky factor, \(\mathsf{L}\), a lower-triangular matrix with strictly positive diagonal entries that satisfies \(\mathsf{\Sigma} = \mathsf{L}\cdot\mathsf{L}^\mathsf{T}\). In practice, you will almost always use the Cholesky representation of the Multivariate Normal distribution in Stan.
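Both decompositions in the notes above can be sketched in a few lines of NumPy. The parameter values and seed are hypothetical. The last step uses the standard fact that if \(\mathbf{z}\) is a vector of independent standard Normal variates, then \(\boldsymbol{\mu} + \mathsf{L}\cdot\mathbf{z}\) is Multivariate Normal with mean \(\boldsymbol{\mu}\) and covariance \(\mathsf{L}\cdot\mathsf{L}^\mathsf{T}\); this is one common way the Cholesky representation is used, not necessarily how any particular package implements it.

```python
import numpy as np

# Hypothetical parameter values, for illustration only
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.6], [0.6, 1.0]])

# Standard deviations sigma_i = sqrt(Sigma_ii) and S = diag(sigma)
sigma = np.sqrt(np.diag(Sigma))
S = np.diag(sigma)

# Correlation matrix with entries C_ij = Sigma_ij / (sigma_i sigma_j)
C = Sigma / np.outer(sigma, sigma)

# Reassemble the covariance matrix as S . C . S
Sigma_from_SCS = S @ C @ S

# Lower-triangular Cholesky factor satisfying Sigma = L . L^T
L = np.linalg.cholesky(Sigma)
Sigma_from_L = L @ L.T

# Sample via the Cholesky factor: y = mu + L . z with z standard Normal.
# Each row of z is one draw, so the product is written as z @ L.T.
rg = np.random.default_rng(seed=1)
z = rg.standard_normal(size=(100_000, 2))
y = mu + z @ L.T
```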