20120113

Equivalence of Statistics on a Pair of Gaussian Channels

For a pair of Gaussian channels (continuous random variables whose values follow a normal distribution), the mutual information, correlation, root mean squared error, and signal-to-noise ratio are all equivalent and can be computed from each other. Without loss of generality, we restrict this discussion to zero-mean, unit-variance channels. This discussion elaborates on the treatment of mutual information between Gaussian channels presented in the third chapter of Spikes.

Correlation & Mutual Information

Consider a single Gaussian channel $y = g x + n$, where $x$ is the input, $y$ is the output, $g$ is the gain, and $n$ is additive Gaussian noise. Without loss of generality, assume that $x$, $n$, and $y$ have been converted to z-scores, so that all random variables have zero mean and unit variance. Reconstructed z-scores can always be mapped back to the original Gaussian variables by multiplying by the original standard deviations and adding back the original means. With this normalization, we need separate gains for the signal and noise, say $a$ and $b$:
\[y = a x + b n\]
Since the signal and noise are independent, their variances add:
\[\sigma^2_{y} = \sigma^2_{a x} + \sigma^2_{b n}\] and the gain parameters can be factored out
\[\sigma^2_{y} = a^2 \sigma^2_{x} + b^2 \sigma^2_{n}.\]
Since $\sigma^2_{y}=\sigma^2_{x}=\sigma^2_{n}=1$,
\[a^2+b^2=1\]
This can be parameterized as
\[\sigma^2_{y} = \alpha \sigma^2_{x} + (1-\alpha) \sigma^2_{n},\,\,\alpha=a^2\in[0,1]\]
and
\[y = x\sqrt{\alpha} + n\sqrt{1-\alpha}\]
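As a sanity check, here is a minimal simulation of this unit-variance channel (a sketch, not from the original post; the value of $\alpha$ is arbitrary), confirming that $y$ comes out with unit variance:

```python
# Minimal sketch: simulate y = sqrt(alpha)*x + sqrt(1-alpha)*n with z-scored
# signal and noise, and confirm that y has (approximately) unit variance.
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.75                 # illustrative signal-power fraction, alpha = a^2
N = 100_000

x = rng.standard_normal(N)   # z-scored signal
n = rng.standard_normal(N)   # z-scored noise, independent of x
y = np.sqrt(alpha) * x + np.sqrt(1 - alpha) * n

print(np.var(y))             # ~1.0, because alpha + (1 - alpha) = 1
```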

The relationship between mutual information $I$ and the signal-to-noise ratio $SNR$ comes from chapter 3 of Spikes.
\[I=\frac{1}{2}lg(1+\frac{\sigma^2_{a x}}{\sigma^2_{b n}})=\frac{1}{2}lg(1+SNR)\]
where $lg(\dots)$ is the base-2 logarithm.
The $SNR$ simplifies as:
\[SNR=\frac{\sigma^2_{a x}}{\sigma^2_{b n}}=\frac{\alpha \sigma^2_x}{(1-\alpha) \sigma^2_n}=\frac{\alpha}{1-\alpha}\]
Mutual information simplifies as:
\[I=\frac{1}{2}lg(1+SNR)=\frac{1}{2}lg{\frac{\sigma^2_y}{\sigma^2_{b n}}}=\frac{1}{2}lg{\frac{\sigma^2_y}{(1-\alpha)\sigma^2_n}}=\frac{1}{2}lg{\frac{1}{1-\alpha}}\]
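In code, these two formulas are one-liners (a sketch; the $\alpha$ below is the same illustrative value as in the snippet above):

```python
# SNR and mutual information (in bits) as functions of alpha, per the formulas above.
import numpy as np

def snr(alpha):
    return alpha / (1.0 - alpha)

def info_bits(alpha):
    return 0.5 * np.log2(1.0 + snr(alpha))   # equals -0.5 * log2(1 - alpha)

alpha = 0.75
print(snr(alpha))        # 3.0
print(info_bits(alpha))  # 1.0 bit, since 1 - alpha = 1/4
```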
The correlation $\rho$ is the standard Pearson product-moment correlation coefficient, which can be viewed as the cosine of the angle $\theta$ between the vectors defined by the samples of the random variables $x$ and $y$:
\[\rho=cos(\theta)=\frac{x \cdot y}{|x|\,|y|}\]

Since $x$ and $n$ are independent, the samples of $x$ and $n$ can be viewed as an orthogonal basis for the samples of $y$, where the weights of the components are just the previously defined $a$ and $b$, respectively. This relates our gain parameters to the correlation coefficient: the tangent of the angle between $y$ and $x$ is just the ratio of the noise gain to the signal gain:
\[tan(\theta)=\frac{b}{a}=\frac{\sqrt{1-\alpha}}{\sqrt{\alpha}}\]
Then $tan(\theta)$ can be expressed in terms of the correlation coefficient $\rho$:
\[tan(\theta)=\frac{sin(\theta)}{cos(\theta)}=\frac{\sqrt{1-cos(\theta)^2}}{cos(\theta)}=\frac{\sqrt{1-\rho^2}}{\rho}\]
This gives the relationship $\sqrt{1-\alpha}/\sqrt{\alpha}=\sqrt{1-\rho^2}/\rho$, which implies that $\alpha=\rho^2$, or $a=\rho$. (There is a slight problem here in that correlation can be negative, but it is the magnitude of the correlation that really matters; from here on, "correlation" means the absolute value of the correlation.) This can be used to relate $\rho$ to $SNR$ and mutual information:
\[SNR=\frac{\rho^2}{1-\rho^2}\]
\[I=\frac{1}{2}lg{\frac{1}{1-\rho^2}}=-\frac{1}{2}lg(1-\rho^2)\]
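A quick numerical check of $\rho=\sqrt{\alpha}$ and $I=-\frac{1}{2}lg(1-\rho^2)$, reusing the simulated channel from the first snippet:

```python
# Empirical correlation of the simulated channel vs. sqrt(alpha), and the
# information estimated from that correlation.
import numpy as np

rng = np.random.default_rng(0)
alpha, N = 0.75, 100_000
x = rng.standard_normal(N)
n = rng.standard_normal(N)
y = np.sqrt(alpha) * x + np.sqrt(1 - alpha) * n

rho = np.corrcoef(x, y)[0, 1]
print(rho, np.sqrt(alpha))            # both ~0.866
print(-0.5 * np.log2(1 - rho**2))     # ~1 bit, matching 0.5*lg(1 + SNR)
```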
As a corollary, if $\phi=\sqrt{1-\rho^2}$ is the correlation of $y$ and the noise $n$, then the information is simply $I=-lg(\phi)$. The mean squared error ($MSE$) between $y$ and $x$ is also related:
\[MSE=(1-\rho)^2+(1-\rho^2)=1-2\rho+\rho^2+1-\rho^2=2(1-\rho)\]
which implies that
\[\rho=1-\frac{1}{2}MSE\]
and gives a relationship between mutual information and mean squared error:
\[I=-\frac{1}{2}lg(1-\rho^2)=-\frac{1}{2}lg(1-(1-MSE/2)^2)\]
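The same simulated channel can be used to check the $MSE$ relations (again a sketch, not the post's code):

```python
# Empirical mean squared error between y and x, the correlation recovered from
# it via rho = 1 - MSE/2, and the information recovered from the MSE.
import numpy as np

rng = np.random.default_rng(0)
alpha, N = 0.75, 100_000
x = rng.standard_normal(N)
n = rng.standard_normal(N)
y = np.sqrt(alpha) * x + np.sqrt(1 - alpha) * n

mse = np.mean((y - x) ** 2)
print(mse, 2 * (1 - np.sqrt(alpha)))             # both ~0.268
print(1 - 0.5 * mse, np.sqrt(alpha))             # rho recovered from MSE
print(-0.5 * np.log2(1 - (1 - 0.5 * mse) ** 2))  # ~1 bit again
```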

The relationships between correlation $\rho$, root mean squared error $RMSE$, information $I$, and signal-to-noise ratio $SNR$ are all monotonic, implying that correlation, $RMSE$, $SNR$, and mutual information all give the same quality ranking for a collection of channels. If a previous post I wrote holds, this implies that greedy selection of a subset of possible zero-mean, unit-variance Gaussian channels to be used in reconstruction is in some sense close to optimal, whether you use correlation, RMSE, SNR, or mutual information to rank the reconstruction quality.
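A small check of the ranking claim (the channel $\alpha$ values here are made up for illustration): sorting channels by correlation, SNR, information, or negative MSE gives the same order.

```python
# Because all of the maps between rho, SNR, MSE, and I are monotonic,
# ranking channels by any one of them yields the same ordering.
import numpy as np

alphas = np.array([0.10, 0.45, 0.80, 0.30, 0.95])
rho  = np.sqrt(alphas)
snr  = alphas / (1 - alphas)
info = -0.5 * np.log2(1 - alphas)
mse  = 2 * (1 - rho)

print(np.argsort(-rho))   # best channel first: [4 2 1 3 0]
print(np.argsort(-snr))   # same order
print(np.argsort(-info))  # same order
print(np.argsort(mse))    # same order (smaller MSE is better)
```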

Further Speculation

This can be generalized (as in chapter 3 of Spikes) to vector-valued Gaussian variables by transforming into a space where $Y=AX+BN$ is diagonal, treating each component independently, and then transforming back into the original space.
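For jointly Gaussian vectors this works out to the usual determinant formula; the sketch below (with made-up $A$, $B$, and identity covariances, not values from the post) computes the information both from determinants and by summing the scalar formula over the eigen-channels that diagonalize the problem:

```python
# Vector-valued sketch: Y = A X + B N with Gaussian X and N.
# I(X;Y) = 0.5 * log2( det(Sigma_Y) / det(B Sigma_N B^T) ), which equals the sum
# of 0.5 * log2(1 + lambda_i) over the independent channels after diagonalization.
import numpy as np

Sigma_X = np.eye(2)
Sigma_N = np.eye(2)
A = np.array([[0.8, 0.1],
              [0.0, 0.6]])
B = np.array([[0.5, 0.0],
              [0.2, 0.7]])

Sigma_signal = A @ Sigma_X @ A.T
Sigma_noise  = B @ Sigma_N @ B.T
Sigma_Y      = Sigma_signal + Sigma_noise

I_det = 0.5 * np.log2(np.linalg.det(Sigma_Y) / np.linalg.det(Sigma_noise))

# Per-channel form: the eigenvalues of Sigma_noise^{-1} Sigma_signal are the SNRs
# of the decoupled channels.
lam = np.linalg.eigvals(np.linalg.solve(Sigma_noise, Sigma_signal))
I_eig = 0.5 * np.sum(np.log2(1 + lam.real))

print(I_det, I_eig)   # identical up to floating point
```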

Similarly to how chapter 3 of Spikes generalizes the mutual information of a Gaussian channel into a bound on the mutual information of possibly non-Gaussian, vector-valued channels, these relationships can be generalized to inequalities for non-Gaussian channels:

\[I\geq-lg(\Phi)=-\frac{1}{2}lg(1-\Sigma^2)=-\frac{1}{2}lg(1-(1-MSE/2)^2)\]
where, for vector-valued variables, $\phi$, $\rho$, and $MSE$ become matrices $\Phi$, $\Sigma$, and $MSE$, respectively.

This does not describe what happens when you combine multiple decoders, which may or may not be Gaussian, and which is what happens when we greedily search for a subset of cells to use for decoding a single kinematic variable. It only describes reconstructing one random variable, possibly vector-valued, with no treatment of how the reconstruction is done or what happens when we add or remove channels. This will require further investigation, but I would start with the idea that, when you use two decoders, the total mutual information is the sum of the individual mutual informations, minus the three-way mutual information between the two decoders and the decoded variable:
\[I(X;(Y_1,Y_2))=I(X;Y_1)+I(X;Y_2)-I(X;Y_1;Y_2)\]
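As a hedged illustration of this decomposition for the jointly Gaussian case, the snippet below builds two unit-variance decoders of the same $x$ ($y_i=a_i x + b_i n_i$ with independent noise and arbitrarily chosen $a_i$), computes $I(X;(Y_1,Y_2))$ from covariance determinants, and reads off the redundancy term as the gap between the sum of the individual informations and the joint information:

```python
# Two Gaussian decoders of the same x: the joint information comes from
# det(Sigma_Y) / det(Sigma_{Y|X}); the redundancy I(X;Y1;Y2) is then the amount
# by which the naive sum I(X;Y1) + I(X;Y2) overcounts the joint information.
import numpy as np

a1, a2 = 0.9, 0.8                                  # illustrative signal gains

Sigma_Y      = np.array([[1.0,     a1 * a2],
                         [a1 * a2, 1.0    ]])      # cov of (y1, y2)
Sigma_Y_cond = np.diag([1 - a1**2, 1 - a2**2])     # cov of (y1, y2) given x

I_joint = 0.5 * np.log2(np.linalg.det(Sigma_Y) / np.linalg.det(Sigma_Y_cond))
I_1 = -0.5 * np.log2(1 - a1**2)
I_2 = -0.5 * np.log2(1 - a2**2)

redundancy = I_1 + I_2 - I_joint                   # the I(X;Y1;Y2) term above
print(I_joint, I_1 + I_2, redundancy)              # redundancy > 0 here
```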

The redundancy between the multiple channels might be well approximated by the matrix of self-information, or auto-correlation, of the channels (cells). I expect that, for collections of Gaussian variables, PCA will find an informative reduced-dimension set of cells, and that there will be some relationship between PCA and ICA that resembles the relationship between correlation and information discussed in the first section.

It may also be possible to generalize to multiple time lags simply by adding time-shifted copies of the cell population as new variables, and reducing the dimension to remove the redundancy introduced.
