
20110513

Correlation, mean-squared-error, mutual information, and signal-to-noise ratio for Gaussian random variables

I was reading a paper and encountered a figure that showed the correlation, mutual information, and mean-squared prediction error for a pair of time series. This seemed a bit redundant. It turns out it was added to the paper at the request of a reviewer. If your data are jointly Gaussian, these all measure the same thing; there's no need to clutter a figure by showing all of them.

For a jointly Gaussian pair of random variables, correlation, mean squared error, mutual information, and signal-to-noise ratio are all equivalent and can be computed from each other.

Some identities

Consider two time series x and y that can be well-approximated as jointly Gaussian. To simplify things, let x and y have zero mean and unit variance (the math still works out without this assumption, but it's also easy to ensure by z-scoring the data). Also, let n be a zero-mean, unit-variance Gaussian random variable that captures noise, i.e. fluctuation in y that cannot be explained by x.

Let's say we're interested in a linear relationship between x and y:

$$y = a x + b n.$$

The linear dependence of y on x is summarized by a single parameter, the signal gain a.

Since the signal and noise are independent, their variances combine linearly:

$$\sigma_y^2 = a^2 \sigma_x^2 + b^2 \sigma_n^2.$$

The sum $a^2 + b^2$ is constrained by the variances of x, y, and n. In this example we've assumed these are all 1, so

$$a^2 + b^2 = 1.$$

Incorporate this constraint by defining $\alpha = a^2$ and writing

$$\sigma_y^2 = \alpha \sigma_x^2 + (1-\alpha) \sigma_n^2$$

and

$$y = x\sqrt{\alpha} + n\sqrt{1-\alpha}.$$

(We'll show later that α is the squared Pearson correlation coefficient, i.e. it is the coefficient of determination.)
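As a quick sanity check, here is a minimal simulation sketch of this model (numpy assumed; α = 0.6 is an arbitrary example value, not something from the derivation):

```python
# Minimal sketch of the model above: x and n are independent standard
# normals, and alpha sets the fraction of y's variance explained by x.
# alpha = 0.6 is an arbitrary example value.
import numpy as np

rng = np.random.default_rng(0)
alpha, N = 0.6, 100_000

x = rng.standard_normal(N)                        # signal, zero mean, unit variance
n = rng.standard_normal(N)                        # noise, independent of x
y = np.sqrt(alpha) * x + np.sqrt(1 - alpha) * n   # y = x*sqrt(alpha) + n*sqrt(1-alpha)

print(np.var(y))                 # ~1, since alpha + (1 - alpha) = 1
print(np.corrcoef(x, y)[0, 1])   # ~sqrt(alpha); shown below to equal rho
```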

From this the signal-to-noise ratio and mutual information can be calculated

The Signal-to-Noise Ratio (SNR) is the ratio of the signal and noise contributions to the variance of y, and simplifies as

$$\mathrm{SNR} = \frac{\sigma_{ax}^2}{\sigma_{bn}^2} = \frac{\alpha\,\sigma_x^2}{(1-\alpha)\,\sigma_n^2} = \frac{\alpha}{1-\alpha}.$$

For a jointly Gaussian channel, the mutual information I (in bits, if using $\log_2$) is a monotonic function of the SNR, and simplifies as:

$$I = \tfrac{1}{2}\log_2(1+\mathrm{SNR}) = \tfrac{1}{2}\log_2\frac{\sigma_y^2}{\sigma_{bn}^2} = \tfrac{1}{2}\log_2\frac{\sigma_y^2}{(1-\alpha)\,\sigma_n^2} = \tfrac{1}{2}\log_2\frac{1}{1-\alpha}.$$
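Both quantities are easy to compute from α; a small sketch (the function names are mine, chosen for illustration):

```python
# Sketch: SNR and mutual information (in bits) as functions of alpha,
# directly from the expressions above. Function names are illustrative.
import numpy as np

def snr(alpha):
    return alpha / (1.0 - alpha)

def mutual_information_bits(alpha):
    # 0.5*log2(1 + SNR) = 0.5*log2(1/(1 - alpha))
    return 0.5 * np.log2(1.0 + snr(alpha))

alpha = 0.6                              # example value
print(snr(alpha))                        # 1.5
print(mutual_information_bits(alpha))    # 0.5*log2(2.5) ≈ 0.661 bits
```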

Relationship between a, b, α, and Pearson correlation ρ

Since x and n are independent and have unit variance, the samples of x and n can be viewed as an orthonormal basis for the samples of y, with weights a and b, respectively. This relates the gain parameters to the correlation: the tangent of the angle θ between y and x is just the ratio of the noise gain b to the signal gain a:

$$\tan(\theta) = \frac{b}{a} = \frac{\sqrt{1-\alpha}}{\sqrt{\alpha}}$$

Then, tan(θ) can be expressed in terms of the correlation coefficient ρ, since the correlation is the cosine of the angle between x and y, i.e. cos(θ) = ρ:

$$\tan(\theta) = \frac{\sin(\theta)}{\cos(\theta)} = \frac{\sqrt{1-\cos^2(\theta)}}{\cos(\theta)} = \frac{\sqrt{1-\rho^2}}{\rho}$$

This implies that

$$\frac{\sqrt{1-\alpha}}{\sqrt{\alpha}} = \frac{\sqrt{1-\rho^2}}{\rho},$$

which implies that $\alpha = \rho^2$, i.e. $a = \rho$.
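Here is a small numerical check of the geometric argument, using the same simulated model as above (α = 0.6 assumed):

```python
# Sketch: check numerically that cos(theta) = rho, tan(theta) = b/a,
# and alpha = rho**2. alpha = 0.6 is an assumed example value.
import numpy as np

rng = np.random.default_rng(1)
alpha, N = 0.6, 100_000
a, b = np.sqrt(alpha), np.sqrt(1 - alpha)

x = rng.standard_normal(N)
n = rng.standard_normal(N)
y = a * x + b * n

rho = np.corrcoef(x, y)[0, 1]    # cosine of the angle between y and x
theta = np.arccos(rho)

print(np.tan(theta), b / a)      # ≈ equal
print(rho**2, alpha)             # ≈ equal, i.e. a = rho
```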

A few more identities

This can be used to relate the correlation ρ to the SNR and mutual information:

$$\mathrm{SNR} = \frac{\rho^2}{1-\rho^2}$$

$$I = \tfrac{1}{2}\log_2\frac{1}{1-\rho^2} = -\tfrac{1}{2}\log_2\left(1-\rho^2\right)$$

If $\phi = \sqrt{1-\rho^2}$ is the correlation of y and the noise n (i.e. $\phi$ is the amplitude of the noise contribution to y), then the information is simply $I = -\log_2(\phi)$.
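Since these identities need only the correlation, they are one-liners in code; a sketch with an arbitrary example value of ρ:

```python
# Sketch: mutual information in bits from the correlation rho alone,
# using the identities above. rho = 0.8 is an arbitrary example value.
import numpy as np

def information_from_rho(rho):
    return -0.5 * np.log2(1.0 - rho**2)

rho = 0.8
phi = np.sqrt(1.0 - rho**2)           # correlation of y with the noise n
print(information_from_rho(rho))      # ≈ 0.737 bits
print(-np.log2(phi))                  # same value: I = -log2(phi)
```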

The mean squared error (MSE) between x and y is also related. Since $y - x = (a-1)x + b\,n$, with $a = \rho$ and $b^2 = 1-\rho^2$, we have

$$\mathrm{MSE} = (1-\rho)^2 + (1-\rho^2) = 1 - 2\rho + 1 = 2(1-\rho),$$

which implies that

$$\rho = 1 - \tfrac{1}{2}\mathrm{MSE},$$

and gives a relationship between mutual information and mean squared error:

$$I = -\tfrac{1}{2}\log_2\left(1-\rho^2\right) = -\tfrac{1}{2}\log_2\left(1-\left(1-\mathrm{MSE}/2\right)^2\right)$$
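A numerical check of the MSE identities on the simulated unit-variance model (α = 0.6 assumed):

```python
# Sketch: verify MSE = 2(1 - rho), rho = 1 - MSE/2, and the MSE form of
# the mutual information on simulated data. alpha = 0.6 is assumed.
import numpy as np

rng = np.random.default_rng(2)
alpha, N = 0.6, 100_000
x = rng.standard_normal(N)
y = np.sqrt(alpha) * x + np.sqrt(1 - alpha) * rng.standard_normal(N)

rho = np.corrcoef(x, y)[0, 1]
mse = np.mean((y - x) ** 2)

print(mse, 2 * (1 - rho))                       # MSE = 2(1 - rho)
print(rho, 1 - mse / 2)                         # rho = 1 - MSE/2
print(-0.5 * np.log2(1 - rho**2),               # I from rho ...
      -0.5 * np.log2(1 - (1 - mse / 2) ** 2))   # ... matches I from MSE
```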

Correlation ρ, mean squared error, mutual information, and signal-to-noise ratio are all related to one another by monotonic functions. They all summarize the relatedness of x and y. For the purposes of, e.g., ranking a collection of candidate signals x in terms of how much each tells us about y, they are equivalent, as the sketch below illustrates.
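The sketch: several simulated candidate signals are ranked against a fixed target y by each measure (the signal fractions are assumed example values); all four orderings agree.

```python
# Sketch: for z-scored data, ranking candidate signals by correlation,
# SNR, mutual information, or negative MSE gives the same ordering.
# The alphas below are assumed example couplings, not from the post.
import numpy as np

rng = np.random.default_rng(3)
N = 50_000
y = rng.standard_normal(N)                  # the target series
alphas = [0.1, 0.7, 0.3, 0.9, 0.5]          # assumed coupling of each candidate to y

rows = []
for alpha in alphas:
    # each candidate shares a sqrt(alpha) component with y
    x = np.sqrt(alpha) * y + np.sqrt(1 - alpha) * rng.standard_normal(N)
    rho = np.corrcoef(x, y)[0, 1]
    rows.append((rho,
                 rho**2 / (1 - rho**2),        # SNR
                 -0.5 * np.log2(1 - rho**2),   # mutual information, bits
                 -np.mean((y - x) ** 2)))      # negative MSE (larger = better)

# Each column induces the same ordering of the candidates.
for col in range(4):
    print(np.argsort([row[col] for row in rows]))
```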