" Its never too late "

Monthly update

Shub welcomes feedback, questions, and ideas.

Monday, May 2, 2011

Correlation and Covariance of a Random Signal


Covariance
When dealing with more than one random process, it is useful to have a single number that quickly indicates how similar the processes are. For this we use the covariance, which is analogous to the variance of a single variable.
DEFINITION 1: Covariance
A measure of how much the deviations of two or more variables or processes match.
For two processes, X and Y, the covariance will be small if they are not closely related and large if they are similar. Let us clarify what we mean by "related" and "similar": two processes are closely related if their distributions have nearly equal spreads and nearly the same mean.
Mathematically, covariance is often written as σxy and is defined as

$$\operatorname{cov}(X,Y) = \sigma_{xy} = E\left[(X - \bar{X})(Y - \bar{Y})\right] \tag{1}$$
This can also be reduced and rewritten in the following two forms:
$$\sigma_{xy} = \overline{xy} - \bar{x}\,\bar{y} \tag{2}$$

$$\sigma_{xy} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \bar{X})(y - \bar{Y})\, f(x,y)\, dx\, dy \tag{3}$$
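As a quick sanity check, the definitional form (1) and the reduced form (2) can be compared numerically on simulated samples. This is a small NumPy sketch; the data-generating choices (seed, noise scale) are arbitrary illustrations, not from the text:

```python
import numpy as np

# Illustrative samples: y loosely tracks x, so the covariance should be positive
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(scale=0.1, size=1000)

# Equation (1): sigma_xy = E[(X - X_bar)(Y - Y_bar)], estimated by a sample mean
cov_def = np.mean((x - x.mean()) * (y - y.mean()))

# Equation (2): sigma_xy = mean(xy) - mean(x) * mean(y)
cov_alt = np.mean(x * y) - x.mean() * y.mean()

print(np.isclose(cov_def, cov_alt))  # True: the two forms agree
```

The two expressions are algebraically identical, so they match to floating-point precision on any sample.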
Useful Properties
·         If X and Y are independent or uncorrelated, then
$$\sigma_{xy} = 0$$
·         If X and Y are orthogonal, then
$$\sigma_{xy} = -E[X]\,E[Y]$$
·         The covariance is symmetric:
$$\operatorname{cov}(X,Y) = \operatorname{cov}(Y,X)$$
·         Basic covariance identity:
$$\operatorname{cov}(X+Y,Z) = \operatorname{cov}(X,Z) + \operatorname{cov}(Y,Z)$$
·         Covariance of a variable with itself:
$$\operatorname{cov}(X,X) = \operatorname{Var}(X)$$
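These properties are easy to verify numerically. A brief NumPy sketch (the sample size and seed are arbitrary) checks symmetry, the basic identity, and the variance special case on a sample estimate of the covariance:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=5000)
y = rng.normal(size=5000)
z = rng.normal(size=5000)

def cov(a, b):
    # Sample estimate of equation (1), normalized by N
    return np.mean((a - a.mean()) * (b - b.mean()))

# Symmetry: cov(X, Y) = cov(Y, X)
assert np.isclose(cov(x, y), cov(y, x))

# Basic covariance identity: cov(X + Y, Z) = cov(X, Z) + cov(Y, Z)
assert np.isclose(cov(x + y, z), cov(x, z) + cov(y, z))

# Covariance of a variable with itself is its variance
assert np.isclose(cov(x, x), np.var(x))
```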
Correlation
Anyone with a statistical background will recognize that the idea of dependence/independence among variables and signals plays an important role when dealing with random processes. Because of this, the correlation of two variables provides us with a measure of how the two variables affect one another.
DEFINITION 2: Correlation
A measure of how much one random variable depends upon the other.
This measure of association between the variables will provide us with a clue as to how well the value of one variable can be predicted from the value of the other. The correlation is equal to the average of the product of two random variables and is defined as
$$\operatorname{cor}(X,Y) = E[XY] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\, f(x,y)\, dx\, dy \tag{4}$$
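Equation (4) connects directly to the covariance: expanding equation (1) gives cor(X, Y) = cov(X, Y) + E[X]E[Y]. A NumPy sketch with arbitrary simulated data confirms this identity on sample estimates:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(loc=1.0, size=2000)
y = x + rng.normal(size=2000)

# Equation (4): cor(X, Y) = E[XY], estimated by the sample mean of the product
cor_xy = np.mean(x * y)

# Equation (1): cov(X, Y) = E[(X - X_bar)(Y - Y_bar)]
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))

# Expanding (1): cor(X, Y) = cov(X, Y) + E[X] * E[Y]
assert np.isclose(cor_xy, cov_xy + x.mean() * y.mean())
```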
Correlation Coefficient
It is often useful to express the correlation of random variables on a fixed scale, like a percentage. For a given set of variables, we use the correlation coefficient to measure the linear relationship between them. The correlation coefficient of two variables is defined in terms of their covariance and their standard deviations, σx and σy, as seen below
$$\rho = \frac{\operatorname{cov}(X,Y)}{\sigma_x \sigma_y} \tag{5}$$
where we will always have
$$-1 \le \rho \le 1$$
This provides us with a quick and easy way to view the correlation between our variables. If there is no relationship between the variables, the correlation coefficient is zero; if there is a perfect positive match, it is one. If there is a perfect inverse relationship, where one variable increases while the other decreases, the correlation coefficient is negative one. This quantity is often referred to more specifically as Pearson's Correlation Coefficient, or Pearson's Product Moment Correlation.
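In practice the coefficient is rarely computed by hand. This NumPy sketch (with illustrative data) computes ρ from equation (5) and checks it against `np.corrcoef`, which implements the same Pearson coefficient:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=1000)
y = 2.0 * x + rng.normal(scale=0.5, size=1000)  # strong positive relationship

# Equation (5): rho = cov(X, Y) / (sigma_x * sigma_y)
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
rho = cov_xy / (x.std() * y.std())

# np.corrcoef returns the 2x2 Pearson correlation matrix
assert np.isclose(rho, np.corrcoef(x, y)[0, 1])
assert -1.0 <= rho <= 1.0
print(rho > 0.9)  # True: y closely follows x
```

The normalization constant cancels in the ratio, so the population and sample (N vs. N−1) conventions give the same ρ.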
Figure 1: Types of Correlation. (a) positive correlation (corr_inc.png), (b) negative correlation (corr_dec.png), (c) uncorrelated, no correlation (corr_unc.png).
NOTE: 
So far we have dealt with correlation simply as a number describing the relationship between two variables. However, since our goal is to relate random processes to each other, and random processes deal with signals as functions of time, we will continue this study by looking at correlation functions.
Example
Now let us take just a second to look at a simple example that involves calculating the covariance and correlation of two sets of random numbers. We are given the following data sets:
x={3,1,6,3,4}
y={1,5,3,4,3}
To begin with, for the covariance we will need to find the expected values, or means, of x, y, and the product xy.
$$\bar{x} = \tfrac{1}{5}(3+1+6+3+4) = 3.4$$

$$\bar{y} = \tfrac{1}{5}(1+5+3+4+3) = 3.2$$

$$\overline{xy} = \tfrac{1}{5}(3+5+18+12+12) = 10$$
Next we will solve for the standard deviations of our two sets using the formula below.

$$\sigma = \sqrt{E\left[(X - E[X])^2\right]}$$

$$\sigma_x = \sqrt{\tfrac{1}{5}(0.16 + 5.76 + 6.76 + 0.16 + 0.36)} = 1.625$$

$$\sigma_y = \sqrt{\tfrac{1}{5}(4.84 + 3.24 + 0.04 + 0.64 + 0.04)} = 1.327$$
Now we can finally calculate the covariance using one of the two formulas found above. Since we have already calculated the three means, equation (2) is much simpler.

$$\sigma_{xy} = 10 - 3.4 \times 3.2 = -0.88$$
And for our last calculation, we will solve for the correlation coefficient, ρ.

$$\rho = \frac{-0.88}{1.625 \times 1.327} \approx -0.408$$
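The worked example above can be reproduced in a few lines. This NumPy sketch uses the population forms throughout (NumPy's `std` defaults to dividing by N, matching the calculation in the text):

```python
import numpy as np

x = np.array([3, 1, 6, 3, 4], dtype=float)
y = np.array([1, 5, 3, 4, 3], dtype=float)

x_bar = x.mean()            # 3.4
y_bar = y.mean()            # 3.2
xy_bar = np.mean(x * y)     # 10.0

sigma_x = x.std()           # ~1.625 (population standard deviation)
sigma_y = y.std()           # ~1.327

cov_xy = xy_bar - x_bar * y_bar       # equation (2): -0.88
rho = cov_xy / (sigma_x * sigma_y)    # equation (5): ~-0.408

print(round(cov_xy, 2), round(rho, 3))
```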
