Combining variances

Independence
Think of two sets of numbers: one for each variable. When the two variables are independent, that means that any of the numbers in one set could be combined with any of the numbers in the other set.

On the other hand, if the two variables are not independent, then each number in one set must be combined with a specific one in the other set.
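
The difference between the two cases can be sketched in a few lines of Python. This is only an illustration: the numbers are made up, and the names `independent_pairs` and `linked_pairs` are hypothetical labels, not part of the original text.

```python
from itertools import product

# Made-up values, purely for illustration.
x = [1, 2, 3]
y = [10, 20, 30]

# Independent: any number in x can be combined with any number in y,
# so we get every possible (x, y) combination.
independent_pairs = list(product(x, y))

# Not independent: each number in x is tied to one specific number in y
# (here, the one at the same position).
linked_pairs = list(zip(x, y))

print(len(independent_pairs), "combinations when independent")
print(len(linked_pairs), "combinations when linked")
```

With three numbers in each set, the independent case yields all nine combinations, while the linked case yields only the three matched pairs.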

Adding independent variables
When we add two independent variables, x and y, to make a new variable z:

$$z = x + y$$

the variance of the new variable is equal to the sum of the variances of the two original variables:

$$\mathrm{var}(z) = \mathrm{var}(x) + \mathrm{var}(y)$$

This is probably the most important statement in the whole of statistics. We will now prove it.

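We start by defining the combination of variables:

$$z_{ij} = x_i + y_j, \qquad i = 1, \dots, n, \quad j = 1, \dots, m$$

In this, we have two sets of numbers, x (with n values) and y (with m values). They have different subscripts (i, j) because the two sets of numbers are not linked at all: they are independent, so every x_i is combined with every y_j.

The mean of z is the sum of the two means, $\bar{z} = \bar{x} + \bar{y}$. To calculate the variance of z, we use the deviations of z, which are:

$$z_{ij} - \bar{z} = (x_i - \bar{x}) + (y_j - \bar{y})$$

and which gives us the formula for the variance of z, exactly as defined previously, using the sum of squared deviations of z (taking the variance to be the mean of the squared deviations):

$$\mathrm{var}(z) = \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} \left[ (x_i - \bar{x}) + (y_j - \bar{y}) \right]^2$$

We can expand this formula:

$$\mathrm{var}(z) = \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} \left[ (x_i - \bar{x})^2 + 2 (x_i - \bar{x})(y_j - \bar{y}) + (y_j - \bar{y})^2 \right]$$

then we do the summation over j:

$$\mathrm{var}(z) = \frac{1}{nm} \sum_{i=1}^{n} \left[ m (x_i - \bar{x})^2 + 2 (x_i - \bar{x}) \sum_{j=1}^{m} (y_j - \bar{y}) + \sum_{j=1}^{m} (y_j - \bar{y})^2 \right]$$

In this expression, the middle term includes the sum of deviations of y. Since the sum of deviations from the mean is always zero, $\sum_{j} (y_j - \bar{y}) = 0$, we can remove that middle term completely:

$$\mathrm{var}(z) = \frac{1}{nm} \sum_{i=1}^{n} \left[ m (x_i - \bar{x})^2 + \sum_{j=1}^{m} (y_j - \bar{y})^2 \right]$$

The second term inside the brackets does not depend on i, so summing it over i just multiplies it by n. We can simplify to this:

$$\mathrm{var}(z) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 + \frac{1}{m} \sum_{j=1}^{m} (y_j - \bar{y})^2$$

which is equivalent to this:

$$\mathrm{var}(z) = \mathrm{var}(x) + \mathrm{var}(y)$$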

QED
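
As a quick numerical check of the result, the sketch below builds every combination of two small sets of numbers, adds them, and compares the variance of the sums with the sum of the two variances. It assumes the variance is the population variance (sum of squared deviations divided by the number of values), which is what `statistics.pvariance` computes. The data values are made up for illustration.

```python
from itertools import product
from math import isclose
from statistics import pvariance

# Made-up values; the two sets need not have the same length.
x = [2, 4, 4, 6]
y = [1, 5, 9]

# Independent case: every x is combined with every y.
z = [xi + yj for xi, yj in product(x, y)]

var_x = pvariance(x)  # population variance (divide by n)
var_y = pvariance(y)
var_z = pvariance(z)

print("var(x) + var(y) =", var_x + var_y)
print("var(z)          =", var_z)
print("equal:", isclose(var_z, var_x + var_y))
```

The two printed variances should agree up to floating-point rounding, whatever numbers are put into `x` and `y`.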



Adding dependent variables
If two variables are related, then the rule that you add variances does not apply. In this section, we will develop a new rule.

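This time, the two variables have the same subscript (i) to indicate that they are not independent:

$$z_i = x_i + y_i, \qquad i = 1, \dots, n$$

We are going to show that the variance of z is given by:

$$\mathrm{var}(z) = \mathrm{var}(x) + \mathrm{var}(y) + 2\,\mathrm{cov}(x, y)$$

We start with the simple formula for the variance, using the fact that the mean of z is $\bar{z} = \bar{x} + \bar{y}$:

$$\mathrm{var}(z) = \frac{1}{n} \sum_{i=1}^{n} \left[ (x_i - \bar{x}) + (y_i - \bar{y}) \right]^2$$

We expand it:

$$\mathrm{var}(z) = \frac{1}{n} \sum_{i=1}^{n} \left[ (x_i - \bar{x})^2 + 2 (x_i - \bar{x})(y_i - \bar{y}) + (y_i - \bar{y})^2 \right]$$

and we deal with the summation over i:

$$\mathrm{var}(z) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 + \frac{2}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) + \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^2$$

This time, the middle term is not generally zero, and so doesn't drop out. Instead, leaving aside its factor of 2, let's call it the covariance of x and y, because its formula looks like a variance, but for two variables:

$$\mathrm{cov}(x, y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

That means that we can write the formula for the variance in this case as:

$$\mathrm{var}(z) = \mathrm{var}(x) + \mathrm{var}(y) + 2\,\mathrm{cov}(x, y)$$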
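
The rule for linked variables, that the variance of the sum is the two variances plus twice the covariance, can also be checked numerically. The sketch below assumes population (divide-by-n) variance and covariance. The helper `pcovariance` is a hypothetical function written for this example, and the data values are made up.

```python
from math import isclose
from statistics import mean, pvariance


def pcovariance(a, b):
    """Population covariance: mean of the products of paired deviations."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


# Made-up paired values: x[i] always goes with y[i].
x = [1, 2, 3, 4]
y = [2, 1, 4, 5]

# Linked case: each x is added only to its own partner in y.
z = [xi + yi for xi, yi in zip(x, y)]

lhs = pvariance(z)
rhs = pvariance(x) + pvariance(y) + 2 * pcovariance(x, y)

print("var(z)                       =", lhs)
print("var(x) + var(y) + 2 cov(x,y) =", rhs)
print("equal:", isclose(lhs, rhs))
```

For this data the covariance is positive, so the variance of the sum comes out larger than the sum of the two variances alone.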

Covariance
Covariance is a label we have used for a new quantity that appeared in the maths above. Here is a simple explanation of what it is.

In words, covariance is how much the two variables co-vary, i.e. vary together.

  • If two variables are independent, then their covariance is zero.
  • If two variables are identical (yi = xi), then the covariance is the same as the variance of either (they have the same variances).
  • If one variable is the negative of the other (yi = -xi), then their covariance is the negative of the variance of either.
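
The three cases in the list above can be checked with a short sketch. It assumes population (divide-by-n) covariance, and it takes "independent" in the sense used earlier, meaning every number in one set is combined with every number in the other. The helper `pcovariance` is hypothetical and the data values are made up.

```python
from itertools import product
from math import isclose
from statistics import mean, pvariance


def pcovariance(a, b):
    """Population covariance: mean of the products of paired deviations."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


x = [1, 2, 3, 4]
y = [10, 20, 30]

# Independent: pair every x with every y, then split the pairs back
# into two equal-length lists.
pairs = list(product(x, y))
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]
cov_independent = pcovariance(xs, ys)

# Identical: y is a copy of x.
cov_identical = pcovariance(x, x)

# Negated: y is the negative of x.
cov_negated = pcovariance(x, [-xi for xi in x])

print("independent:", cov_independent)
print("identical:  ", cov_identical, "vs var(x) =", pvariance(x))
print("negated:    ", cov_negated, "vs -var(x) =", -pvariance(x))
```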