# Non-standard covariance stuff involving three variables

For some reason, three different completely unrelated projects have led me to some interesting ideas about covariance and three-variable relationships.

I’m going to start with the ugliest one. Given `cov(xy,z)`, what does it take to move the `y` to the other side of the comma (i.e. `cov(x,yz)`)? As far as I can tell there’s very little that can be done with the usual `cov`, `var`, and `E`, at least nothing pretty. Here’s the best I’ve got:

`cov(xy,z) + cov(x,y)E(z) = cov(x,yz) + E(x)cov(y,z)`

If anyone’s out there…is there anything prettier?

I don’t really expect anyone to actually be out there…I’m mostly doing this blog so that I don’t forget things that I’ve done.
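For what it’s worth, the `cov(xy,z)` identity above can be sanity-checked numerically. This is a minimal sketch in base R, assuming population-style moments (dividing by `n`, not `n - 1`); the helper `pcov` and the simulated data are made up for illustration. Both sides reduce to `E(xyz) - E(x)E(y)E(z)`, so they should agree up to floating-point error.

```r
# Numerical check of
#   cov(xy,z) + cov(x,y)E(z) = cov(x,yz) + E(x)cov(y,z)
set.seed(1)
n <- 1000
x <- rnorm(n, mean = 2)
y <- x + rexp(n)          # deliberately dependent on x
z <- x * y + rnorm(n)     # deliberately dependent on both

# population-style covariance (hypothetical helper, divides by n)
pcov <- function(a, b) mean(a * b) - mean(a) * mean(b)

lhs <- pcov(x * y, z) + pcov(x, y) * mean(z)
rhs <- pcov(x, y * z) + mean(x) * pcov(y, z)

# both sides should collapse to this common expression
common <- mean(x * y * z) - mean(x) * mean(y) * mean(z)

print(isTRUE(all.equal(lhs, rhs)))
print(isTRUE(all.equal(lhs, common)))
```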

The second one is much cooler. Consider this definition of a three-way covariance:

`cov(x,y,z) = E((x-E(x))(y-E(y))(z-E(z)))`

What does this ‘covariance’ mean? Well, consider the following graph of the relationship between x, y, and z.

Here’s the `R` code:

```
install.packages("popmoments", repos="http://R-Forge.R-project.org")
library(popmoments)
library(ggplot2)
```

```
# generate three-way covariance
x <- seq(-1, 1, 0.01)
y <- seq(-1, 1, 0.01)
xy <- expand.grid(x = x, y = y)
xyz <- within(xy, z <- -x*y)
```

```
# plot
ggplot(xyz, aes(x = x, y = y, fill = z)) +
  geom_tile() +
  scale_fill_gradient2() +
  theme_bw()
```

```
# three-way covariance
with(xyz, cov.(x, y, z))
```

Is `cov(x,y,z)` positive, negative, or zero? The key is to look at how the correlation between two of the variables changes with the third. Here `y` and `z` are positively correlated when `x` is low and negatively correlated when `x` is high. Since positive goes with low and negative goes with high, `cov(x,y,z) < 0`. One cool thing about this example is that `cov(x, y, z) = -0.11` but all of the pairwise covariances are zero, `cov(x, y) = cov(x, z) = cov(y, z) = 0`. By the way…does anyone know what Cauchy-Schwarz might look like for this three-way covariance?
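If `popmoments` isn’t available, the same numbers can be reproduced in base R. This is a sketch that assumes `cov.` computes the third central cross-moment from the definition above (dividing by `n`); `cov3` is a made-up helper name, not part of any package.

```r
# rebuild the grid from the post
x <- seq(-1, 1, 0.01)
xyz <- within(expand.grid(x = x, y = x), z <- -x * y)

# hypothetical helper: E((u-E(u))(v-E(v))(w-E(w))) with 1/n weighting
cov3 <- function(u, v, w) mean((u - mean(u)) * (v - mean(v)) * (w - mean(w)))

three_way <- with(xyz, cov3(x, y, z))
pairwise  <- with(xyz, c(xy = cov(x, y), xz = cov(x, z), yz = cov(y, z)))

# the sign of cor(y, z) flips with x, which is what drives cov3 below zero
cor_low  <- with(subset(xyz, x < 0), cor(y, z))   # x low:  positive
cor_high <- with(subset(xyz, x > 0), cor(y, z))   # x high: negative

print(round(three_way, 2))
print(round(pairwise, 10))
print(c(low = cor_low, high = cor_high))
```

On this grid the three-way value works out to `-(101/300)^2`, which rounds to the `-0.11` quoted above.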

For the last thing, suppose `z = x + y`. What is the relationship between `var(x)`, `var(y)`, and `var(z)`? What constraints are there on `var(x)` and `var(y)`, to ensure that `var(z)` is well-defined? The first question is easy:

```
var(x) = var(z) + var(y) - 2cov(y, z)
var(x) = var(z) - var(y) - 2cov(x, y)
var(x) = var(y) - var(z) + 2cov(x, z)
```

```
var(y) = var(z) + var(x) - 2cov(x, z)
var(y) = var(z) - var(x) - 2cov(x, y)
var(y) = var(x) - var(z) + 2cov(y, z)
```

```
var(z) = var(x) + var(y) + 2cov(x, y)
var(z) = var(x) - var(y) + 2cov(y, z)
var(z) = var(y) - var(x) + 2cov(x, z)
```
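These identities are easy to check by simulation. A minimal sketch in base R with made-up data; since `var` and `cov` are bilinear, each identity should hold (up to rounding) for sample moments as well, whatever the dependence between `x` and `y`.

```r
set.seed(42)
x <- rnorm(500, sd = 2)
y <- 0.5 * x + runif(500)   # arbitrary dependence on x
z <- x + y

eq <- function(a, b) isTRUE(all.equal(a, b))

checks <- c(
  x1 = eq(var(x), var(z) + var(y) - 2 * cov(y, z)),
  x2 = eq(var(x), var(z) - var(y) - 2 * cov(x, y)),
  x3 = eq(var(x), var(y) - var(z) + 2 * cov(x, z)),
  y1 = eq(var(y), var(z) + var(x) - 2 * cov(x, z)),
  y2 = eq(var(y), var(z) - var(x) - 2 * cov(x, y)),
  y3 = eq(var(y), var(x) - var(z) + 2 * cov(y, z)),
  z1 = eq(var(z), var(x) + var(y) + 2 * cov(x, y)),
  z2 = eq(var(z), var(x) - var(y) + 2 * cov(y, z)),
  z3 = eq(var(z), var(y) - var(x) + 2 * cov(x, z))
)
print(checks)
```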

The answer to the second question is in this graph:

The grey region is the mathematically possible region and the signs in the sub-regions give the signs of the correlations between `x` and `y`, `x` and `z`, and `y` and `z`.
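The feasibility condition can also be written down directly. Since `cov(x, y) = (var(z) - var(x) - var(y))/2` and any correlation must lie in [-1, 1], the three standard deviations have to satisfy a triangle inequality, `|sd(x) - sd(y)| <= sd(z) <= sd(x) + sd(y)`. The sketch below is one reading of that condition, not the code behind the graph; `feasible_signs` is a hypothetical helper and the inputs are arbitrary.

```r
# Given candidate variances with z = x + y, report whether they are
# mathematically possible and the implied signs of the three correlations.
feasible_signs <- function(vx, vy, vz) {
  sx <- sqrt(vx); sy <- sqrt(vy); sz <- sqrt(vz)
  list(
    feasible = abs(sx - sy) <= sz && sz <= sx + sy,
    sign_xy  = sign(vz - vx - vy),   # cov(x,y) = (vz - vx - vy)/2
    sign_xz  = sign(vz + vx - vy),   # cov(x,z) = (vz + vx - vy)/2
    sign_yz  = sign(vz + vy - vx)    # cov(y,z) = (vz + vy - vx)/2
  )
}

str(feasible_signs(1, 1, 1))   # possible; x and y must be negatively correlated
str(feasible_signs(1, 1, 9))   # impossible; sd(z) exceeds sd(x) + sd(y)
str(feasible_signs(4, 1, 9))   # boundary case; x and y perfectly correlated
```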

Hope there are no mistakes!

Hello,

I was looking for some sort of generalization of the covariance for general variables, exactly like the definition that you give here of a “three-way covariance”. Is this an established definition? If it is, could you please give me some references?

I have not found any other similar mention on the web…

Thanks!

Yes you are quite right, these ideas are difficult to find. Plus, my blog statistics suggest that there are many people like you and me who search for things like ‘three-variable covariance’ etc. So this has got me wondering why such an intuitive idea has not been well-explored by really smart people in all kinds of disciplines. It could be that it’s not worth the time required to make a standard definition (see below), or it could be that it just hasn’t yielded much insight for many people. I don’t know yet, but I think it’s important for people like you and me to ask ourselves these questions. That’s one important part of scientific thinking: ‘why hasn’t this been worked out already by all those many many many smart people who came before?’ I don’t know in this case.

Having said that, these ideas are actually fairly common in finance. Check out this page for some pointers:

http://stats.stackexchange.com/questions/50948/what-is-coskewness-and-how-can-it-be-calculated

Also, this slightly newer post may help too:

https://stevencarlislewalker.wordpress.com/2012/05/31/coskewness/

It points out that a better name might be coskewness, and points to a short wikipedia blurb. Another thing here of interest is that the covariance of a variable with the product of the deviations of two other variables is the same as what some call the coskewness of the three variables. So it might just be easier to refer to coskewness as a particular kind of covariance, which is already well-defined.

Anyways, this is a bit of a rabbit hole. Have fun!

Thanks for the answer!

Well, yes, that’s exactly how I feel, “Why has nobody developed this idea yet?”, and I think there are two possible answers: 1) Actually, smart people thought of it, and concluded that there is nothing new in it or 2) Nobody has ever needed it (and, after all, statistics is pulled by applications).

My motivation was something called Structural Equation Models, which are based on Covariance Structure Analysis. Basically, there you build a model and you compare your “model implied covariances” to real data covariances, to conclude whether your model fits the data or not.

Then I thought that it could be valuable not just to compare the traditional 2-variable covariances, but all kinds of generalized N-variable covariances. Maybe the idea is stupid or useless, because maybe all the information that the N-covariances encode is already encoded in the 2-covariances…but I don’t know! I think it would be useful, at least for me, to dedicate some time to explore the idea. And the fact that in Probabilistic Graphical Models (such as Bayesian Networks) you can have factors over several variables encourages the thought that this is a good direction to explore.

And apart from my initial motivation, I think that the idea behind those N-covariances might be more general and useful than expected!

If I ever read something nice about it, I will post a link here.

I’d love to hear about what you find out.

In terms of SEMs, one quick thought might be to model the covariance between one variable and the product of the deviations of two variables: cov(X, (Y-E(Y))(Z-E(Z))) which is equal to coskew(X,Y,Z) = E((X-E(X))(Y-E(Y))(Z-E(Z))). But in the cov form you should be able to just apply SEM methodology.
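The equality in that last remark can be checked numerically. A sketch in base R with simulated data, assuming population-style (divide by `n`) moments; `pcov` and `coskew` are illustrative helper names.

```r
set.seed(7)
n <- 2000
X <- rexp(n)
Y <- X^2 + rnorm(n)       # skewed and dependent on X
Z <- X * Y + rnorm(n)

pcov   <- function(a, b) mean((a - mean(a)) * (b - mean(b)))
coskew <- function(a, b, d) mean((a - mean(a)) * (b - mean(b)) * (d - mean(d)))

dYdZ <- (Y - mean(Y)) * (Z - mean(Z))   # product of the two deviations

lhs <- pcov(X, dYdZ)      # cov(X, (Y-E(Y))(Z-E(Z)))
rhs <- coskew(X, Y, Z)    # E((X-E(X))(Y-E(Y))(Z-E(Z)))
print(isTRUE(all.equal(lhs, rhs)))
```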

Hello,

Suppose x, y, z have a multivariate normal distribution with mean vector (M1, M2, M3).

Cov(x,y,z) = E[(x-E(x))(y-E(y))(z-E(z))]

= E(xyz) − E(x)E(yz) − E(y)E(xz) − E(z)E(xy) + 2E(x)E(y)E(z)

= M1M2M3 + M1 Cov(y,z) + M2 Cov(x,z) + M3 Cov(x,y)

− M1[Cov(y,z) + M2M3] − M2[Cov(x,z) + M1M3] − M3[Cov(x,y) + M1M2]

+ 2M1M2M3

= 0

Why does this definition lead to covariance = 0 for normal distribution?

I really shouldn’t call this covariance. It’s better referred to as coskewness. And indeed, the trivariate normal has zero coskewness. More informally, the covariation between any pair of variables does not itself covary with any other variable in a multivariate normal distribution.
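A quick simulation illustrates the point. This sketch draws from a trivariate normal using a Cholesky factor in base R; the mean vector and covariance matrix are arbitrary choices, and the sample coskewness is only expected to be close to zero, not exactly zero.

```r
set.seed(123)
n  <- 1e5
mu <- c(1, -2, 0.5)
Sigma <- matrix(c(1.0,  0.5,  0.3,
                  0.5,  1.0, -0.2,
                  0.3, -0.2,  1.0), nrow = 3)

# rows of N are independent draws from a normal with mean mu, covariance Sigma
N <- sweep(matrix(rnorm(3 * n), ncol = 3) %*% chol(Sigma), 2, mu, "+")

# deviations from the sample means, then the third central cross-moment
D <- sweep(N, 2, colMeans(N))
coskew_normal <- mean(D[, 1] * D[, 2] * D[, 3])

print(round(cov(N), 2))    # pairwise structure is recovered
print(coskew_normal)       # close to zero despite nonzero covariances
```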

Thanks for the reply.

What would you suggest for calculating the association (a measure of dependency) among trivariate normally distributed random variables?

The covariance matrix. This is pretty standard advice. If the variables are truly trivariate normal, there are no higher order associations to worry about, only pairwise and linear ones, so the covariance matrix is all you need.

Hi Steven,

I was lucky to come across your posting from July last regarding coskewness and covariance. In short, I need to be able to define a parameter like the bivariate r-value to show some sort of skewness amongst 3 or 4 variables. I am not completely sure how such a parameter would be calculated, but I came across a posting on an external site stating the following:

Perhaps you need the theory of cumulants also called semi-invariants.

The third cumulant generalizes v(X,Y), the covariance between two random variables, into the following:

c(X,Y,Z) = E(XYZ) – E(X)E(YZ) – E(Y)E(XZ) – E(Z)E(XY) + 2E(X)E(Y)E(Z)

The possibility for the analogue to r could be:

c(X,Y,Z)/sqrt(v(X,X)*v(Y,Y)*v(Z,Z))

What do you think? The assumption here is that random variables are discrete in origin.

Regards,

Robert.

I think this is cool, thanks. Don’t have time to think too hard about it now, but I’d love to have a Cauchy-Schwarz-like relationship for third central moments. I did a few numerical computations and your relationship worked nicely.

Good to know. Well, you have more grounding in this area; I’ve only ever understood stochastic processes and random variables up to graduate level. Insofar as the E(XYZ) term is concerned, how would you calculate it?
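On that last question: with data in hand, a natural plug-in estimate of E(XYZ) is just the sample mean of the elementwise product. The sketch below (base R, simulated data, illustrative helper names) computes the cumulant formula quoted above and the proposed ratio. One caveat: unlike r, this ratio does not seem to be confined to [-1, 1], because with X = Y = Z it reduces to ordinary skewness, which is 2 in theory for an exponential variable.

```r
set.seed(99)
n <- 1e5
X <- rexp(n)

Exyz <- function(a, b, d) mean(a * b * d)   # plug-in estimate of E(XYZ)

# third joint cumulant, term by term as in the formula above
c3 <- function(a, b, d) {
  Exyz(a, b, d) - mean(a) * mean(b * d) - mean(b) * mean(a * d) -
    mean(d) * mean(a * b) + 2 * mean(a) * mean(b) * mean(d)
}
v <- function(a, b) mean(a * b) - mean(a) * mean(b)

ratio <- function(a, b, d) c3(a, b, d) / sqrt(v(a, a) * v(b, b) * v(d, d))

# the cumulant form agrees with the central-moment form
central <- mean((X - mean(X))^3)
print(isTRUE(all.equal(c3(X, X, X), central)))

skew_exp <- ratio(X, X, X)   # sample skewness of the exponential draws
print(skew_exp)
```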

Came across this blog while searching for an expansion of Cov(X, YZ) into a series of bivariate covariances. Any suggestions?
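One expansion that may help here: writing Y and Z as mean plus deviation gives Cov(X, YZ) = E(Y)Cov(X, Z) + E(Z)Cov(X, Y) + E((X-E(X))(Y-E(Y))(Z-E(Z))). So the bivariate covariances alone are enough only when the coskewness term vanishes, as it does for multivariate normal variables. A base-R sketch with simulated data and illustrative helper names, using population-style (divide by `n`) moments:

```r
set.seed(2024)
n <- 5000
X <- rgamma(n, shape = 2)
Y <- X + rnorm(n, mean = 3)
Z <- X * Y + rnorm(n, mean = -1)

pcov   <- function(a, b) mean(a * b) - mean(a) * mean(b)
coskew <- function(a, b, d) mean((a - mean(a)) * (b - mean(b)) * (d - mean(d)))

lhs <- pcov(X, Y * Z)
rhs <- mean(Y) * pcov(X, Z) + mean(Z) * pcov(X, Y) + coskew(X, Y, Z)

print(isTRUE(all.equal(lhs, rhs)))
print(coskew(X, Y, Z))   # the part that pairwise covariances cannot capture
```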