The analysis of variance is based on this theorem of probability theory, called the law of total variance:

$\underbrace{\mathrm{Var}(Y)}_{\text{total variance}} = \underbrace{\mathrm{E}(\mathrm{Var}(Y|X))}_{\text{within-group variance}} + \underbrace{\mathrm{Var}(\mathrm{E}(Y|X))}_{\text{between-group variance}}$

$Y$ is a response variable, and $X$ is a grouping variable.
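The identity is easy to check numerically. Here is a minimal sketch in Python, assuming we treat a small made-up data set as the whole population (so the variances are population variances, not sample estimates). The function name `variance_parts` and the data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def variance_parts(groups, ys):
    """Return (total, within, between), treating the data as the whole population."""
    by_group = defaultdict(list)
    for g, y in zip(groups, ys):
        by_group[g].append(y)

    n = len(ys)
    grand_mean = fmean(ys)
    within = 0.0   # E(Var(Y|X)): group variances weighted by group frequency
    between = 0.0  # Var(E(Y|X)): weighted spread of group means around the grand mean
    for values in by_group.values():
        weight = len(values) / n
        within += weight * pvariance(values)
        between += weight * (fmean(values) - grand_mean) ** 2
    return pvariance(ys), within, between


# Made-up illustrative data: X has two levels, Y is the response.
groups = ["a", "a", "a", "b", "b", "b", "b"]
ys = [1.0, 2.0, 3.0, 4.0, 8.0, 12.0, 16.0]
total, within, between = variance_parts(groups, ys)
print(total, within + between)  # the two numbers should agree up to rounding
```

The weights are the group frequencies, which play the role of the distribution of $X$.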

Now we generalize this law to skewness. There are many definitions of skewness, but I prefer the third central moment,

$\mathrm{Skew}(Y) = \mathrm{E}((Y - \mathrm{E}(Y))^3)$

which should remind you of the variance, which is the second central moment,

$\mathrm{Var}(Y) = \mathrm{E}((Y - \mathrm{E}(Y))^2)$

With this definition of skewness, we can get a law of total skewness,

$\underbrace{\mathrm{Skew}(Y)}_{\text{total skew}} = \underbrace{\mathrm{E}(\mathrm{Skew}(Y|X))}_{\text{within-group skew}} + \underbrace{3\mathrm{Cov}(\mathrm{E}(Y|X), \mathrm{Var}(Y|X))}_{\text{variance heterogeneity}} + \underbrace{\mathrm{Skew}(\mathrm{E}(Y|X))}_{\text{between-group skew}}$
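Here is a sketch of where the three terms come from. The shorthand $m(X) = \mathrm{E}(Y|X)$ and $\mu = \mathrm{E}(Y)$ is introduced only for this derivation. Split the deviation from the grand mean into a within-group part and a between-group part,

$Y - \mu = (Y - m(X)) + (m(X) - \mu)$

then cube it and take the expectation conditional on $X$:

$\mathrm{E}((Y - \mu)^3 | X) = \mathrm{Skew}(Y|X) + 3\mathrm{Var}(Y|X)(m(X) - \mu) + 3\mathrm{E}(Y - m(X) | X)(m(X) - \mu)^2 + (m(X) - \mu)^3$

The third term vanishes because $\mathrm{E}(Y - m(X) | X) = 0$. Taking the expectation over $X$ gives

$\mathrm{Skew}(Y) = \mathrm{E}(\mathrm{Skew}(Y|X)) + 3\mathrm{E}(\mathrm{Var}(Y|X)(m(X) - \mu)) + \mathrm{Skew}(m(X))$

and since $\mathrm{E}(m(X) - \mu) = 0$, the middle term is exactly $3\mathrm{Cov}(\mathrm{E}(Y|X), \mathrm{Var}(Y|X))$.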

Notice the variance heterogeneity term that doesn’t appear in the law of total variance. This says that we can get skew even if there is no within- or between-group skew, because of variance heterogeneity. The idea here is that groups with large means, say, could tend to also have large variances, and this will generate positive skew in the overall distribution. A picture may help here:

This picture isn’t the most technically rigorous (e.g. the heights of the curves are probably off), but I hope it gets the point across: variance heterogeneity is a component of total skewness.
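The same point can be made with numbers. Below is a minimal sketch in Python, again treating a small made-up data set as the whole population. The function names `central_moment` and `skew_parts` and the data are hypothetical. The three groups are each symmetric (no within-group skew) and their means are evenly spaced (no between-group skew), but the spread grows with the mean.

```python
from collections import defaultdict
from statistics import fmean


def central_moment(values, k):
    """k-th central moment of the values, treated as a whole population."""
    m = fmean(values)
    return fmean([(v - m) ** k for v in values])


def skew_parts(groups, ys):
    """Return (total, within, heterogeneity, between) third-moment components."""
    by_group = defaultdict(list)
    for g, y in zip(groups, ys):
        by_group[g].append(y)

    n = len(ys)
    weight = {g: len(v) / n for g, v in by_group.items()}
    mean = {g: fmean(v) for g, v in by_group.items()}
    var = {g: central_moment(v, 2) for g, v in by_group.items()}
    skew = {g: central_moment(v, 3) for g, v in by_group.items()}

    grand_mean = sum(weight[g] * mean[g] for g in by_group)
    mean_var = sum(weight[g] * var[g] for g in by_group)

    within = sum(weight[g] * skew[g] for g in by_group)  # E(Skew(Y|X))
    heterogeneity = 3 * sum(                              # 3 Cov(E(Y|X), Var(Y|X))
        weight[g] * (mean[g] - grand_mean) * (var[g] - mean_var) for g in by_group
    )
    between = sum(weight[g] * (mean[g] - grand_mean) ** 3 for g in by_group)  # Skew(E(Y|X))
    return central_moment(ys, 3), within, heterogeneity, between


# Made-up data: symmetric groups, evenly spaced means, spread increasing with the mean.
groups = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
ys = [-1, 0, 1, 8, 10, 12, 17, 20, 23]
total, within, heterogeneity, between = skew_parts(groups, ys)
print(total, within, heterogeneity, between)
```

With this data the within- and between-group terms are zero, so all of the positive total skew comes from the variance heterogeneity term.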

Conclusion: just as we ask “how much variation in $Y$ can be explained by $X$?”, we can also ask “how much skew in $Y$ can be explained by $X$ and by variance heterogeneity?”

February 7, 2013 3:06 am

Cool! It’s not every day I run into a radically new idea in statistics that is actually useful. Any applications you can point to?

• February 7, 2013 3:27 am

Glad you like it. Not sure how radically new it is. I’m very used to rediscovering old ideas. At least there’s no Wikipedia entry!

I got into this stuff in my recent investigations of the influence of intraspecific trait variation on community-level functional diversity dynamics. Using terminology more familiar to evolutionary biologists, it can be used to partition the drivers of disruptive phenotypic selection at the community level.

OK, backing up a bit. I’m planning on doing a post on this stuff soon, but here’s some detail now. Covariance between fitness and a trait is what drives directional phenotypic selection; this is Price’s equation. It turns out that coskewness between fitness and traits drives disruptive phenotypic selection; this is the Lande and Arnold classic (but I don’t think they actually use the word coskewness). There is a conceptual correspondence between functional diversity dynamics and disruptive phenotypic selection (similar to the correspondence Hubbell used in neutral theory). Therefore, coskewness can be used to understand functional diversity dynamics.

Now for the intraspecific variation part. Coskewness, like skewness, variance, and covariance, can be hierarchically partitioned into various within- and between-species components. Therefore, partitioning third central moments could help us understand the importance of intraspecific variation and skewness for functional diversity dynamics.

Sorry if this isn’t all that clear. I’m still trying to get all this stuff straight in my head.