Still lovin’ the chain rule…

July 2, 2012

Been into the chain rule lately. Something really basic but useful just clicked for me (I’m slow). The (higher-dimensional generalization of the) chain rule is just a matrix product (i.e. just a bunch of inner products)!

The basic chain rule is this:

\frac{\partial y}{\partial x} =  \frac{\partial y}{\partial z} \frac{\partial z}{\partial x}

The cool thing is that this is the chain rule no matter what the dimensions of x, y, and z are. So if y is an n-dimensional vector and x is an m-dimensional vector, then the derivative of y with respect to x is an n-by-m matrix called the Jacobian matrix,

\frac{\partial y}{\partial x} = \begin{bmatrix}   \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_n}{\partial x_1} & \cdots & \dfrac{\partial y_n}{\partial x_m}  \end{bmatrix}

So no matter how multidimensional your problem, the rate of change of y with respect to x is just the (matrix) product of the rate of change of y with respect to z and the rate of change of z with respect to x. All very linear and lovely.
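Here's a quick sanity check of that claim, as a minimal sketch rather than anything definitive. The functions g (mapping a 2-vector x to a 3-vector z) and f (mapping z to a 2-vector y) are made up purely for illustration, as are the helper names jac_g, jac_f, matmul, and numerical_jacobian. The idea is to multiply the two hand-derived Jacobians and compare the result against a finite-difference Jacobian of the composite f(g(x)). Note that each entry of the product is literally an inner product: a row of one Jacobian against a column of the other.

```python
# Hypothetical example functions, chosen only for illustration.
def g(x):
    """z = g(x): maps a 2-vector to a 3-vector."""
    x1, x2 = x
    return [x1 * x2, x1 + x2, x1 ** 2]

def f(z):
    """y = f(z): maps a 3-vector to a 2-vector."""
    z1, z2, z3 = z
    return [z1 + z2 * z3, z1 * z3]

def jac_g(x):
    """Hand-derived 3x2 Jacobian dz/dx."""
    x1, x2 = x
    return [[x2, x1], [1.0, 1.0], [2 * x1, 0.0]]

def jac_f(z):
    """Hand-derived 2x3 Jacobian dy/dz."""
    z1, z2, z3 = z
    return [[1.0, z3, z2], [z3, 0.0, z1]]

def matmul(a, b):
    """Matrix product: each entry is the inner product of a row of a with a column of b."""
    return [[sum(r * c for r, c in zip(row, col)) for col in zip(*b)] for row in a]

def numerical_jacobian(func, x, eps=1e-6):
    """Central finite-difference approximation of the Jacobian of func at x."""
    columns = []
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = func(x_plus), func(x_minus)
        columns.append([(p - m) / (2 * eps) for p, m in zip(f_plus, f_minus)])
    # columns[j][i] holds d func_i / d x_j, so transpose to get rows indexed by output.
    return [list(row) for row in zip(*columns)]

x = [1.0, 2.0]
# Chain rule: dy/dx = (dy/dz evaluated at z = g(x)) times dz/dx.
chain = matmul(jac_f(g(x)), jac_g(x))
# Independent check: differentiate the composite numerically.
direct = numerical_jacobian(lambda v: f(g(v)), x)

print(chain)
print(direct)
```

The two printed matrices should agree up to finite-difference error. The one thing to watch is that the Jacobian of f has to be evaluated at z = g(x), not at x.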

This is the coolest part for me: because the chain rule is just a bunch of inner products, it's closely connected with moments from probability theory (e.g. means, variances, covariances), because these things are also just inner products. Related posts: here and here.
