The catch-22 of slide-22: Pedagogical troubles with conjugate priors
I just went through Corey Chivers’ slides for his introductory lecture on Bayes. They reminded me of a question I’ve been chewing on for a long time: How should non-quantitative biologists be introduced to the technical challenges of normalizing constant problems in Bayesian inference? On slide 22 Corey says that there’s ‘not always a closed form solution possible!!’ for these problems. When I first read this I thought that I’d have a criticism of Corey’s slides, but it turns out that slides 23-56 beautifully addressed any concerns I might have had. Nevertheless, I’m going to take this opportunity to get some things off my chest. My question is, should we say ‘not always a closed form solution’ or ‘never a closed form solution’ when teaching this stuff to biologists with a limited quantitative background? Either is certainly better than not mentioning the problem at all, I think.
My first reaction to slide 22 was…well…ok…sometimes you get conjugacy (which in practical terms basically means you can calculate the posterior distribution in Excel if you want), but usually when this happens the statistical problem is simple enough that using Bayesian methods probably does more to complicate matters than to help, because with reference priors most conjugate models lead to conclusions very similar to those of standard classical approaches. So the normalizing constant is closed form when it’s not very useful, but not closed form when it is. In other words…for all practical purposes the normalizing constant is never closed form, not just ‘not always closed form’. Exception: if you actually have prior information to feed into the hyperparameters of the conjugate prior, then the ‘closed form’ thing can be useful. But I never really have any believable and strongly informative prior information in my work…it’s ecology, right? Who knows what’s going on even if you have data! ;)
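To make the conjugacy point concrete, here is a minimal sketch in Python (my choice of language, and the Beta-binomial model is just the textbook example, not anything taken from Corey’s slides). The data values and function names are made up for illustration. It shows two things: the normalizing constant of the posterior kernel has a closed form (a Beta function), and with a flat Beta(1, 1) prior the posterior mode is exactly the classical estimate k/n.

```python
import math

def beta_binomial_posterior(k, n, a=1.0, b=1.0):
    """Posterior parameters for a Beta(a, b) prior and k successes in n trials."""
    return a + k, b + (n - k)

def log_beta(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def numeric_normalizer(a, b, steps=100_000):
    """Midpoint-rule integral of the kernel p^(a-1) * (1-p)^(b-1) over (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        total += p ** (a - 1) * (1 - p) ** (b - 1)
    return total * h

# Hypothetical data: 7 successes in 20 trials, flat Beta(1, 1) prior.
k, n = 7, 20
a_post, b_post = beta_binomial_posterior(k, n)

# Closed-form normalizing constant versus brute-force numerical integration.
closed_form = math.exp(log_beta(a_post, b_post))
numeric = numeric_normalizer(a_post, b_post)

# With a flat prior, the posterior mode coincides with the classical estimate.
posterior_mode = (a_post - 1) / (a_post + b_post - 2)
mle = k / n

print("posterior: Beta(%g, %g)" % (a_post, b_post))
print("normalizer, closed form:", closed_form)
print("normalizer, numeric:    ", numeric)
print("posterior mode:", posterior_mode, " classical estimate:", mle)
```

The update itself is two additions, which is why it fits in a spreadsheet. It is also why, under a flat prior, it tells you little that k/n didn’t already.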
The reason I think this is an important point goes back to when I was first learning this stuff. At that time I found Bayes to be intriguing but didn’t really have the technical skill or philosophical maturity to put it to good use. This naturally led me to using all these conjugate prior methods. I was attracted to the idea that conjugate posteriors could be calculated in Excel. But in the end, to put it bluntly, these conjugate prior methods (on their own) provided me with absolutely zero insight into my data that I wasn’t able to get from classical methods (e.g. OLS, t-tests, ANOVA, GLM, chi-square, bootstrap, permutation test). But since I had built up some inertia learning the stuff, I kept going with it. As I learnt more Bayes I got progressively more and more frustrated: either the problem was too technically challenging for me, or it was so simple that I didn’t need Bayes. It felt like I was either killing a fly with a gun or flying a fighter jet without a pilot’s license.
But then at some point something clicked. I started to get the technical skills needed to use Bayesian methods in a way that actually provided insight that I don’t think I could have acquired any other way. But since getting to this point, I’ve never used a simple conjugate prior model. They’re just not worth it…may as well do a t-test. There’s one caveat here: understanding conjugacy is often useful for constructing more elaborate (and more useful) Bayesian learning algorithms, but this understanding won’t help you until you’ve already learned tons of other stuff.
The catch-22 (of slide-22) of course is that I probably wouldn’t have made it to this point without first believing that conjugate prior models would actually help me in my research. When someone says, ‘you can get started using this wonderful thing called Bayes after about a week of learning about these things called conjugate priors’, that sounds pretty good. But if someone says, ‘ok start with these useless things called conjugate priors, and then in **SUBSTITUTE EMBARRASSINGLY LONG TIME HERE** you will be doing some really cool stuff’, you may think twice about whether or not Bayes is worth it…especially given how hard biology is in the first place. In a different context, Jeremy Fox puts it well: ‘…R packages haven’t reduced the money, time, or physical effort required to conduct field experiments’.
There’s of course no ‘correct’ way to introduce the technical challenges of Bayes to non-quantitative biologists. As always with pedagogy, it depends on the student. But my point is: ethically, I think those of us who teach Bayes have a responsibility to make sure our students understand the time investment that may be required before it really pays off. Especially if you’re slow like me.
Who knows if I’m better off as a scientist as a result of having gone through all of this learning. On one hand, I have a much much much deeper understanding of statistics in general (not just Bayes) from having learnt Bayes. But on the other hand my natural history knowledge has REALLY suffered, and to a lesser extent so has my theoretical ecology knowledge. Don’t get me wrong…I love stats and get loads of satisfaction from my work, but would I have become a better ecologist by not having gone through this whole Bayesian phase? Not sure. I’ve often thought that, like figure skating, PhDs in ecology would be more interesting as a pairs competition: someone to do the math (or some other technical thing like molecular work) and someone to do the ‘ecology’.
EDIT: Ben Bolker’s got one of the few balanced papers on this topic.