The linear algebra of statistical philosophies contains four basis vectors:

• Exploratory analysis, $\vec{t}$ (after Tukey)
• Frequency analysis, $\vec{f}$ (after Fisher)
• Probability modelling, $\vec{b}$ (after Bayes)
• Predictive analysis, $\vec{g}$ (after Geisser)
Any statistical philosophy, $\vec{p}$, can be written as a linear combination of these four basis vectors,

$\vec{p} = \underbrace{\tau\vec{t}}_{\text{exploration}} + \underbrace{\zeta\vec{f}}_{\text{p-values and conf. intvls}} + \underbrace{\beta\vec{b}}_{\text{Bayes}} + \underbrace{\gamma\vec{g}}_{\text{prediction}}$

The coefficients $\tau$, $\zeta$, $\beta$, and $\gamma$ depend on the data, the objectives of the study, and the investigators. My usual position in this space has

• $\tau \gg 0$ (I’m a community ecologist…things are complex…as if we could decide on statistical procedures before looking at the data!)
• $\zeta > 0$ (I’m great at making up stories based on my explorations, so I need p-values and confidence intervals for catching my cognitive biases.)
• $\beta > 0$ (Probability modelling helps me to better understand the information in the data, and therefore to obtain better estimates and models)
• $\gamma \gg 0$ (I’m an observer, not an experimentalist, so I find predictive ability to be a much better guide to model performance and understanding than causal considerations…but that’s just me.)

Jeremy Fox would probably look more like this,

• $\tau = \epsilon$ (He likes experimental designs that isolate effects a priori, and probably likes to minimize any subjective exploration that might invalidate his estimates of error rates…but probably also recognizes that a bit of exploratory analysis is necessary for practical purposes)
• $\zeta \gg 0$ (Being experimental, his error rates are very meaningful)
• $\beta \ll 0$ (He hates Bayes…but may be coming around? 😉 )
• $\gamma = \epsilon$ (He’s a causal kind of guy…I’d bet that prediction for him comes from understanding and doesn’t need to be the focus of statistical analysis)

Statistical agnostics will be near the centre of the space,

• $\tau \approx 0$
• $\zeta \approx 0$
• $\beta \approx 0$
• $\gamma \approx 0$

while pure exploratory analysts will sit far out along $\vec{t}$,

• $\tau \gg 0$
• $\zeta \approx 0$
• $\beta \approx 0$
• $\gamma \approx 0$
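For fun, the positions above can be written down as literal 4-vectors. A minimal sketch (the coordinates are, of course, made up for illustration):

```python
import numpy as np

# Hypothetical coordinates (tau, zeta, beta, gamma) on the four basis
# vectors t, f, b, g -- the numbers are invented for illustration only.
me = np.array([3.0, 1.0, 1.0, 3.0])       # tau >> 0, zeta > 0, beta > 0, gamma >> 0
jeremy = np.array([0.1, 3.0, -3.0, 0.1])  # tau = eps, zeta >> 0, beta << 0, gamma = eps
agnostic = np.zeros(4)                    # near the centre of the space

# Euclidean distance between two statistical philosophies
print(np.linalg.norm(me - jeremy))
```

which makes it easy to quantify just how far apart two methodologists sit in the space.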

This is all a bit silly, but the point is: there are good reasons for choosing your ‘coefficients’ in ways that are appropriate for your kind of science, but I think it’s a little naive to think that your favourite position in statistical philosophy space is for everyone. Throughout centuries of research and debate on statistical philosophy, these four basis vectors have persisted because the many, many, many critiques leveled against any one of them have always had reasonable responses. Examples,

• $\vec{t}$ can reduce objectivity (e.g. invalidate estimates of type I error), BUT without it we run the risk of missing something unexpected and important that the data are telling us.
• $\vec{f}$ can leave us wondering what error rates are relevant (e.g. the confusion surrounding multiple comparisons), BUT how else can we probe our deep cognitive potential for seeing something in the data that isn’t there?
• $\vec{b}$ leads to paradoxes associated with prior distributions and other unwanted sensitivities to model specification, BUT it has often been very successful in practice at leading to reasonable estimates that are evaluated well by $\vec{t}$, $\vec{f}$, and $\vec{g}$, and also provides a good way to formalize our thoughts on a particular data generating process.
• $\vec{g}$ can lead us to predictively successful models that mislead with respect to causality, BUT it provides us with a reality check about how ‘useful’ our ‘significant’ results might actually be at reducing our uncertainty about the unobserved.

EDIT: I forgot decision theory as an important basis vector; it just doesn’t come up much in my own research.

October 24, 2012 2:35 am

I was going to post my position in 4-D vector space, but then I realized it was exactly the same as yours. In the grad stats class I teach, I make the students write a short paper, analyzing not just the statistical methods they will use but also locating themselves in 3-D space (I don’t have your Bayesian/probability dimension). Students always seem surprised to be asked to think about their goals and objectives at such a fundamental level.

• October 24, 2012 1:35 pm

Ha! I guess it makes sense we’re the same kind of methodologist…you’ve certainly had an influence on my approach.

About your students…it’s funny…I’ve always had the opposite problem: I’m surprised when it’s time to stop thinking at a philosophical level and start actually figuring things out about nature!

October 24, 2012 2:39 am

3. October 24, 2012 7:58 pm

This is fun Steve, there’s a plug at Dynamic Ecology on its way on Friday.

And yeah, you’ve placed me pretty well. One thing about blogging is that people can place you like this. At this point, my views are probably sufficiently well known that actually blogging about them is superfluous. Everybody knows, or can reliably guess, what I think about everything I think about. 😉

Re: prediction, yeah, I tend to care more about understanding and explanation than prediction. I also think that, if the goal is out-of-sample prediction (i.e. extrapolation), then causal understanding is essential. Approaches that aim to be purely empirical, data-driven, and phenomenological are apt to fail miserably for out-of-sample prediction.

• October 25, 2012 1:22 pm

Sorry for not replying sooner…I was babysitting last night and the little tyke wasn’t interested in blogging for some reason.

Re placing you: I think that this is a positive about blogs. They fill a role that peer review can’t. With peer review, you can’t just say what you think because you might offend a reviewer or editor, which leads to a tip-toeing style of writing that can be boring. Gelman (http://andrewgelman.com/2012/09/speaking-frankly/) has made another point in this direction: in blogs you can be honest about your own assessment of your work (i.e. point out the benefits AND the problems), whereas in journals there’s a temptation to relentlessly sell your ideas even if you’re not sure of them yourself.

I’m just thinking out loud here, but do we really need blogs to replace peer-reviewed journals? Hope I’m not getting my history of science wrong, but wasn’t there a time before journals when everyone just published books? Some scientists must have been skeptical about journals, while others adopted them, and in the end both have coexisted just fine. Maybe now we’ll just have three types of outlets for our work: books, journal articles, and internet self-publishing. (You’re probably beginning to place me as one of those, to quote Rodney King, ‘why can’t we all just get along?’ types.)

Re prediction: When I say prediction I *always* mean out-of-sample prediction (or estimates of out-of-sample prediction like cross-validation, AIC, DIC—but NOT BIC). Within-sample assessments (like R^2) are in the direction of exploratory analysis (or frequency analysis if they are coupled with tests of significance or use adjusted versions of R^2), but definitely not predictive analysis. It’s a bit of a pet peeve of mine when people use R^2 to justify statements like ‘x predicts 50% of the variation in y’ (and no, using adjusted R^2 doesn’t help here). Prediction means you’ve made a mutually exclusive separation between the data used for modelling and those used for testing those models. CAVEAT: A related issue has been studied by Volker Bahn (and others whom I can’t remember right now) on how dividing data in this way effectively assumes IID (independently and identically distributed) data…but spatial and temporal autocorrelation may invalidate the independence part of this assumption.
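The within-sample vs out-of-sample distinction can be illustrated with a minimal numpy sketch (simulated data; the seed, sample sizes, and polynomial degree are arbitrary choices for illustration): an overfit model can post a flattering within-sample R^2 while doing much worse on data it never saw.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y depends only weakly on x, with plenty of noise.
x = rng.uniform(0, 1, 100)
y = 0.5 * x + rng.normal(0, 1, 100)

# Mutually exclusive split: first 70 points for fitting, last 30 for testing.
x_fit, y_fit = x[:70], y[:70]
x_test, y_test = x[70:], y[70:]

def r_squared(y_obs, y_pred):
    """Proportion of variance 'explained' relative to a mean-only model."""
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - np.mean(y_obs)) ** 2)
    return 1 - ss_res / ss_tot

# Deliberately overfit with a high-degree polynomial.
coefs = np.polyfit(x_fit, y_fit, deg=10)
r2_in = r_squared(y_fit, np.polyval(coefs, x_fit))    # within-sample
r2_out = r_squared(y_test, np.polyval(coefs, x_test))  # out-of-sample

print(f"within-sample R^2:  {r2_in:.2f}")
print(f"out-of-sample R^2:  {r2_out:.2f}")
```

The within-sample R^2 will exceed the out-of-sample one here, which is exactly why ‘x predicts 50% of the variation in y’ can’t be justified from R^2 alone.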

Also, there’s a difference between out-of-sample prediction and extrapolation: all extrapolations are out-of-sample predictions, but NOT all out-of-sample predictions are extrapolations. Extrapolation means that you fit your species distribution model in Ontario and then test it in Quebec. Out-of-sample prediction that’s not extrapolation (i.e. interpolation) is when you do the testing at sites in Ontario that are interspersed (spatially, temporally, and environmentally) amongst the original Ontario sites. There is very interesting work in this direction (e.g. http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2011.00170.x/abstract).

More generally the idea is that predictive analysis can provide more comprehensive inferences if you both interpolate and extrapolate.
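The two kinds of hold-out can be sketched with hypothetical site coordinates (a minimal numpy sketch; the coordinates, sample size, and the boundary at 75 are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical survey sites along one spatial axis (say, longitude).
coords = rng.uniform(0, 100, 200)

# Interpolation split: test sites randomly interspersed among training sites.
idx = rng.permutation(coords.size)
interp_test = idx[:50]   # random 25% hold-out, spread across the whole region

# Extrapolation split: test sites form a contiguous block beyond the
# training region (fit 'in Ontario', test 'in Quebec').
extrap_test = np.where(coords > 75)[0]

print(interp_test.size, extrap_test.size)
```

A study that evaluates predictions under both splits gets a more comprehensive picture of model performance than either one alone.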

One final thought is that the Bayesian direction in philosophy space may have some interesting solutions here, by using hierarchical models that mix both interpolation and extrapolation: http://andrewgelman.com/2012/06/hierarchical-modeling-as-a-framework-for-extrapolation/