Wednesday, March 5, 2008

X-Rays and Bayes

Most radiologists hate statistics. Heck, most of the people I know with higher education and any sense are still nursing a deep grudge against what little statistics was crammed down their throats years ago. Since I actually enjoy number crunching and data analysis, I am considered something of an outlier in my specialty.

It's becoming harder to be a competent physician these days without some familiarity with basic stats. Even in a show-and-tell field like radiology, one needs to know advanced statistical techniques to fully comprehend at least 20% of the articles in the two major U.S. radiology journals. That proportion is probably much higher in the more fundamentalist specialties, such as internal medicine, where randomized, double-blind, controlled studies are considered holy writ.

Why do people get turned off by statistics? Could it be the dense jargon? The plethora of oddly-named statistical tests? The awkward way that one has to phrase and interpret a simple freaking hypothesis test?

For example, consider the following hypothetical exchange (pun intended) between a clinical researcher and a classical statistician:

Q. Which is more effective -- treatment A or treatment B?

A. The null hypothesis that treatment A is not more effective than treatment B is rejected at the 5% level, i.e. P < 0.05.

Q. Er, um, so in other words, there's a 95% chance that they are different?

A. No. It means that if we were to repeat the analysis a bunch of times, using new data each time, and the null hypothesis were really true, then we would falsely reject it only 5% of the time.

Criminy. Even radiologists, normally the Jedi Masters of the weasel word, would be ashamed to hedge this badly in one of their dictations.
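For anyone who would rather see the statistician's answer than parse it, here is a minimal Python sketch of that "repeat the analysis a bunch of times" idea. Everything in it is my own assumption for illustration: the function names, the made-up trial size and response rate, and the choice of a simple one-sided two-proportion z-test. It simulates many trials in which the two treatments are truly identical and counts how often the test rejects the null hypothesis anyway.

```python
import random
from statistics import NormalDist


def one_sided_p_value(successes_a, successes_b, n):
    """P-value for 'A is more effective than B' from a two-proportion z-test.

    Assumes n patients per arm (a hypothetical design, not from any real study).
    """
    pooled = (successes_a + successes_b) / (2 * n)
    se = (2 * pooled * (1 - pooled) / n) ** 0.5
    if se == 0:
        return 1.0  # no variation at all, so no evidence either way
    z = (successes_a / n - successes_b / n) / se
    return 1 - NormalDist().cdf(z)


def false_rejection_rate(trials=2000, n=200, rate=0.5, alpha=0.05, seed=1):
    """Fraction of simulated trials that reject the null although it is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Both arms share the same true response rate, so the null holds.
        a = sum(rng.random() < rate for _ in range(n))
        b = sum(rng.random() < rate for _ in range(n))
        if one_sided_p_value(a, b, n) < alpha:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    print(f"False rejection rate: {false_rejection_rate():.3f}")
```

The printed fraction should land somewhere near 0.05, which is all the 5% level promises. Note what it does not tell you: the probability that treatment A really is better.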

Fortunately, there is an alternative -- Bayesian statistics -- that allows one to reject the "reject the null hypothesis" school of statistics and couch hypotheses and conclusions in more familiar terms. Like standard English. The name "Bayesian" comes from Thomas Bayes, a Presbyterian minister and mathematician who died in 1761. His eponymic theorem forms the basis for Bayesian inference, and was published in 1764 by a friend, after Bayes' death.

Hmmmm.... 1764 you say? If this theorem is so darned useful, why didn't we start using it a bit sooner than now?

The main reason seems to be that crunching numbers the Bayesian way can be computationally intensive. By "computationally intensive", I mean "impossible without a computer". Even with today's swift computers, techniques such as Markov chain Monte Carlo (MCMC) can eat up a lot of CPU time.
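To give a flavor of the plain-English answer a Bayesian analysis can offer, here is a minimal Python sketch for a case simple enough that no MCMC is needed. The assumptions are all mine: a flat Beta(1, 1) prior on each treatment's response rate, invented patient counts, and a hypothetical function name. With that prior, each posterior is a Beta distribution, so the computer only has to draw from the two posteriors and count how often A comes out ahead.

```python
import random


def prob_a_beats_b(success_a, n_a, success_b, n_b, draws=100_000, seed=1):
    """Estimate the posterior probability that A's response rate exceeds B's.

    Assumes independent binomial outcomes and a flat Beta(1, 1) prior for
    each treatment, which makes each posterior Beta(successes + 1, failures + 1).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_a = rng.betavariate(success_a + 1, n_a - success_a + 1)
        p_b = rng.betavariate(success_b + 1, n_b - success_b + 1)
        if p_a > p_b:
            wins += 1
    return wins / draws


if __name__ == "__main__":
    # Made-up data: 30 of 50 patients respond to A, 20 of 50 respond to B.
    p = prob_a_beats_b(30, 50, 20, 50)
    print(f"Probability that A is more effective than B: {p:.2f}")
```

The result reads the way the clinical researcher wanted to phrase things in the first place: given the data and the prior, this is the chance that A is better than B. Realistic models rarely have posteriors this tidy, which is where techniques like MCMC and the CPU time come in.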

For those of us who are not statisticians, a Who's-Best argument between classical (frequentist) and Bayesian statisticians can sound a lot like a squabble between Plain-Bellied and Star-Bellied Sneetches. However, there do seem to be a number of potential benefits to adopting the Wayes of Bayes. To help you decide whether you wish to care further about this topic, there is a very nicely written and non-quantitative (and free) primer online: "Primer on Bayesian Statistics in Health Economics and Outcomes Research" by O'Hagan and Luce. I'm up to page 20, myself, and it's a page-turner.

For further reading, Kimball Atwood has posted a great series on the utility of Bayesian statistics in clinical research at the Science-Based Medicine blog. A good place to start reading this series would be here.

For now, I'm off to a prior engagement, probably sitting on my posterior and making my way a bit further through the maze of Bayes.
