Wednesday, February 27, 2008

Take the First Chernoff on the Left

There is no topic, no matter how intrinsically tedious or boring, that cannot be made even more tedious and boring with the right set of Bad PowerPoint Slides.
The converse is also true -- a few engaging images can make even a fairly mundane topic sing and dance. Radiology lectures are an obvious exemplar of this, being filled with lots of fascinating images. However, what if you need to present a ton of data? If so, beware -- your task will be a mighty one. Decades of punitive PowerPoints have pretty well proven that any slide with a big table of data is going to have an LD50 of under 30 seconds.

A lot of creativity has gone into solving this problem. One intriguing method was proposed by Herman Chernoff back in 1973 in his paper: "The Use of Faces to Represent Points in k-Dimensional Space Graphically", Journal of the American Statistical Association 1973;68:361-368.

Chernoff's proposal relies on the notion that the human eye-brain combination is one of the most powerful pattern-recognition engines on the planet. If one is trying to convey a large multidimensional table of numbers, he suggested representing each observation as a computer-drawn cartoon of a human face,
...whose features, such as length of nose and curvature of mouth, correspond to components of the point. Thus every multivariate observation is visualized as a computer-drawn face. This presentation makes it easy for the human mind to grasp many of the essential regularities and irregularities present in the data.
For example, consider the following table of totally bogus data that I made up just now, all by myself. It purports to show income level, quality of life and onerousness of call for 4 physician specialties.

Specialty        Income  Quality of Life  Onerousness of Call
Radiology        Medium  High             Medium
Pediatrics       Low     Medium           Low
Family Practice  Low     Medium           High
Surgery          High    Low              High

Yep, this table is a real dozer, all right. But wait! What if we represent the same data as Chernoff faces? To do this, I fired up my copy of Mathematica and used it to run some sample code I downloaded from the Mathworld site (R enthusiasts see here for sample code). In the plot below, the shape of the eyes, mouth and head respectively represent income, quality of life and onerousness of call. My fabricated data now looks something like this:

Of course, this kind of plot has certain potential flaws. As K. S. Park points out:
A major drawback of Chernoff faces is that the subjective assignment of facial expressions to variables affects on the shape of the face.
No feces, Conan-Doyle character. My fabricated and highly biased data, coupled with an equally biased assignment of facial features, makes that first Chernoff on the left look pretty darned appealing, doesn't it?
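
For anyone who wants to see just how arbitrary that assignment is, here is a minimal Python sketch of the encoding described above: eyes for income, mouth for quality of life, head shape for onerousness of call. To be clear, this is not the Mathworld code I actually ran. Every function name, numeric range and the choice of SVG output is my own assumption, made up purely for illustration. Flip any of the ranges and the "appealing" face changes hands.

```python
# Hypothetical sketch of a Chernoff-style encoding; names and ranges are
# illustrative assumptions, not taken from the Mathworld or R sample code.

LEVELS = {"Low": 0.0, "Medium": 0.5, "High": 1.0}

# The made-up table from this post: (income, quality of life, onerousness of call)
DATA = {
    "Radiology": ("Medium", "High", "Medium"),
    "Pediatrics": ("Low", "Medium", "Low"),
    "Family Practice": ("Low", "Medium", "High"),
    "Surgery": ("High", "Low", "High"),
}


def lerp(lo, hi, t):
    """Linear interpolation: t=0 gives lo, t=1 gives hi."""
    return lo + (hi - lo) * t


def face_params(income, quality, call):
    """Map the three ordinal levels onto three facial features."""
    return {
        # Higher income -> bigger eyes (an arbitrary choice).
        "eye_radius": lerp(4.0, 10.0, LEVELS[income]),
        # Higher quality of life -> smile; lower -> frown.
        "mouth_curve": lerp(-15.0, 15.0, LEVELS[quality]),
        # More onerous call -> narrower, gaunter head.
        "head_half_width": lerp(40.0, 25.0, LEVELS[call]),
    }


def face_svg(p):
    """Render one face as a small standalone SVG string."""
    cx, cy = 50, 60
    mouth_y = cy + 20
    # SVG y grows downward, so a control point below the mouth line is a smile.
    ctrl_y = mouth_y + p["mouth_curve"]
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="120">'
        f'<ellipse cx="{cx}" cy="{cy}" rx="{p["head_half_width"]}" ry="45" '
        'fill="none" stroke="black"/>'
        f'<circle cx="{cx - 12}" cy="{cy - 10}" r="{p["eye_radius"]}" '
        'fill="none" stroke="black"/>'
        f'<circle cx="{cx + 12}" cy="{cy - 10}" r="{p["eye_radius"]}" '
        'fill="none" stroke="black"/>'
        f'<path d="M {cx - 15} {mouth_y} Q {cx} {ctrl_y} {cx + 15} {mouth_y}" '
        'fill="none" stroke="black"/>'
        "</svg>"
    )


def all_faces():
    """Return a dict mapping each specialty to its SVG face."""
    return {name: face_svg(face_params(*row)) for name, row in DATA.items()}
```

Calling all_faces() gives one SVG string per specialty, which you can save to a file and open in a browser. The point is less the drawing than the three lerp ranges in face_params: they are exactly the "subjective assignment" Park is complaining about.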

A discussion of Chernoff faces wouldn't be complete without mentioning the work of Eugene Turner, particularly his prize-winning map titled Life in Los Angeles. He has this to say about it:
I compiled and designed this map which was drafted by Richard Doss. It was an idea based on Chernoff faces. It is probably one of the most interesting maps I've created because the expressions evoke an emotional association with the data. Some people don't like that.
Here's one final fun example from the world of baseball: Alex Reisner's What's the Matter with Chernoff Faces, illustrated by his plot of the 2005 National League season. I especially enjoyed the alternative system he suggests there: Reisner faces.

Update, 2/28/08:   I realize that I sort of left things hanging here, with no final socko conclusion rendered on the utility of Chernoff faces.  This is probably because I still don't have a strong opinion.  However, now that I have the machinery to plot these faces, I'll look for ways to apply them to real data and see how they work for me.  My goal is to avoid using them like the person in Andrew Lang's quote, who
...uses statistics as a drunken man uses lamp-posts -- for support rather than illumination.
