Wednesday, May 22, 2013

Seeing the Forest for the Splotchy Green Blob

Lesson for today: statistics is hard. Specifically, it's hard to sort out real patterns from random noise, especially when you don't have a lot of data. Nonetheless, people sure do their darndest.

A little bit randomly (I followed a link from Facebook, then another mentioned on a Web site), I found this: Cancer Cluster or Chance? by George Johnson, a New York Times writer who's been working on a book on cancer for a couple of years now. His story in Slate explains something called the Texas Sharpshooter Effect.

I'll leave the colorful details to others, but basically what this effect means is that it's very easy to take a slight uptick in the rate of cancer over time or an area with a slightly higher incidence of cancer and infer that something must have caused that. Referring to stories like A Civil Action or Erin Brockovich, Johnson writes, "the bigger story was how human grief can drive the brain to see cause and effect whether or not it's really there."
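To get a feel for how easily a "cluster" shows up by chance, here is a minimal sketch in Python. The numbers are made up for illustration (1,000 cases scattered uniformly at random over 100 hypothetical, equal-sized towns); nothing here comes from Johnson's story or from real cancer data.

```python
import random
from collections import Counter


def scatter_cases(n_cases=1000, n_towns=100, seed=0):
    """Assign each case to a town uniformly at random; return cases per town.

    All parameters are illustrative assumptions, not real epidemiological data.
    """
    rng = random.Random(seed)
    tally = Counter(rng.randrange(n_towns) for _ in range(n_cases))
    # Include towns that happened to receive zero cases.
    return [tally[town] for town in range(n_towns)]


counts = scatter_cases()
expected = sum(counts) / len(counts)  # average cases per town

print(f"expected cases per town: {expected:.1f}")
print(f"unluckiest town: {max(counts)} cases")
print(f"luckiest town:   {min(counts)} cases")
```

Every town has exactly the same underlying risk, yet the unluckiest one will typically sit well above the average. Draw the target around that town after the fact and you have a Texas Sharpshooter.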

The key is "whether or not it's really there." Now, seriously, I'm not taking a stand on whether or not chromium 6 from a PG&E plant in Hinkley, Calif., led to 196 cancers and made Erin Brockovich famous. I'm not an epidemiologist or a cancer specialist or a lawyer for that matter.

But I do know a thing or two about how the brain works, and the thing that isn't key is "grief." It takes very little (actually, nothing at all) for people to find patterns in noise. This manifests itself in all kinds of ways: the belief in streaks in sports; the gambler's fallacy, i.e., the belief that after enough tails, heads becomes more likely; and this particularly entertaining paper. Well, I haven't read the paper, but the title's entertaining.
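The gambler's fallacy is easy to check with a simulation. The sketch below is only an illustration under assumed settings (a fair simulated coin, 200,000 flips, a streak length of three): it finds every run of three tails and looks at what the very next flip was.

```python
import random


def heads_rate_after_tails(n_flips=200_000, streak=3, seed=0):
    """Fraction of heads among flips that immediately follow `streak` tails.

    The coin is simulated and fair; the parameter values are illustrative.
    """
    rng = random.Random(seed)
    flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True means heads
    followers = [
        flips[i]
        for i in range(streak, n_flips)
        if not any(flips[i - streak:i])  # the previous `streak` flips were all tails
    ]
    return sum(followers) / len(followers)


rate = heads_rate_after_tails()
print(f"share of heads right after three tails: {rate:.3f}")
```

The share should land very close to one half. The coin doesn't remember the tails, even if we do.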

Again, this doesn't mean that there's no such thing as cancer clusters or maybe even ESP. What it means is that very often we just can't tell, and we have to be awfully careful to avoid seeing the forest for the possibly-non-existent trees.


  1. I am increasingly worried about this and related subjectivity in science. I suspect that we "know" a lot of things that just aren't really true.

  2. I am similarly worried. It's really easy for people to believe that one thing affects another when it's just spurious correlation — never mind that we're also likely to infer causation from correlation.

    Worse, more science in schools won't necessarily help. How many scientists, public policy researchers, and others understand that, at the conventional 0.05 significance level used in social science, roughly one out of every 20 studies of an effect that doesn't exist will still come out statistically significant, purely from noise?

  3. I often feel like proximate cause is a dying concept. At least, it feels that way from a defense perspective.

  4. Ah, I see what you did there. And it's a good point.
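
The one-in-20 figure from the second comment can also be sketched as a simulation. Assume, purely for illustration, 2,000 hypothetical studies that each measure 30 values of pure noise (standard normal, so the true effect is exactly zero) and run a two-sided z-test at the 0.05 level. The function name and all the settings are made up for this example.

```python
import math
import random


def false_positive_rate(n_studies=2000, n_per_study=30, alpha=0.05, seed=0):
    """Share of null 'studies' that reach significance by chance alone.

    Each study draws standard-normal noise (true mean 0, known variance 1)
    and applies a two-sided z-test. All settings are illustrative assumptions.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_per_study)]
        mean = sum(sample) / n_per_study
        z = mean * math.sqrt(n_per_study)            # z statistic with sigma = 1
        p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
        if p_value < alpha:
            hits += 1
    return hits / n_studies


fp_rate = false_positive_rate()
print(f"'significant' results with no real effect: {fp_rate:.1%}")
```

The result should hover around 5%, which is the sense in which about one study in 20 finds something when nothing is there.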