Correlation is not Causation, but who cares?

Jun 25, 2008 09:00

Fascinating article from Wired. The basic argument is that the ability to process massive amounts of data makes traditional methods of sampling and extrapolation obsolete.

From the article:

"Scientists are trained to recognize that correlation is not causation, that no conclusions should be drawn simply on the basis of correlation between X and Y (it could just be a coincidence). Instead, you must understand the underlying mechanisms that connect the two. Once you have a model, you can connect the data sets with confidence. Data without a model is just noise.

But faced with massive data, this approach to science - hypothesize, model, test - is becoming obsolete. ...

There is now a better way. Petabytes allow us to say: "Correlation is enough." We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot."

100 days, technology
