Data are good, which is why, as Neil deGrasse Tyson has been reminding us, logical empiricism is an acid that, eventually, eats through many a research question.

But I’m beginning to understand the extent to which Big Data, while capturing subtle correlations, suffers from a variety of overheads, not least our era’s penchant for more information and less understanding.

There is, as Gary Marcus and Ernest Davis described earlier this week, the difference between pattern and process (and science and statistics) and whether said correlations are anything more than stochastic.

There is the problem of Nassim Taleb’s fourth quadrant: rare, high-impact events unlike anything that produced the data we’ve collected so far.

There is, depending on the color and shape of statistical noise, the off chance that additional data reduce statistical power. As defensive reactions to Edward Snowden’s revelations about the NSA’s spying remind us, a proverbial needle may be harder to find the more hay is added (though in the security state’s case, collecting the hay may be entirely the point).

And there is the Foucauldian matter, as Gina Neff puts it, that people–and authorities–imagine and value what data mean and represent in different ways. Indeed, collecting more data might be a way of avoiding giving a problem the resources it requires. Ask any inner-city doctor whose ER has suffered multiple rounds of budget cuts.

Numerical data are vehemently, even violently, political.

