But I’m beginning to understand the extent to which Big Data, while capturing subtle correlations, carries a variety of overhead, not the least of which is our era’s penchant for more information and less understanding.
There is, as Gary Marcus and Ernest Davis described earlier this week, the difference between pattern and process (and between science and statistics), and the question of whether said correlations are anything more than stochastic.
There is the problem of Nassim Taleb’s fourth quadrant: inopportune impacts from events unlike those that produced the data we’ve collected so far.
There is, depending on the color and shape of statistical noise, the off chance that additional data reduce statistical power. As defensive reactions to Edward Snowden’s revelations about the NSA’s spying remind us, a proverbial needle may be harder to find the more hay is added (though in the security state’s case, collecting the hay may be entirely the point).
And there is the Foucauldian matter, as Gina Neff puts it, that people, and authorities, imagine and value what data mean and represent in different ways. Indeed, collecting more data can be a way to avoid giving a problem the resources it requires. Ask any inner-city doctor whose ER has suffered multiple rounds of budget cuts.