Thinking About Big Data

I want to pass on to you three pieces on what has come to be known as Big Data, a diverse set of practices enabled by the power of modern computing to accumulate and process massive amounts of data. The first piece, “View from Nowhere,” is by Nathan Jurgenson. Jurgenson argues that the aspirations attached to Big Data, particularly in the realm of human affairs, amount to a revival of Positivism:

“The rationalist fantasy that enough data can be collected with the ‘right’ methodology to provide an objective and disinterested picture of reality is an old and familiar one: positivism. This is the understanding that the social world can be known and explained from a value-neutral, transcendent view from nowhere in particular.”

Jurgenson goes on to challenge these positivist assumptions through a critical reading of OkCupid co-founder Christian Rudder’s new book Dataclysm: Who We Are (When We Think No One’s Looking).

The second piece is an op-ed in the New York Times by Frank Pasquale, “The Dark Market for Personal Data.” Pasquale considers the risks to privacy associated with the gathering and sale of personal information by companies equipped to mine and package such data. Pasquale concludes,

“We need regulation to help consumers recognize the perils of the new information landscape without being overwhelmed with data. The right to be notified about the use of one’s data and the right to challenge and correct errors is fundamental. Without these protections, we’ll continue to be judged by a big-data Star Chamber of unaccountable decision makers using questionable sources.”

Finally, here is a journal article, “Obscurity and Privacy,” by Evan Selinger and Woodrow Hartzog. Selinger and Hartzog offer obscurity as an explanatory concept to help clarify our thinking about the sorts of issues that usually get lumped together as matters of privacy. Privacy, however, may not be a sufficiently robust concept to meet the challenges posed by Big Data.

“Obscurity identifies some of the fundamental ways information can be obtained or kept out of reach, correctly interpreted or misunderstood. Appeals to obscurity can generate explanatory power, clarifying how advances in the sciences of data collection and analysis, innovation in domains related to information and communication technology, and changes to social norms can alter the privacy landscape and give rise to three core problems: 1) new breaches of etiquette, 2) new privacy interests, and 3) new privacy harms.”

In each of these areas, obscurity names the relative confidence individuals can have that the data trail they leave behind as a matter of course will not be readily accessible:

“When information is hard to understand, the only people who will grasp it are those with sufficient motivation to push past the layer of opacity protecting it. Sense-making processes of interpretation are required to understand what is communicated and, if applicable, whom the communications concerns. If the hermeneutic challenge is too steep, the person attempting to decipher the content can come to faulty conclusions, or grow frustrated and give up the detective work. In the latter case, effort becomes a deterrent, just like in instances where information is not readily available.”

Big Data practices have made it increasingly difficult to achieve this relative obscurity, thus posing a novel set of social and personal challenges. For example, the risks Pasquale identifies in his op-ed may be understood as risks that follow from a loss of obscurity. Read the whole piece for a better understanding of these challenges. In fact, be sure to read all three pieces. Jurgenson, Selinger, and Pasquale are among our most thoughtful guides in these matters.

Allow me to wrap this post up with a couple of additional observations. Returning to Jurgenson’s thesis that Big Data is a neo-Positivist ideology, I’m reminded that positivist sociology, or social physics, was premised on the assumption that the social realm operated in predictable, law-like fashion, much as the natural world operated according to the Newtonian world picture. In other words, human action was, at root, rational and thus predictable. The early twentieth century profoundly challenged this confidence in human rationality. Think, for instance, of the carnage of the Great War and the rise of Freudianism. Suddenly, humanity seemed less rational and, consequently, the prospect of uncovering law-like principles of human society must have seemed far less plausible. Interestingly, this irrationality preserved our humanity, insofar as our humanity was understood to consist of an irreducible spontaneity, freedom, and unpredictability. It did so, in other words, so long as the Other against which our humanity was defined was the Machine.

If Big Data is neo-Positivist, and I think Jurgenson is certainly on to something with that characterization, it aims to transcend the earlier failure of Comtean Positivism. It acknowledges the irrationality of human behavior, but it construes that irrationality, paradoxically, as Predictable Irrationality. In other words, it suggests that we can know what we cannot understand. This recalls Evgeny Morozov’s critical remarks in “Every Little Byte Counts”:

“The predictive models Tucker celebrates are good at telling us what could happen, but they cannot tell us why. As Tucker himself acknowledges, we can learn that some people are more prone to having flat tires and, by analyzing heaps of data, we can even identify who they are — which might be enough to prevent an accident — but the exact reasons defy us.

Such aversion to understanding causality has a political cost. To apply such logic to more consequential problems — health, education, crime — could bias us into thinking that our problems stem from our own poor choices. This is not very surprising, given that the self-tracking gadget in our hands can only nudge us to change our behavior, not reform society at large. But surely many of the problems that plague our health and educational systems stem from the failures of institutions, not just individuals.”

It also suggests that some of the anxieties associated with Big Data may not be unlike those occasioned by the earlier positivism: they are anxieties about our humanity. If we buy into the story Big Data tells about itself, then it threatens, finally, to make our actions scrutable and predictable, suggesting that we are not as free, independent, spontaneous, or unique as we might imagine ourselves to be.

3 thoughts on “Thinking About Big Data”

  1. cf. http://bds.sagepub.com/content/1/1/2053951714538417.full

    ## Official statistics and Big Data

    > The rise of Big Data changes the context in which organisations producing official statistics operate. Big Data provides opportunities, but in order to make optimal use of Big Data, a number of challenges have to be addressed. This stimulates increased collaboration between National Statistical Institutes, Big Data holders, businesses and universities. In time, this may lead to a shift in the role of statistical institutes in the provision of high-quality and impartial statistical information to society.
