King agrees that context is key. He is co-founder and chief scientist of Crimson Hexagon, a big-data analytics firm that, in the words of Wayne St. Amand, its executive vice president of marketing, seeks to provide "context, meaning and structure to online conversations."
Yet there are increasing examples of data without context driving decisions. The Wall Street Journal reported in February on health insurance companies using Big Data to create profiles of their members. Among the things the companies tracked was a history of buying plus-sized clothes, which could lead to a mandatory referral to weight-loss programs.
Few people would argue with encouraging people to live healthier lives, but the privacy implications are disturbing. The person buying those clothes might have been doing so for another family member. And it is not always so benign. Bloomberg BusinessWeek reported in 2008 on individuals being denied health insurance based on a history of prescription drug purchases that suggested even minor mental health conditions.
Adam Frank, writing on the National Public Radio blog, noted that in some cases banks will deny a loan to someone based in part on their contacts on the employment networking site LinkedIn or the social networking behemoth Facebook. If your "friends" are deadbeats, your credit-worthiness may be based on their reliability.
Frank quoted Jay Stanley, senior policy analyst at the ACLU, noting on that group's blog that "Credit card companies sometimes lower a customer's credit limit based on the repayment history of the other customers of stores where a person shops. Such 'behavioral scoring' is a form of economic guilt-by-association based on making statistical inferences about a person that go far beyond anything that person can control or be aware of."
Kim Jones said the tendency to jump to a conclusion from correlations without further analysis could have affected him personally. "During the late '80s and early '90s, data showed that Hispanic and Black males between the ages of 20 and 27 who were driving an entry-level luxury car on the I-95 corridor were likely to be drug runners," he said.
"I fit some of that profile — I'm African American, I was that age and at that time I was driving a car like that. But if I had been stopped, the police would have seen that I was wearing an Army uniform with Second Lt. bars and had a West Point ring," he said.
The point, he said, is that "it's always bad to rely just on data analytics. When you take the human element out of the equation, you by definition create a higher error rate."
In short, Big Data is a tool, but should not be considered the solution. "It can help you narrow something down from millions to perhaps 150," Jones said, "but the temptation is to let the computer do it all, and that is what is going to get you in trouble."