Algorithms Changing Business Intelligence
The most interesting part of the event for me, however, was the glimpse of the future of analytics, and it's not business intelligence, or at least not BI as we've traditionally known it. Both the opening keynote, by Kaggle CEO Anthony Goldbloom, and the closing keynote, by Mike Gualtieri of Forrester, focused on predictive analytics.
You may remember the great Netflix contest, in which the company offered a big prize to anyone who could improve its recommendation engine by 10 percent or more.
The core of that effort was predictive analytics, in which an algorithm (probably dozens or hundreds of algorithms) is unleashed on a subset of a data collection to see if it can discern a pattern of data elements that's associated with some other interesting outcome. When a predictive algorithm is identified, it's set against another subset of the data collection to see if it can predict how that outcome turned out for the records in the second subset.
Gualtieri's example was mobile churn rates. A wireless company could examine marital status, payment pattern (early, on time or late), usage amounts and so on to assess whether an analysis across those elements could predict whether a subscriber was likely to terminate his or her contract. (Cynics will state the obvious: Wouldn't it be easier to offer better service as an inducement to reduce churn?)
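For readers who want to see the mechanics, here is a minimal Python sketch of that two-step process: try several candidate predictors on one subset of the records, keep the best, then check it against a second subset it has never seen. Everything in it is an assumption made for illustration. The subscriber records are synthetic, the field names are hypothetical, the three candidate rules are deliberately crude, and the built-in link between late payment and churn is invented so the example has something to find. It is not how Gualtieri, Kaggle or any carrier actually does it.

```python
import random


def make_records(n, seed=0):
    """Generate synthetic subscriber records (illustrative, not real data)."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        payment = rng.choice(["early", "on_time", "late"])
        # Assumption for this sketch only: late payers churn far more often.
        churn_prob = 0.7 if payment == "late" else 0.1
        records.append({
            "married": rng.random() < 0.5,
            "payment": payment,
            "usage": rng.randint(0, 1000),  # hypothetical minutes per month
            "churned": rng.random() < churn_prob,
        })
    return records


# Candidate "algorithms": each maps one record to a churn prediction.
CANDIDATES = {
    "late_payer": lambda r: r["payment"] == "late",
    "unmarried": lambda r: not r["married"],
    "low_usage": lambda r: r["usage"] < 200,
}


def accuracy(rule, records):
    """Fraction of records where the rule's prediction matches the outcome."""
    hits = sum(rule(r) == r["churned"] for r in records)
    return hits / len(records)


def pick_and_validate(records, train_fraction=0.5):
    """Pick the best rule on the first subset, then score it on the second."""
    split = int(len(records) * train_fraction)
    train, holdout = records[:split], records[split:]
    best = max(CANDIDATES, key=lambda name: accuracy(CANDIDATES[name], train))
    rule = CANDIDATES[best]
    return best, accuracy(rule, train), accuracy(rule, holdout)


records = make_records(2000)
best_name, train_acc, holdout_acc = pick_and_validate(records)
print(f"best rule: {best_name}")
print(f"accuracy on first subset:  {train_acc:.3f}")
print(f"accuracy on second subset: {holdout_acc:.3f}")
```

The second number is the one that matters. A rule that scores well on the records it was chosen from but poorly on the held-out records has found a coincidence, not a pattern.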
An obvious extension of this process is evolving the algorithms to further improve predictive power in a process dubbed "machine learning." Kaggle, by the way, focuses on organizing and running predictive analytics competitions, and Goldbloom offered a fascinating example: Could a machine learning system evaluate student essays better than human teachers? The answer was "Yes," especially since the software had far less variance in evaluation than a pool of teachers would.
Better Analytics, Better Performance, But at What Cost?
It seems clear that this kind of machine learning spells the death of traditional BI. Business intelligence, after all, is built on asserting an insight against a set of data: "I think warm weather makes people want to book cruise vacations. Run a report correlating temperature against cruise bookings."
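In code, that kind of hypothesis-first report can be as small as a single correlation coefficient. The sketch below is only an illustration: the weekly temperatures and booking counts are made-up numbers, and the variable names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up weekly figures, chosen only to illustrate the report.
weekly_temp_f = [40, 45, 55, 60, 70, 75, 85]
weekly_bookings = [120, 115, 140, 150, 160, 175, 190]

r = pearson(weekly_temp_f, weekly_bookings)
print(f"temperature vs. cruise bookings: r = {r:.2f}")
```

Note what the analyst had to supply before any code ran: the decision that temperature was the column worth looking at.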
The problem with this approach is that it depends on a human deciding the right correlation among the data. This is where it gets sticky. You're depending upon the judgment (and prejudices) of a person to decide the relevant data to look at. It's much more compelling to let the data identify what is relevant.
That's where this area becomes troubling.
Once you start down the road and say, "Let the data tell me what to do," the natural impulse is to get more data. That mobile company will strike a deal with, say, a consumer products company to get information on purchasing habits of other types of products to enable analysis regarding churn.