Photo - Martin Willcox, Director of Big Data Centre of Excellence (International), Teradata.
One of the key Big Data messages for Malaysian organisations is to let the business drive Big Data analytics projects, taking small steps and running fast discovery cycles, said data analytics specialist Teradata.
In his introduction to a gathering of business strategy professionals from industry and the public sector in Kuala Lumpur, Teradata Malaysia's country general manager Craig Morrison outlined the company's commitment to helping build a national Big Data Analytics (BDA) ecosystem. This was kick-started by a collaboration with the national ICT agency Multimedia Development Corporation (MDeC) to help build an Innovation Centre of Excellence (CoE), together with other use cases such as tackling dengue outbreaks.
Drawing on lessons from recent case studies run on the company's Discovery Platform, Teradata's director of Big Data Centre of Excellence (CoE) International, Martin Willcox, said the "three big waves of BDA, a result of the three disruptive technological innovations - the Internet, social media technologies and the Internet of Things - are impacting businesses throughout the world."
"Malaysia definitely has the potential to become a regional data analytics centre," said Willcox, adding that the Malaysian government agencies' proactive drive to push analytics as an economic driver "will help local companies become more data-centric."
He said the use of BDA has far-reaching potential and outlined examples that included the creative content and streaming firm Netflix, a Teradata customer, using big data analytics to develop "House of Cards."
Old and the new
Willcox said that while social media was often linked to BDA, new analytics techniques needed to be used in parallel with older techniques to better understand data and find meaningful patterns and relationships.
He admitted that there was still a "certain level of hype" linked to BDA. "One myth is that big data projects need a lot of resources and major planning. Teradata's best practices point to a 'start small, dip toe in water' approach. The notion of exploding data, volume, velocity and so forth feeds a certain existential angst."
"But what is new and exciting in BDA is the move from people interacting with things [analysis of web/clickstream] to people interacting with people (such as Amazon and social channels) and also to things interacting with people," said Willcox.
"About 60 to 70 percent of the project cost could be easily absorbed by a traditional approach in data analytics, which is to spend a lot of time trying to understand and model the data," he said. "To avoid this mistake, start with a business problem and not the technology. There is no need to 'keep up with the Joneses' with the latest technology tool."
"Secondly, what big analytics does in many cases is to extend and enhance existing analyses and business processes rather than replace them," Willcox said.
"There is a huge risk of failure - about 90 percent - so your analytical exploration and discovery cycle times are critical," he said. "So the lesson is to 'fail fast' with cycle times that are measured in days or weeks and not months."
By starting small, a European train manufacturer set out to predict train set failures as there is a general move in industry towards equipment leasing rather than purchase, said Willcox. "Several million train sensor observations and several thousand engineers' reports had produced a messy sensor and paper data mountain. By digitising the engineers' reports, they could proceed to the extraction of significant events, correlated with sensor data [e.g. components], and then start to look for patterns in the data to identify the variables to predict breakdowns."
"The resulting 'path to failure' charts delivered predictive variable that were used to build a decision tree algorithm with 84 percent accuracy in predicting train set failures within a 24 hour period," he said. "The project to get to get to this prediction model stage took just two weeks of work from Teradata's team at a cost of US$25,000. A traditional process of modelling would have taken much longer and cost more."
"This general tight approach can be used in multiple scenarios such as healthcare," said Willcox, and outlined another case study with a major automobile manufacturer that wanted to delve into loyalty analytics. "The company's sales director used sales data but wanted to explore other data (such as warranty claims, social media, delivery process, quality, customising cars, and so forth). He presented the challenge to us to explore customer loyalty churn but it needed to be done within six weeks."
"This was CRM [customer relationship manager] with a twist as we used standard data set but nonstandard processes to allow the data scientists to build two models," he said. "The first model gave 98 percent accuracy but was not actionable as these customers were going to stay loyal. The second combined model identified five things critical for loyalty and helped the company to put in practice actionable items. This project needed hardware and software purchase of half a million us dollars and about $US50,000 in services but the model was so successful that the approach was rolled out throughout the European market."
"These techniques are applicable across a whole range of industries such as network intelligence, credit risk, marketing, customer excellence," Willcox said. "The Teradata-Aster product brings multiple file engines and analytical engines together to do multiple fast analytical experimentation as an entire discovery ecosystem. The industry is moving to analytical ecosystems with multiple technologies and management processes: The future is plural."