Big Data is changing, but for businesses and users to derive real value from it, innovative approaches are needed to keep driving it at scale. Strata + Hadoop World kicked off this week with industry thought leaders laying out what today’s Big Data world requires.
Mike Olson, chief strategy officer of Cloudera, talked about how Big Data is moving to the cloud. While 92% of all IT happens on premises today and only 8% in the public cloud, he noted, he predicted dramatic growth. “We are going to see it absolutely skyrocket in the years to come,” he said.
But in order for Big Data to move to the public cloud, cloud solutions need to meet a few key requirements. According to Olson, a solution needs to be elastic so it can grow and shrink on demand; it needs to provide consumption-based pricing so users aren’t paying for compute cycles when they aren’t doing work; it needs spot instances to drive costs down; it has to deliver the same enterprise capabilities as on-premises deployments, such as security, compliance, encryption and key management; it needs to be portable across the firewall; and it needs to support multiple clouds.
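For a rough sense of what consumption-based pricing and spot instances mean in practice, consider a back-of-the-envelope cost comparison. This is a minimal sketch; the hourly rates, cluster size and utilization figures below are illustrative assumptions, not quotes from any provider.

```python
# Illustrative comparison of always-on fixed capacity vs. consumption-based
# cloud pricing with spot instances. All rates and workload numbers are
# hypothetical assumptions for the sake of the arithmetic.

HOURS_PER_MONTH = 730

on_demand_rate = 0.50   # $/instance-hour, assumed on-demand price
spot_rate = 0.15        # $/instance-hour, assumed spot-market price
busy_hours = 200        # hours/month the cluster actually does work
instances = 40          # cluster size while busy

# Fixed capacity: you pay for every hour, working or idle.
fixed_cost = instances * on_demand_rate * HOURS_PER_MONTH

# Consumption-based: you pay only for the hours you compute.
on_demand_cost = instances * on_demand_rate * busy_hours

# Spot instances: same elastic usage on discounted (but preemptible) capacity.
spot_cost = instances * spot_rate * busy_hours

print(f"Always-on capacity:      ${fixed_cost:,.2f}/month")
print(f"Consumption (on-demand): ${on_demand_cost:,.2f}/month")
print(f"Consumption (spot):      ${spot_cost:,.2f}/month")
```

Under these assumptions, an idle-most-of-the-time cluster costs an order of magnitude less when billed by consumption, and spot capacity cuts the bill further still, which is the economic case Olson is pointing at.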
Jack Norris, senior vice president of data and applications for MapR Technologies, talked about what users and businesses should be looking for in their platforms. “Your future competitiveness depends on making the right choice,” he said.
To that end, Norris explained that the platform a business chooses needs to support legacy applications, integrate with existing systems, handle real-time and integrated analytics, and do all of this at enterprise scale.
Businesses also need to take an analytics-ops approach to Big Data in order to be successful in the enterprise, according to Ron Bodkin, founder and CEO of Think Big Analytics, a Teradata company. Think Big Analytics has found success deploying analytics in the enterprise by building a data lake, expanding access to data through data democratization, and applying analytics ops.
A data lake should provide data the business can trust rather than act as a dumping ground. Analytics ops, in turn, should include constant monitoring, A/B testing, testing and deployment, automated training and scoring, and application integration. With these practices in place, Bodkin said, businesses can discover and realize repeatable value.
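As a deliberately simplified illustration of the “automated training and scoring” piece of analytics ops, here is a minimal Python sketch using scikit-learn: a model is retrained on fresh data, scored against a holdout set, and only promoted if it clears a monitored accuracy threshold. The synthetic data and the threshold are hypothetical stand-ins, not anything Bodkin or Think Big Analytics prescribes.

```python
# Minimal analytics-ops sketch: automated retraining, scoring, and a
# monitoring gate before deployment. Data and thresholds are hypothetical.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

ACCURACY_FLOOR = 0.80  # assumed monitoring threshold for promotion


def retrain_and_score():
    # Stand-in for pulling fresh, trusted data from the data lake.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0
    )

    # Automated training step.
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # Automated scoring step against held-out data.
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


model, accuracy = retrain_and_score()
if accuracy >= ACCURACY_FLOOR:
    print(f"accuracy={accuracy:.3f}: promote model to production scoring")
else:
    print(f"accuracy={accuracy:.3f}: keep current model, alert the team")
```

In a real deployment this loop would run on a schedule, log every score for monitoring, and feed A/B tests comparing the candidate model against the incumbent before full rollout.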
But it isn’t only about the tools when it comes to finding insights. “We create tools that help us see things,” said Pagan Kennedy, a New York Times contributor. “But even with these dazzling tools, it is hard to know how to think about our ignorance.”
Kennedy explained that there is so much in the data world that is unknown, and to discover what we don’t know, we have to stumble upon it. We have to practice serendipity, “a talent or ability to find valuable clues you aren’t looking for,” she said.
Today, “serendipity” is often used as a synonym for dumb luck, but Kennedy explained that this is nearly the opposite of its intended meaning: it means to improvise and to learn from your mistakes. “[There is a] tendency to become so dazzled by tools we forget what is driving this new age of discovery,” she said.
That new age of discovery is human creativity combined with data tools to explore the unknown, Kennedy said. “Your serendipity is valuable because it is like no one else’s, and the more it is different from everybody else’s, the more valuable it is,” she said.