“I think the analytics platforms are coming along with access to data. We’ve had a large data accumulation phase, and people are seeing that you can get things out of that. When you’re asking, ‘What do we do with all these marketing visits to our website?’ it winds up being more data than you could point Crystal Reports at.”
Because all of this data is being saved, the natural business instinct is to do something with it. The trick is to actually get information out of the data, a task that requires highly skilled workers and, more often than not, SQL.
Unfortunately, said Harp, the market has realized this as well, and has essentially flooded customers with choices. That means there’s a lot of turbulence and no clear market leader when it comes to SQL on Hadoop, or even analytics on Hadoop.
“You need to have more data science and actual analytics capabilities,” said Harp. “The space is a place where there’s a need. We’re seeing there’s a lot of jostling in the Gartner Magic Quadrant on that in 2015 and 2016. We saw a lot of people drop in terms of their ability to execute, which I thought was interesting. It’s a space to continue to watch. We’re also seeing vendors move in and out of that Magic Quadrant. It’s not like they’re having trouble finding vendors. In the 2016 version, even Oracle fell off.
“There is demand in the market for people to do what they’re comfortable with, and at the same time, it’s the relational database providers who are seeing what their users want. It depends on what you’re looking at. Is this relational on top of Hadoop versus…Hadoop working with SQL Server or some other platform where you are mixing the two types of data?”
Indeed, Hadoop has muddied the waters around big enterprise data analytics, thanks to hundreds of vendors now offering compatible products to analyze the mountains of data that come from a modern enterprise.
Monte Zweben, cofounder and CEO of Splice Machine, has built a company to deliver ACID transactions on top of Hadoop. That means SQL users can use their Hadoop cluster as they would typically use a relational data store.
“I don’t think it’s the language [SQL] that I would argue is the new innovation; it’s the workload using the language that’s going to be unique,” he said. “I see the world bifurcating. What I mean by that is, there was this heavy push to do rapid ingestion of data. The NoSQL guys glommed onto that. Then there’s this other world of people doing big batch analytics. This is where the Hadoop world has gone. All SQL on Hadoop is focused on that: big batch analytics.