Big Data only got bigger in 2013. Blame it on enterprises deploying Hadoop to production environments, or on NoSQL users who are spreading data across hundreds of servers around the world at the same time. No matter who’s responsible, 2013 was definitely the year in which Big Data became Big Business.
Cloudera has long been the standard bearer for Hadoop, and its market dominance continued in 2013. But the unexpected powerhouse of the ecosystem in 2013 was Hortonworks, which grew to a size rivaling Cloudera’s despite being founded three years after it. As the company behind vanilla Apache Hadoop, Hortonworks drew numerous new customers to its offerings in 2013 thanks to its leadership position in building and rolling out Hadoop 2.0.
And if there were no other Hadoop news in 2013 than Hadoop 2.0, we’d still be able to log pages and pages of dramatic changes that will occur because of this shift in the platform. This was the year Hadoop went general-purpose and stopped treating MapReduce as its only method of batch processing.
It was also the year HDFS became highly available, and the year when a dozen new side projects cropped up in the Apache Incubator: Falcon focuses on building data pipelines; MRQL is a query-processing optimization system; Knox locks a cluster down to a single access point; Sentry provides role-based authorization; and Tajo offers a distributed data warehouse system on top of Hadoop.
With all that activity, it’s not surprising that many companies saw success with their Hadoop integrations and offerings in 2013. MapR, Cascading and Zettaset all continued to fill holes in the Hadoop ecosystem, while a host of new analytics offerings came to market on top of Hadoop from companies like Tableau, Pentaho, Splunk and Datameer.
But Hadoop was not the only story in Big Data for 2013. NoSQL databases continued to gain traction thanks to a never-ending need to spread data around the globe in a highly available and consistent form. To that end, a number of new transactional databases, some calling themselves “NewSQLs,” cropped up this past year. NuoDB, FoundationDB and VoltDB all brought databases to market in 2013 that offered transactional support based on the ideas and techniques shown in the Google Spanner paper.
The big three of NoSQL, however, continued to fight it out over customers and market share. DataStax’s Cassandra continued to be the most robust NoSQL solution, popular with the Wall Street crowd. MongoDB took its lumps this year, but also addressed many issues, such as the datastore’s default write reliability, in a powerful point release. Couchbase broke from its CouchDB roots and took a new path, inspired by the melding of Membase and CouchDB.
What does 2014 herald for Big Data? Companies to watch include Basho, the company behind Riak; Sqrrl, the company behind Accumulo; and Hortonworks, which should continue to grow rapidly.