NewSQL and NoSQL have similarities and differences. The “right” database choice is all about the use case, as always. Depending on what your company is trying to accomplish, you likely have a mix of SQL and NoSQL solutions. And if you don’t, you likely will in the future.
Like NoSQL, NewSQL was designed for modern requirements that include high speed and massive scalability. Also, like NoSQL, NewSQL as a term lacks a precise definition, which can make navigating the landscape more difficult.
Matt Aslett, research director at 451 Research, coined the term NewSQL about five years ago, although in an interview he said defining the new category was “kind of accidental.” At the time, the idea was to recognize a set of vendors that were taking the best aspects of a SQL database and designing new products for modern architecture, specifically cloud architecture.
“It wasn’t an attempt to define a new category. It was an attempt to describe a group of products that were very similar in terms of vendors trying to do something new with a relational SQL database,” said Aslett. “I often joke that if we knew the term would take off, we would have put more thought into exactly what it was ahead of time.”
Over the years, it’s become obvious that NoSQL solves problems that SQL can’t handle well or at all. The complaints about NoSQL as a type of database have less to do with the models and more to do with the non-descriptive and amorphous nature of the term itself. NewSQL has similar problems, the most pronounced of which is the perception of what “new” means. Some consider NewSQL to be a category of databases that has been around for years, if not a couple of decades. Others think of it as a nearer-term development, dating back no more than five years. In that nearer-term view, what qualifies as “new” becomes more obvious because similar industry dynamics have been driving the NoSQL frenzy, not the least of which is the need for massive scalability.
Given the unsettled nature of NewSQL’s present definition, Aslett is attempting to define what NewSQL means in a forthcoming paper he is co-authoring with Andy Pavlo, assistant professor of databaseology in the computer science department at Carnegie Mellon University.
“The characteristics [of NewSQL] databases have remained the same for the last few years,” said Aslett. “We see them taking advantage of in-memory storage, partitioning, sharding, and concurrency controls. The implementations differ from vendor to vendor, but there is a common set of technology that takes the best bits of the relational database and applies it to distributed architecture.”
Understanding the landscape depends on one’s view of what constitutes NewSQL. The most commonly referenced players include Clustrix, MemSQL, NuoDB and VoltDB, although SAP HANA, CockroachDB, Amazon Aurora, other databases, some database services, and sharding technologies may qualify depending on one’s view of the landscape.
How one defines the landscape also determines what the market size is. For example, 451 Research estimates that NewSQL vendors generated approximately US$146 million in revenue in 2015, compared to the operational database market ($25 billion) and the NoSQL database market ($814 million). By 2020, 451 Research estimates that the NewSQL market will reach $500 million.
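To put those figures in proportion, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation on the numbers quoted above; it assumes the operational and NoSQL market figures refer to the same year (2015) as the NewSQL revenue figure, which the estimates do not state explicitly, and the function name `cagr` is ours.

```python
# Figures quoted from 451 Research, in millions of US dollars.
NEWSQL_2015 = 146
NEWSQL_2020_FORECAST = 500
NOSQL_MARKET = 814           # assumed to be a 2015 figure
OPERATIONAL_MARKET = 25_000  # assumed to be a 2015 figure


def cagr(start, end, years):
    """Compound annual growth rate implied by a start value, an end value and a span in years."""
    return (end / start) ** (1 / years) - 1


implied_growth = cagr(NEWSQL_2015, NEWSQL_2020_FORECAST, years=2020 - 2015)
share_of_operational = NEWSQL_2015 / OPERATIONAL_MARKET
ratio_to_nosql = NEWSQL_2015 / NOSQL_MARKET

print(f"Implied NewSQL growth per year: {implied_growth:.1%}")
print(f"NewSQL as a share of the operational database market: {share_of_operational:.2%}")
print(f"NewSQL revenue relative to NoSQL revenue: {ratio_to_nosql:.1%}")
```

On these assumptions, the forecast implies growth of a little under 30% a year from a base that is well under 1% of the operational database market.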
“NewSQL and NoSQL were both reactions to something that wasn’t working: traditional relational databases. [However,] their reactions were completely different,” said Dennis Duckworth, director of product marketing at NewSQL vendor VoltDB. “NoSQL said throw everything out and start with a blank slate. NewSQL said there are some good things about OldSQL. NewSQL systems offer familiar SQL query capability for richer analytics, along with the speed and scale of NoSQL.”
Benefits and limitations
No database is ideal for all use cases, which is why SQL, NoSQL and NewSQL all have their benefits and detriments. But again, those can be colored by what one considers to be NewSQL.
“Traditional SQL provides ACID transactions across partitions [and] multi-way JOINS, and enforces referential integrity,” said Dave Anselmi, director of product management at NewSQL vendor Clustrix. “They’re optimized for structured data, and ensur[e] that there are no update/delete anomalies. They are usually not characterized by a shared-nothing architecture, nor can they scale out linearly, especially writes.
“NoSQL ‘relaxed’ the ACID guarantees in search of scale. Typically, if NoSQL provides ACID transactions and JOINS, they’re on a single node only. However, NoSQL typically is shared-nothing, and can scale linearly to hundreds of nodes. NewSQL provides both the RDBMS, ACID transactions and multi-way JOINS across multiple nodes, and can scale out linearly to handle both reads and writes.”
Medium to large companies with high-value, high-transaction loads are using Clustrix for their e-commerce, gaming, ad tech, marketing technology, social and other web applications. In each case, their desire to use NewSQL was driven by scaling limitations.
“Many NoSQL vendors are trying to provide ‘SQL’ semantics, but they’re really only providing SQL language interpreters,” said Anselmi. “What companies really want is the ‘single-source-of-truth’ confidence they’ve always had from enterprise SQL, coupled with the scale of NoSQL.”
VoltDB’s Duckworth said he sees “a lot” of companies hopping off the NoSQL bandwagon because they’ve been burned by the lack of consistency and transactionality.
“We see many organizations across many different industries either migrating to NewSQL or at least investigating it,” said Duckworth. “Ad tech and mobile have already been down the NoSQL path and have learned where it does and does not fit for them. Older, more conservative industries like financial services are still mostly on OldSQL and are slower to adopt new technology.”
Regardless of industry, speed, accuracy, availability, and reliability are becoming increasingly critical to everyday operations and more types of workloads.
“Anything non-operational, more analytical or simple caching/lookups is going toward NoSQL or the analytical flavor of NewSQL,” said Duckworth. “Anything operational (transactional) and real time tends to bring companies to the transactional flavor of NewSQL.”
VoltDB is primarily an OLTP database that has some analytical capabilities. Telco companies are using it for low-latency authorization, policy management, network routing and optimization. Other use cases include ad tech, financial services, smart grids, and game development.
Scalability is critical
Historically, if an organization needed a larger database and better performance, it put its SQL database on a bigger server. The problems with that were twofold: high cost and diminishing ROI in terms of performance. Some turned to commodity servers to scale out, requiring sharding so that the database could run on multiple servers. That approach made it difficult to understand where the data resided across those nodes, and might also make it difficult to execute queries across multiple nodes.
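The two difficulties with manual sharding can be illustrated with a toy example. The sketch below is not any vendor’s implementation: the node names, the `ShardedStore` class and the hash-based placement rule are all hypothetical, and plain Python dictionaries stand in for the servers. It shows that a single-key lookup touches only one node, while a query over all the data has to visit every node and merge the partial results.

```python
import hashlib

# Hypothetical node names, for illustration only.
NODES = ["node-a", "node-b", "node-c"]


def shard_for(key, nodes=NODES):
    """Pick a node for a key using a stable hash (the built-in hash() is salted per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class ShardedStore:
    """Toy key-value store in which one dict stands in for each server."""

    def __init__(self, nodes=NODES):
        self.nodes = list(nodes)
        self.data = {node: {} for node in self.nodes}

    def put(self, customer_id, order_total):
        self.data[shard_for(customer_id, self.nodes)][customer_id] = order_total

    def get(self, customer_id):
        # A single-key lookup is routed to exactly one node.
        return self.data[shard_for(customer_id, self.nodes)].get(customer_id)

    def total_revenue(self):
        # A query over all rows must visit every node and combine the partial sums.
        return sum(sum(shard.values()) for shard in self.data.values())


store = ShardedStore()
for i in range(100):
    store.put(f"customer-{i}", 10)

print(store.get("customer-7"))
print(store.total_revenue())
print({node: len(rows) for node, rows in store.data.items()})
```

In a hand-sharded deployment, the application itself has to carry the routing logic and the merge step; the “transparent” approach described by NewSQL vendors moves that work into the database.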
“It gets to the point where you look at a new database that’s designed to run on the architecture instead of trying to get an old database to work on it,” said 451 Research’s Aslett. “The NewSQL players bring something new to the table in terms of transparent sharding, in-memory storage, partitioning, concurrency control, secondary indexes, and replication.”
In terms of use cases, Aslett sees them as being mainly transactional, operational applications such as ad tech, online retail, social media, and of course the Internet of Things (IoT), all of which require more frequent data analysis. In the case of IoT, certain data tends to be analyzed on the edge before a subset of it is stored in a data warehouse for historical analysis.
New applications require new databases
NoSQL and NewSQL were both designed for modern architectures. As such, they are generally being used for new applications.
“It’s not so much about migrating existing workflows from existing Oracle and Microsoft databases as much as it is developing new applications which, over time, will come online and the older applications may be retired,” said Aslett.
In the meantime, companies are wise to identify which applications are best suited to a particular type of database and what’s driving the decision of one over another: the model, the structure of the database, or the mission-criticality of the application.
“What the NewSQL vendors are not doing is saying, ‘Port your CRM and ERP databases over to our database,’ because nobody is going to do it,” said Aslett.
Many of the early NewSQL adopters have been startups who start fresh with next-generation technologies. Enterprises thinking about their long-term architecture should consider the implications on the database layer and the potential for a more distributed and flexible database architecture, which may make NewSQL vendors worth exploring.
Capgemini has been deploying NewSQL and NoSQL platforms on the same underlying infrastructure.
“The good news is, we’ve taken away the constraint of how we store the data,” said Goutham Belliappa, Big Data and analytics consultant for Capgemini North America. “On a Hadoop platform, if I have a NewSQL interface, I can store the data like I store it in Hadoop and express it using NewSQL on top. NoSQL is ideal for use cases in which it doesn’t make sense to use a SQL store at all.”
Expressing relationships is one example, because it is best done using a graph database. A graph database can infer relationships where none have been previously defined. Using a SQL interface, it is difficult even to describe relationships.
“Clients are moving away from their old data warehouse platforms in droves, and a lot of times they’re implementing NewSQL on a NoSQL platform because they don’t have the flexibility they need and we’ve seen cost differences of one to 100 or even higher,” said Belliappa. “There is a confluence of factors that is pushing people away from the way they did things in the past to the way they want to do things in the future.”
One such factor is the software licensing model. Traditional relational database licenses have been expensive, and customers were often limited by vendor lock-in. SaaS alternatives enable cost-effective experimentation and provide greater flexibility.
Consider requirements
Assessing any new database without first considering the problem or use case can lead to several negative consequences, including poor ROI and less-than-optimal outcomes.
“For traditional SQL users, the NewSQL movement attempts to bring the performance at scale that NoSQL implementations often provide without sacrificing the many attributes that make SQL the correct choice for the organization in the first place,” said Vicky Harp, corporate strategist at IT infrastructure performance and SQL Server tool provider Idera. “For NoSQL users, NewSQL may be attractive because of the robustness of SQL, including ACID transactions and fully featured ad hoc capabilities that might require a lot of custom code in a NoSQL implementation.”
There is debate about whether NewSQL is better suited to analytical or operational use cases. “There is very cool technology available in the NewSQL space, but for the most part, these solutions are designed for workloads that can fit into the main memory of a server, so I don’t see it as being appropriate for data mining or large-scale analysis of any sort,” said Harp. “It’s also a pretty costly solution, effort-wise, if you are in a situation with very fluid schema.”
Traditional relational databases struggle to meet ever-increasing performance requirements of elastic, on-demand scale, cloud deployment, and other data center transitions to commodity infrastructure because they weren’t originally designed for those things. NoSQL technologies lack ACID transactions and consequently struggle to offer the data guarantees required by mission-critical applications that deal with valuable data. NoSQL products also forgo a server-side programming model such as SQL, with the coherent schemas and data structures that enable efficient server-side processing, according to Barry Morris, cofounder and executive chairman of NewSQL provider NuoDB.
“We now have a database industry in which traditional relational database vendors are trying to add NoSQL-like capabilities to their products, and in which NoSQL database vendors are trying to add ACID transactions and SQL-like capabilities to their products,” he said. “The truth is that customers want both classes of capabilities, and they don’t want to run multiple databases.”
The reality is that a lot of companies are running multiple databases because they’re dealing with different kinds of problems. Although some of the database sprawl is the result of a lack of strategic planning, as the nature of business and technology changes, so do the solutions. This leads to the necessary coexistence of SQL, NoSQL and NewSQL databases.
“We’re finding the market turning toward private, hybrid or public clouds in increasing numbers,” said Morris. “They’re looking to lower costs through the commodity hardware and pay-as-you-go pricing, while also simplifying the overall administration of their architecture. Such goals are in direct contradiction to traditional RDBMSes that require expensive, highly customized, pre-provisioned boxes that are costly to maintain, complicated to replicate and challenging to scale.”
The landscape continues to shift
The database market is fragmented, and it will likely get more fragmented in the short term, although some consolidation is already taking place. Amazon AWS, Apple, Dell, EMC and others have been acquiring NoSQL assets. Similarly, the incumbent players are expected to acquire NewSQL vendors for the usual reasons: eliminating competition, gaining a competitive advantage, and getting fast access to capabilities that would take much longer to build in-house.
“NewSQL hasn’t had a significant impact on the revenue stream of the incumbent relational database players yet, but we see some potential for NewSQL acquisitions,” said 451 Research’s Aslett.
The longer-term question is whether NewSQL will survive as a database category. In the short term, the amount of database market fragmentation can make navigating the landscape difficult for IT and developers.
“Make sure you understand what you really need for your application,” said NuoDB’s Morris. “What are your true requirements, and what are the nice-to-haves? Are you looking primarily for Continuous Availability, elastic scale-out, or the ability to get real-time analytics out of your database? That might be different than if you’re looking to do Big Data analytics, operate within Docker, or obtain extremely high performance in a single data center. NewSQL isn’t a magic bullet.”
Given the growing popularity of Docker, it’s not surprising that NuoDB is seeing increased interest in database support for it. Meanwhile, Capgemini is advising customers to take advantage of blueprints, rather than reinventing the wheel when it comes to implementing new database technologies.
“Diversity exists to solve different problems and address different use cases,” said Capgemini’s Belliappa. “There are people, vendors, specialists and organizations like ours that have solved [the same] problem. If you start experimenting without learning what exists in the marketplace, then your discovery will be more time-consuming and costly.”
While blueprints serve as guidelines rather than cut-and-paste panaceas, they can help demystify approaches to technology implementations, and more importantly provide a starting point that’s further down the learning curve.
Not everyone is sold on NewSQL
Mike Bowers, principal architect at The Church of Jesus Christ of Latter-day Saints, is not a fan of NewSQL. In fact, he thinks it’s “dead on arrival.”
“NewSQL databases address the velocity challenge. They scale, they’re there in RAM so they can scale to a high velocity,” said Bowers. “I was really excited about Oracle TimesTen, but the pricing and poor marketing killed it. People who want velocity and have decided to leave the SQL world and go NoSQL have moved to a document database. Developers really don’t like relational databases because of all the modeling and complexities.”
Those adopting NoSQL document databases follow a pattern of adoption, according to Bowers. They start with something simple like Redis, and then move to something more powerful like MongoDB, CouchDB or MarkLogic, and depending on the choice, they may or may not face consistency issues. Bowers uses MarkLogic.
“If you’re looking at NewSQL, do a really strong POC because it’s not OldSQL,” said Bowers. “It’s not going to behave the same, you’re going to tune it differently, and every one of these NewSQL databases differs from your traditional relational databases. They have limitations and there are architectural considerations so really know what you’re getting into and vet the company. You don’t want to invest in a company that’s going to be gone in a few years.”
Robin Schumacher, director of products at multi-model database provider DataStax, recommends looking carefully at the underlying architecture and data model.
“If the NewSQL model fully retains the standard Codd-Date RDBMS model, then it may implicitly inherit all the standard RDBMS limitations of general RDBMSes like Oracle, MySQL, etc. that NoSQL was designed to overcome,” he said. “If the architecture adheres to a master-slave design as most do, then it will fail to adequately tackle the write-anywhere requirements of today’s radically distributed applications, and be prone to outages however small. By contrast, NoSQL databases such as Apache Cassandra supply a more flexible data model than the RDBMS. Its masterless architecture, tunable consistency, and data distribution capabilities allow it to handle writes from any location and synchronize those changes to all other copies of the data.”
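The idea behind “tunable consistency” can be sketched with a toy model. What follows is not Cassandra’s code or API; the `QuorumRegister` class and its parameters are hypothetical and deliberately pessimistic. With N copies of a value, a write is acknowledged by W replicas and a read consults R replicas. If R + W is greater than N, every read overlaps with the most recent write; if not, a read can return stale data.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    value: object = None
    version: int = 0


class QuorumRegister:
    """Toy single-value store with n replicas (hypothetical, worst-case model)."""

    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.clock = 0

    def _check(self, count):
        if not 1 <= count <= len(self.replicas):
            raise ValueError("replica count must be between 1 and n")

    def write(self, value, w):
        """Apply the write to the first w replicas only; the rest stay stale."""
        self._check(w)
        self.clock += 1
        for replica in self.replicas[:w]:
            replica.value, replica.version = value, self.clock

    def read(self, r):
        """Consult the last r replicas (worst case for overlap) and return the newest value seen."""
        self._check(r)
        contacted = self.replicas[-r:]
        return max(contacted, key=lambda replica: replica.version).value


register = QuorumRegister(n=3)
register.write("v1", w=3)            # every replica holds v1
register.write("v2", w=2)            # two of three replicas hold v2

fresh = register.read(r=2)           # R + W = 4 > 3, so the read overlaps the write
stale = register.read(r=1)           # R + W = 3, not > 3, so the read can miss the write

print(fresh, stale)
```

The trade-off is that larger values of R and W mean more replicas must respond before an operation completes, which is why such systems let the application choose the level per workload.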