NoSQL: It really is all in the name

Published: July 29th, 2015

- Alex Handy

For most of the 1990s, databases were the most boring tool in the shed. The rise of the Web over the aughts changed the demands placed on databases, but did not meaningfully change the form of the data stores we know and love in our day-to-day application work.

The constraints placed on applications to perform at Web scale could be overcome only by specialty database vendors such as FairCom and Oracle. The advent of the cloud, however, brought these Web-scale problems to the forefront, and around 2007, the database landscape began to change drastically.

Some of the new solutions, like Tokyo Cabinet, Redis and Apache Cassandra, took the approach of spreading a key-value store across many servers. Others, like MongoDB and CouchDB, approached the problem from the document model, stashing data in many forms. Over time, however, the market demanded a third path.

Enterprises live on SQL. They employ SQL developers. They hold treasure troves of SQL-accessible data, and they understand SQL processing as a workload. Thus, the quickly rising buzzword “NoSQL” began to stretch out and encompass these new ways of storing SQL data.

“Not Only SQL” is now the call, and it’s used to highlight the myriad choices developers now face when choosing a database. The days of picking between a CSV file and Oracle are behind us. Today, developers can choose the right database for the job, and that’s not just some broad, three-tier category. Databases are as varied as birds or insects, and carry with them just as many adaptations to exist in their niche.

What’s driving uptake of NoSQL
Evaldo H. de Oliveira, director of business development for FairCom, said that the database world has been moving very quickly in the past few years, but a lot of the new ideas being discussed aren’t entirely new.

“Our database is different than others. We’re not a pure relational database,” he said. “We’ve always been a bit more designed for high-performance applications. If you follow the database market, the majority of the market in the last year has been about SQL. We have SQL as well, but our technology has been designed to use and take advantage of the non-SQL way to handle the data, so that’s made a difference for us.

“It was funny when everyone started talking about NoSQL. That’s been our business forever.

“The definition of NoSQL starts with a ‘No.’ Something that’s not, which can be a lot of things. There are multiple different types of NoSQL databases. There are graph databases, document databases. There are multiple different types of databases. One of them is [the] key-value store. That’s where we fit.”

Those old ideas are coming in handy because of the urgent business need for information, analytics and real-time information processing. De Oliveira said there is a growing need in enterprises for “real-time analytics: the kind of thing that needs to run on a single version of the truth.

“There are a lot of analytics going in one direction, which is real-time analytics on top of transactions. For a credit card system, you have their credit card number and you’re processing the transactions in real time, live. But at the same time if the systems have a full real-time system like ours, you can run analytics on top of that. Because of the Internet of Things, a lot of customers don’t want to do big analytics on batch the next day; they want to run it live on real-time systems. In the next five to 10 years, [businesses are] going to need something more in real time.”
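The contrast de Oliveira draws between next-day batch analytics and analytics that run on top of live transactions can be sketched in a few lines. This is only an illustration under stated assumptions: the `LiveTotals` class and its field names are hypothetical, not FairCom's API, and a real system would also need persistence and concurrency control.

```python
from collections import defaultdict


class LiveTotals:
    """Hypothetical sketch: aggregates stay current as transactions arrive."""

    def __init__(self):
        self.transactions = []            # the transactional record itself
        self.count = defaultdict(int)     # per-card transaction count
        self.total = defaultdict(float)   # per-card running total

    def record(self, card, amount):
        # Store the transaction and update the analytics in the same step,
        # so no batch job is needed to bring the numbers up to date.
        self.transactions.append((card, amount))
        self.count[card] += 1
        self.total[card] += amount

    def average(self, card):
        # Answered from the running aggregates, not by rescanning history.
        if self.count[card] == 0:
            return 0.0
        return self.total[card] / self.count[card]


live = LiveTotals()
live.record("card-A", 10.0)
live.record("card-A", 30.0)
print(live.average("card-A"))
```

The design choice is that the write path pays a small extra cost per transaction so that every read sees a single, current version of the numbers.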

Big Data dealing
Jack Norris, CMO for MapR, agreed that interest in NoSQL is being driven by business analysts, who want that up-to-the-minute view of the business through analytics. To this end, Apache Hadoop has dominated a lot of the talking points around modern Big Data processing.

“There are some uses where I can do some of my ETL processing on Hadoop, and use that as a refinery,” said Norris of the practice of storing everything in Hadoop, then pouring out select datasets into other databases for analysis.

This tactic has been particularly popular for data stores that have connectors to Hadoop, such as Apache Cassandra. But Norris said that the database a developer chooses for this is only half a solution. Instead, he said, keeping that data in Hadoop and doing analysis there means faster access to data.

“People are starting to understand what the technologies are good at and what they’re not good at,” said Norris. “They’re looking at creating the applications and deployments that really impact the business. It’s beyond the experimental phase and really looking at how do we create significant value.

“What we’ve seen is increasingly it’s not about a separate silo of database operations. It’s more about doing that in conjunction with Hadoop. I’ve got all this unstructured data coming in, and I want to impact the business and do that in a real-time way.”

Thus, the true promise of NoSQL—whether standalone or inside Hadoop (as it is with HBase)—is to make the data more accessible to everyone involved. Without the need for painful data migrations and transitions, analysts can get to their answers faster and with less of a headache.

“The environments are complex and hybrid,” said FairCom’s de Oliveira, “so most of the time they have different technologies. There’s a lot going on in what we used to call ETL, but it’s been a little more sophisticated than that, because the transform process is changing and consolidating data. It has a lot to do with Internet of Things: There’s information being generated that needs to be stored somewhere. There are so many different types of data that need to be processed. It’s about having records of those transactions.

“We are a source, but also the target of these consolidations. Sometimes they use us to consolidate this data. For example, they load C-Tree with all the stock market data and sell it to traders in real time.”

And Hadoop environments are a big part of that ETL revolution. Thanks to Apache HBase, for example, even data from relational stores can be kept in Hadoop for processing.

“We’re seeing a huge change as companies look at their data centers and the rate of data ingress, their new data sources, and the applications they’re required to push out,” said Norris. “They have to replicate and do separate ETL activities, and they want to always be agile. Bring those together and it’s a new platform where you land the data and perform operations directly on it. We’ve seen great strides, but we’ll continue to see huge changes and you’ll continue to see MapR lead with innovations.”

How is your database different?
“As a key-value store database, we provide all the interfaces for applications that need to handle the data under the key-value store model. We handle data without the defined schema. It’s a very flexible schema for data.

“For the document databases, they have a schema-less model because everything’s a document. The key-value pair is different than that because the key-value pair is stored in a schema-less way and you use indexes overtop of it. The difference is being able to support transactions. We support full transactions. We are ACID-compliant, and we’ve always been ACID-compliant.

“C-Tree is used in payment systems. These customers never really cared about multiple tables and querying, all they care about is the SLA of the credit card transaction.

“We’re used in a typical key-value store situation, where the application needs to be fast enough to find the info and needs to be flexible enough to have multiple schema types. This is typical for NoSQL scenarios.”

— Evaldo H. de Oliveira, Director of Business Development for FairCom
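The model described in the quote above (schema-less records stored under a key, an index laid over them, and transactional updates) can be illustrated with a toy in-memory store. This is a minimal sketch, not c-treeACE or its API: the `TinyKV` class, its methods and the record fields are all hypothetical, and it shows only the all-or-nothing part of ACID.

```python
import copy


class TinyKV:
    """Hypothetical toy store: schema-less records, one secondary index,
    and all-or-nothing transactions. For illustration only."""

    def __init__(self, indexed_field):
        self.indexed_field = indexed_field
        self.records = {}   # key -> dict with any set of fields
        self.index = {}     # indexed field value -> set of keys

    def put(self, key, record):
        self._unindex(key)
        self.records[key] = record
        value = record.get(self.indexed_field)
        if value is not None:
            self.index.setdefault(value, set()).add(key)

    def get(self, key):
        return self.records.get(key)

    def find(self, value):
        # Look up keys through the index instead of scanning every record.
        return sorted(self.index.get(value, set()))

    def transaction(self, operations):
        # Apply every (key, record) pair, or none of them if one fails.
        snapshot = copy.deepcopy((self.records, self.index))
        try:
            for key, record in operations:
                if not isinstance(record, dict):
                    raise TypeError("record must be a dict")
                self.put(key, record)
        except Exception:
            self.records, self.index = snapshot  # roll back
            raise

    def _unindex(self, key):
        old = self.records.get(key)
        if old is None:
            return
        value = old.get(self.indexed_field)
        if value is not None:
            self.index[value].discard(key)
            if not self.index[value]:
                del self.index[value]


store = TinyKV(indexed_field="card")
store.put("txn:1", {"card": "4111", "amount": 25.0})
store.put("txn:2", {"card": "4111", "amount": 9.5, "merchant": "cafe"})
print(store.find("4111"))
```

Note that the two records carry different fields; nothing in the store fixes a schema, and the index is the only structure that knows about the `card` field.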

How is your Hadoop different?
“We’re also seeing if you’ve got a Hadoop you can trust, a platform that’s available and has full business continuity, then instead of moving data around, you’re landing it and doing applications on top of that. Typically, you’re seeing companies doing a wide variety of applications. Eighteen percent of our customers have 50 or more applications running on a single cluster.

“You’re basically pulling together data and doing different analytics directly on that. In many cases, it’s because you don’t have the time to ship it off to another system and load it in and do transformations. That’s great if you are doing reporting on what happened last week or yesterday in the business, but if you’re trying to impact business as it happens, that’s a dramatic change. That’s what we’re seeing with ad media who do 100 billion ad auctions a day. They’re making adjustments while they’re happening. If it’s fraud detection, deciding while the credit card swipe is taking place, ‘Is this fraudulent activity?’

“When you’ve got the customer on the website and you’re deciding what to show them and what product to recommend, or what additional info to make available to them. That’s where you basically have to have capabilities that provide that consistent low latency. Latency is a big issue when you’re looking at integrating in Hadoop. We’ve focused on that for some time. From the lowest level of architecture up through the stack, we’ve made different product enhancements to provide consistent low latency.”

— Jack Norris, CMO of MapR

What about HBase?
“HBase is the standard NoSQL option with Hadoop,” said MapR’s Norris. “It has scalability advantages beyond some of the other NoSQLs. It is architected to work inside Hadoop. Where we’ve invested has been in optimizations and making sure HBase applications can run on an enterprise-grade NoSQL option: MapR-DB. It provides consistent low latency, eliminated Java compactions, [and] eliminated downtimes. We have customers that have used that in a variety of applications. It’s in cable TV ad insertions, optimizing across 50 million set-top boxes.

“We have global table replication, and that is a big advantage. Eventual consistency really limits the type of applications and type of data you would trust using a NoSQL model. MapR-DB supports enterprise-grade, mission-critical applications.”

A Buyer’s Guide to NoSQL
Amazon: Amazon DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent, single-digit millisecond latency at any scale. It is a fully managed cloud database and supports both document and key-value store models. Its flexible data model and reliable performance make it a great fit for mobile, Web, gaming, ad technology, IoT, and many other applications.

The Apache Software Foundation: Apache Accumulo is a key-value store based on Google’s BigTable design that provides robust, scalable, high-performance data storage and retrieval. Apache Cassandra is a highly scalable, high-availability, high-performance wide column store also based on Google’s BigTable design. Apache CouchDB is a document database that uses JSON for documents, JavaScript for Map/Reduce queries, and regular HTTP for an API. Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems. Apache HBase is a Hadoop database that provides random, real-time read/write access to Big Data, and it’s designed to host very large tables consisting of billions of rows and millions of columns using clusters of commodity hardware.

Basho Technologies: Riak KV is a NoSQL open-source database that uses a key-value model. It combines operational simplicity with high availability, scalability, and fault tolerance. Riak KV can be used to store text, images, documents, user and session data, log files, etc. Riak KV Enterprise includes multi-cluster replication ensuring low latency and robust business continuity.

BrightstarDB: BrightstarDB is a fast, embeddable NoSQL database for .NET. It is cross-platform and open source under an MIT license, supporting Linux, OS X, iOS and Android as well as Windows. It provides a code-first entity framework enabling developers to benefit from its schema-free triple-store while still using strongly typed objects and LINQ in their applications.

Cloudera: Cloudera Enterprise is the first unified platform for Big Data, powered by the world’s most popular Hadoop distribution. It includes the most powerful open-source frameworks, including Apache Spark and Impala for the fastest analytics. Cloudera Enterprise is designed specifically for mission-critical, production environments, with the simplest administration, compliance-ready security, and comprehensive data management.

Couchbase: Couchbase Server is a high-performance, open-source distributed NoSQL database for building Web, mobile and IoT applications. Couchbase Server can be deployed as a document database, a key-value store or a distributed cache. It scales across commodity hardware to support massive data sets with a high number of concurrent reads and writes while maintaining low latency and strong consistency. Couchbase Server 4.0 will include the N1QL query language that extends SQL to JSON, enabling developers with existing SQL expertise to easily write applications on a NoSQL database.

DataStax: DataStax delivers Apache Cassandra in a database platform that meets the performance and availability demands of Internet of Things, Web and mobile applications. It gives organizations a secure, fast, always-on database technology that remains operationally simple when scaled in a single data center or across multiple data centers and clouds. DataStax offers enterprise-grade capabilities such as search, analytics, in-memory computing, advanced security, automated management services, and visual management and monitoring, among others.

FairCom: c-treeACE is a fully ACID key-value database that supports multiple relational and non-relational APIs. Its unique No+SQL technology facilitates high-performance NoSQL and industry-standard SQL access within the same application over the same data. Flexible schema records, customizable data types, and high-speed indexing, as well as Stored Procedures, user-defined functions, triggers, and full transaction support, make c-treeACE an ideal NoSQL database for mission-critical applications.

MapR: MapR-DB is an enterprise-grade, high-performance, in-Hadoop NoSQL database-management system. It lets you run Apache HBase applications with higher performance and reliability. MapR-DB delivers the speed, scalability and flexibility needed for today’s Big Data environments. It is integrated into the MapR Distribution to support running operational and analytical workloads in the same cluster. In addition to MapR-DB, MapR supports Apache HBase. MapR-DB is available in MapR Enterprise Database Edition and for unlimited production use in the freely downloadable MapR Community Edition.

MarkLogic: MarkLogic is the only Enterprise NoSQL database platform with the flexibility, scalability and agility of NoSQL combined with enterprise-hardened features like ACID transactions, high availability and failover, disaster recovery, government-grade security, full-text search, semantics, and schema-agnostic data modeling. MarkLogic improves time to market by making it easy for developers to implement new business logic, meeting or even exceeding business objectives.

MemcacheDB: MemcacheDB is a distributed key-value storage and retrieval system designed for persistence. It is not a cache solution. MemcacheDB conforms to Memcache protocol, so any Memcached client can connect to it. MemcacheDB uses Berkeley DB as a storage back end to support features such as transaction and replication.

MongoDB: MongoDB is a next-generation database that helps businesses transform their industries by harnessing the power of data. Companies use MongoDB to create applications never before possible at a fraction of the cost of legacy databases. Specifically, MongoDB stores data in JSON documents, taking advantage of JSON’s seamless mapping to native programming language types and dynamic schema so data models can evolve easily compared to relational databases. It scales linearly and processes queries much faster than a relational database. In addition, MongoDB is easy to install, configure, maintain and use.

Neo Technology: Neo4j helps businesses create new products and services by bringing data relationships to the fore. Neo4j combines a native graph property model with ACID transactions, making it crucial for applications in master data management, IT operations, fraud detection, real-time recommendations and graph-based search tools.

NuoDB: NuoDB is a fully ACID-transactional SQL DBMS, but architected for the cloud, with high-speed elastic scale-out and scale-in. NuoDB operates a distributed in-memory cache, with continuous availability supported by arbitrary levels of redundant distributed persistence. With its unique, patented architecture, NuoDB offers global transactional consistency across a globally distributed database.

Objectivity: InfiniteGraph is a graph database that enables organizations to ask deeper, more complex questions across new and existing data stores by traversing complex relationships requiring multiple hops across vast and distributed data stores. The latest version improves search results and provides faster ingest performance. In addition, the Visualizer has been enhanced for visualizing and navigating the graph, and navigation policies can be saved in the graph for later reuse.

Oracle: Oracle NoSQL is a distributed key-value database designed to provide highly reliable, scalable and available data storage across a configurable set of systems that function as storage nodes. It provides a powerful and flexible transaction model that greatly simplifies developing a NoSQL-based application. Oracle NoSQL scales horizontally with high availability and transparent load balancing even when dynamically adding new capacity.

Pivotal: GemFire/Geode is a NoSQL in-memory database for extreme-scale applications providing advanced database capabilities in extreme low-latency and high-concurrency environments for custom applications. GemFire can uniquely support globally distributed environments, massive client/user queries, in-memory distributed functions and scales linearly to support any transactional application load. Redis is an open-source, BSD-licensed, advanced key-value data store. It is often called a data structure server since keys can contain strings, hashes, lists, sets, and sorted sets. Redis works with an in-memory data set to speed up performance, and it supports master-slave replication. It includes many other features such as transactions, pub/sub, Lua scripting, time-limited keys, and configuration settings that allow Redis to behave like a cache.
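Two of the ideas in the Redis entry above, keys that hold whole data structures and keys with a time limit, can be illustrated with a small in-memory sketch. This is not Redis or any Redis client: the `ExpiringStore` class, its `ttl` parameter and the injectable `clock` are hypothetical names chosen for the example, and expiry here happens lazily on read.

```python
import time


class ExpiringStore:
    """Hypothetical toy store with time-limited keys. Not Redis itself."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock    # injectable so expiry can be demonstrated
        self.data = {}        # key -> value (string, list, set, dict, ...)
        self.expires = {}     # key -> absolute expiry time

    def set(self, key, value, ttl=None):
        self.data[key] = value
        if ttl is None:
            self.expires.pop(key, None)   # no time limit on this key
        else:
            self.expires[key] = self.clock() + ttl

    def get(self, key):
        deadline = self.expires.get(key)
        if deadline is not None and self.clock() >= deadline:
            # Evict lazily on read once the time limit has passed.
            self.data.pop(key, None)
            self.expires.pop(key, None)
        return self.data.get(key)


now = [0.0]                                  # fake clock for the demo
cache = ExpiringStore(clock=lambda: now[0])
cache.set("session:42", {"user": "u1"}, ttl=30)
cache.set("tags", {"nosql", "redis"})        # a set stored under one key
now[0] = 31.0                                # 31 seconds later
print(cache.get("session:42"), cache.get("tags"))
```

With a time limit on every key, a store like this behaves much like a cache: stale entries disappear without the application deleting them.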


About Alex Handy

Alex Handy is the Senior Editor of Software Development Times.

