“We have complex models that contain varying amounts of information and varying types,” said Madison Logic’s CTO Mark Hershberg. “Because we’re always investigating new data sources to improve our models, we can’t be certain what data fields may be part of the model 18 months from now. Traditional databases don’t have [that] flexibility or in many cases the ability to scale to handle the models that we need. While column-based and document-based NoSQL databases made the most sense, we felt the document model provided the right amount of flexibility.”
Wide column stores
Wide column stores (a.k.a. column stores or columnar stores) are optimized for queries over groups of columns: super columns, which group multiple columns of data, and column families, which group multiple columns of related data. Like relational databases, wide column stores organize data in rows and columns; however, the column orientation is better suited to queries that read only a few columns across many rows. Like document stores, column stores share attributes with key-value stores.
“Wide column stores are particularly good at reducing the amount of disk seeks required for retrieving data,” said LexisNexis’ Villanustre. “Unfortunately, they are not very efficient when most of the columns are required as part of the candidate set.”
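The trade-off Villanustre describes can be illustrated with a toy in-memory model. This is only a sketch: the sample data and the function names are hypothetical, the count of values touched is a stand-in for disk reads, and no real database lays out data this simply.

```python
# Toy model: "values touched" is a rough stand-in for disk reads.
# The sample data and all names here are hypothetical.
rows = [
    {"user_id": 1, "first_name": "Ada", "last_name": "Lovelace", "age": 36},
    {"user_id": 2, "first_name": "Alan", "last_name": "Turing", "age": 41},
    {"user_id": 3, "first_name": "Grace", "last_name": "Hopper", "age": 85},
]

# Column-oriented layout: the values of each column are stored together.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def values_touched_row_layout(rows, wanted):
    # In this toy model a row layout reads whole rows, so every field is
    # touched regardless of which columns the query actually wants.
    return sum(len(row) for row in rows)


def values_touched_column_layout(columns, wanted):
    # A column layout reads only the columns the query asks for.
    return sum(len(columns[name]) for name in wanted)


one_column = ["age"]
all_columns = list(columns)

print(values_touched_row_layout(rows, one_column))        # 12
print(values_touched_column_layout(columns, one_column))  # 3
print(values_touched_column_layout(columns, all_columns)) # 12
```

When a query needs only one column, the column layout touches a quarter of the values. When it needs every column, the advantage disappears, and a real column store would also have to stitch the rows back together from separate locations.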
Wide column stores are better at handling sparse data than relational databases. For example, if a user profile contains a first name, last name and interests, some users may choose not to specify their interests. A relational database would require a “no value” placeholder to be stored for each of those users, along with processing logic to handle it; a column store simply omits the column for those rows.
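The user-profile example can be sketched in a few lines. This is an illustration under simplifying assumptions, not any particular product's storage format: the fixed-schema tuples stand in for relational rows, the nested dictionaries stand in for wide-column rows, and every name is hypothetical.

```python
# Relational-style: a fixed schema, so every row carries every column.
SCHEMA = ("first_name", "last_name", "interests")
relational_rows = [
    ("Ada", "Lovelace", "mathematics"),
    ("Alan", "Turing", None),  # placeholder for the unspecified value
]


def interests_relational(row):
    # Processing logic has to check for the placeholder explicitly.
    value = row[SCHEMA.index("interests")]
    return value if value is not None else "(not specified)"


# Wide-column-style: each row key maps to only the columns that exist.
wide_rows = {
    "user:1": {"first_name": "Ada", "last_name": "Lovelace",
               "interests": "mathematics"},
    "user:2": {"first_name": "Alan", "last_name": "Turing"},
}


def stored_cells(rows_by_key):
    # Count the cells actually stored across all rows.
    return sum(len(cols) for cols in rows_by_key.values())


print(interests_relational(relational_rows[1]))  # (not specified)
print("interests" in wide_rows["user:2"])        # False
print(stored_cells(wide_rows))                   # 5
```

The relational rows hold six cells, one of them a placeholder; the wide-column rows hold five, with nothing stored for the missing value.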
“One of the key advantages of columnar stores is the way you can store the information,” said Avalon Consulting’s Cagle. “You can compact information and deal with more sparse content. You gain flexibility, but it comes at a cost of less performance, and that’s the big trade-off with all of these [NoSQL databases]. There are very few that offer high performance, high scalability and high flexibility.”
Graph stores
Graph stores model the relationships between entities. Some graph stores use adjacency, in which one node points directly to another; triple stores represent a graph as a set of subject-property-object statements. Graph stores are commonly used for network diagrams and social graphs.
“Graph databases tend to be optimized for referentiality. The problem is they’re a lot like assembly language once you start defining everything,” said Cagle. “A triple store is a way of representing an assertion so you can essentially build into these structures a data model between classes of things rather than just between instances.”
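The two representations can be compared with a small sketch. It assumes a tiny, made-up social graph; the names, the "follows" and "member_of" properties, and the helper functions are all hypothetical rather than any graph database's API.

```python
from collections import defaultdict

# Triple representation: each assertion is (subject, property, object).
triples = [
    ("alice", "follows", "bob"),
    ("bob", "follows", "carol"),
    ("alice", "member_of", "chess_club"),
    ("carol", "member_of", "chess_club"),
]


def to_adjacency(triples):
    # Adjacency representation: each node points at its neighbours,
    # with the property kept as a label on the edge.
    adjacency = defaultdict(list)
    for subject, prop, obj in triples:
        adjacency[subject].append((prop, obj))
    return dict(adjacency)


def subjects_with(triples, prop, obj):
    # Pattern query over triples: which subjects assert (prop, obj)?
    return sorted(s for s, p, o in triples if p == prop and o == obj)


adjacency = to_adjacency(triples)
print(adjacency["alice"])
# [('follows', 'bob'), ('member_of', 'chess_club')]
print(subjects_with(triples, "member_of", "chess_club"))
# ['alice', 'carol']
```

The adjacency form makes it cheap to walk outward from one node, while the triple form treats every relationship as a standalone assertion that can be matched by pattern.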
Wargaming.net is currently evaluating graph databases with the goal of understanding all of the relationships in its games.
“Who you play with is very important in online games,” said Craig Fryar, head of global business intelligence and the global business intelligence data engineering team at Wargaming.net. “Using [the graph database], we define relationships between players in the different ways they can interact with each other from platoon and clan membership, to chatting with and shooting at each other.”
Wargaming.net uses a combination of NoSQL database types, as does Zephyr Health. Zephyr Health uses a document database, a graph database, a cache, a relational database, and a NoSQL database service to accommodate its wide range of use cases.