There has been a lot of emphasis over the last few years on breaking down silos. This includes siloed departments, people, and even applications within a business. But one area that has historically been very siloed is data.
Databases take in data with a specific schema, but with all of the interconnectedness that is required today, odds are data will be coming in with varied schemas.
Martin Fowler’s idea of “polyglot persistence” was a popular one for a while. The term refers to the practice of having multiple databases that each specialize in a particular type of use case, Sanjeev Mohan, analyst at Gartner, explained. “But to be honest, no organization wants to be managing dozens of different databases, [all with] unique skillsets,” he said.
This is where multi-model databases come in.
According to Ken Krupa, VP of global solutions engineering at MarkLogic, a multi-model database is exactly what it sounds like: A database that allows for multiple ways of modeling data. Krupa describes multi-model databases as having a self-describing schema. As data comes in, there is a potential new schema that will come with it.
Krupa emphasized that an important aspect of multi-model databases is that they integrate data from silos. He said that for a long time, people were trained to think that when there was a need to integrate data from multiple schemas, the answer was to create a new model that would fit all of the other models. “Our definition of multi-model says no, all of those models are important, and you want to maintain them all,” he said. “You want to harmonize them in a way as opposed to trying to mash them into a single model.”
Krupa went on to explain that there is a second part of the definition of multi-model databases, one that is from the business perspective. In addition to taking in different types of data from a technical standpoint, they can also take in opinions and perspectives from different silos, he explained. According to Krupa, these different opinions and perspectives may all be valid and may all matter, so “you shouldn’t throw them away and try to beat them into some uber model because ultimately they fail.”
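To make the idea of harmonizing models rather than replacing them more concrete, here is a minimal Python sketch. It is an illustration only, not MarkLogic's API or Krupa's own example: the record shapes, the `harmonize` helper, and the `canonical`/`source` field names are all hypothetical. Each source record keeps its original schema, and a small shared layer is added on top for querying across silos.

```python
# Two hypothetical source records with different schemas, from different silos.
crm_record = {"cust_name": "Acme Corp", "phone": "555-0100"}
billing_record = {"customer": {"legal_name": "ACME Corporation"}, "balance_usd": 1200.0}


def harmonize(source, name):
    """Wrap a source record with a shared field while keeping the original intact.

    The 'canonical' and 'source' keys are assumptions made for this sketch.
    """
    return {"canonical": {"customer_name": name}, "source": source}


store = [
    harmonize(crm_record, crm_record["cust_name"]),
    harmonize(billing_record, billing_record["customer"]["legal_name"]),
]

# Query across silos through the shared field; the source models remain available.
names = [doc["canonical"]["customer_name"] for doc in store]
```

Neither source schema is discarded or forced into the other's shape. The shared field sits alongside the original records, which is the gist of harmonizing instead of mashing everything into a single model.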
The main benefit of multi-model databases, according to Krupa, is agility. They provide a better toolkit for representing the “shape of a business entity.” With multi-model, data scientists don’t have to try to fit a complex hierarchy into rows and columns. “You can represent the fullness of that business entity without compromise,” he said.
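The contrast between a hierarchical business entity and rows and columns can be sketched in a few lines of Python. The customer entity, its field names, and the `flatten_to_rows` helper below are hypothetical and chosen only for illustration; they are not drawn from any particular database product.

```python
# Hypothetical business entity: a customer with nested orders and line items.
customer = {
    "id": "c-1",
    "name": "Acme Corp",
    "orders": [
        {"order_id": "o-1", "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]},
        {"order_id": "o-2", "items": [{"sku": "A", "qty": 5}]},
    ],
}


def flatten_to_rows(entity):
    """Flatten the nested entity into one row per line item,
    roughly what a single rows-and-columns table would require."""
    rows = []
    for order in entity["orders"]:
        for item in order["items"]:
            rows.append(
                (entity["id"], entity["name"], order["order_id"], item["sku"], item["qty"])
            )
    return rows


rows = flatten_to_rows(customer)
```

The nested form keeps the entity in one piece, while the flattened form repeats the customer's details on every row and loses the grouping of items under orders unless it is rebuilt at query time.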
Another benefit of that agility is that change is more acceptable. “There’s a richness in capturing the fullness of the business entity that’s closer to its real world definition, and the agility to handle change,” said Krupa. “That sounds cliche, but that’s the only constant — that change is frequent.”
Krupa added that when combining data from multiple sources, change has a multiplicative, rather than an additive, effect. “It’s not just a summation of all their changes, but it’s really kind of multiplicative effect of this change times that change times that change,” said Krupa. “All you’re doing is dealing with change and uncertainty and you need something flexible to deal with that in a way such that you can kind of embrace the chaos and still get value out of the data and be able to deliver things quickly.”
So with these benefits, why isn’t everyone using multi-model databases? According to Mohan, it is because they’re not quite as good as the specialized “best of breed” databases. But they do solve a number of integration challenges that data teams face, which makes them very attractive, he explained.
Mohan believes a big contributor to their popularity is the fact that they do solve the challenge of integrating multiple skillsets. Rather than having data scientists that only specialize in one type of database model, you have a whole team that is working together on a common model.
Krupa believes that multi-model databases aren’t a fad that will go away. He said he first saw the term being used five or six years ago, and since then it has become a more commonly used term. “I think it’s going to become the norm because what organizations always wanted from a database is a sense of universality and ability to store any kind of data. And the only way we’re going to accomplish that is if we have a database engine that can handle the different ways to do that,” he said.
Mohan believes that while the multi-model approach will continue to get more popular in the next few years, adoption will be based on use cases. For example, if you have a point-of-sale system with well-structured data, then you probably don’t need a multi-model database, but if you’re taking in streaming data, log data, and IoT data and are trying to handle it in the least complex way, then multi-model might make more sense.
“There will be some types of databases that are not ideally suited for multi-model and they may stay separate, such as graph databases, which are usually a different structure and it’s not easy to link graph databases with relational databases. But the movement towards multi-model is only going to increase,” said Mohan.