Zingg is a new open-source project that aims to solve some of the challenges that arise when multiple data records exist for a single person. It offers data deduplication and entity resolution.
“These records can be in single or multiple systems and they have variations across fields which makes it hard to combine them together, especially with growing data volumes. This hurts customer analytics – establishing lifetime value, loyalty programs or marketing channels is impossible when the base data is not linked,” the project’s maintainers explained on its GitHub page.
AI algorithms applied to such data cannot produce correct results when multiple copies of the same record are passed through them.
Zingg offers the ability to handle any entity, connect disparate data sources, define domain-specific functions to improve matching, and scale to large volumes of data. It also has an interactive training data builder that can build accurate models from small training samples.
It can be used to build unified and trusted views of customers, integrate data silos during mergers and acquisitions, and enrich data from external sources.
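To make the underlying problem concrete, here is a minimal Python sketch of what matching records "with variations across fields" can look like. It is not Zingg's API or its method: Zingg builds models from training samples, whereas this illustration uses a fixed string-similarity threshold from the standard library. The field names, the sample records, the 0.8 threshold, and the helper functions `similarity`, `same_entity`, and `deduplicate` are all assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical field names chosen for this illustration only.
FIELDS = ("name", "email", "city")


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def same_entity(r1: dict, r2: dict, threshold: float = 0.8) -> bool:
    """Treat two records as the same entity if their average field
    similarity reaches the (assumed) threshold."""
    score = sum(similarity(r1[f], r2[f]) for f in FIELDS) / len(FIELDS)
    return score >= threshold


def deduplicate(records: list, threshold: float = 0.8) -> list:
    """Greedy grouping: each record joins the first cluster whose first
    member it matches, otherwise it starts a new cluster."""
    clusters = []
    for rec in records:
        for cluster in clusters:
            if same_entity(cluster[0], rec, threshold):
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters


# Made-up sample data: the first two rows differ only by a spelling variation.
records = [
    {"name": "Jonathan Smith", "email": "jon.smith@example.com", "city": "New York"},
    {"name": "Jonathon Smith", "email": "jon.smith@example.com", "city": "New York"},
    {"name": "Maria Garcia", "email": "m.garcia@example.org", "city": "Austin"},
]

clusters = deduplicate(records)
for cluster in clusters:
    print([r["name"] for r in cluster])
```

A pairwise comparison like this grows quadratically with the number of records and depends on a hand-picked threshold, which hints at why scaling to large data volumes and learning the matching rules from labeled samples are the features the project emphasizes.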