Mobile devices and the move to the cloud have together created problems for which traditional SQL databases are ill suited. In response, the NoSQL movement has produced dozens of new databases, each built around a particular class of problem.
For the Apache Software Foundation’s CouchDB, replication and stability were the first priorities. On July 14, four years of work culminated in the release of Apache CouchDB 1.0, bringing with it the security features needed for enterprise applications.
Damien Katz created the CouchDB project after leaving the Lotus Notes team. He used his Notes knowledge to build a document database that performs peer-based replication and was designed from the ground up to tolerate node failure.
Version 1.0 puts the finishing touches on CouchDB’s underpinnings, said Katz. The database’s central purpose is reliable replication at large scale. Any node of a CouchDB cluster can be written to, and those changes automatically trickle out to the other nodes, even if some of them are turned off at the time. And it doesn’t matter how those nodes were turned off; Katz said that CouchDB is built to crash.
That’s because the only way to turn off a CouchDB instance is with the Unix “kill” command, which is also what the code itself does when an instance is told to shut down. Katz claimed that, because CouchDB is designed to stop running suddenly, it can never corrupt the data it stores. It may sound unorthodox, but using a crash as the standard termination means an unexpected shutdown follows the same path as a planned one, leaving little room to surprise CouchDB.
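To make the crash-as-shutdown idea concrete, here is a minimal sketch of one way a store can survive being killed mid-write: it only ever appends checksummed records, and on startup it keeps everything up to the first damaged record. This is a toy illustration, not CouchDB’s actual storage code; the function names, the one-record-per-line layout and the CRC32 checksum are all assumptions made for the example.

```python
import json
import os
import tempfile
import zlib


def append_record(path, doc):
    """Append one document as a single checksummed line (hypothetical format)."""
    payload = json.dumps(doc, sort_keys=True)
    checksum = zlib.crc32(payload.encode("utf-8"))
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{checksum:08x} {payload}\n")
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the record to disk


def recover(path):
    """Return every intact record, stopping at the first torn or damaged one."""
    docs = []
    if not os.path.exists(path):
        return docs
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if not line.endswith("\n"):
                break  # the process died before finishing this record
            checksum, _, payload = line.rstrip("\n").partition(" ")
            try:
                intact = int(checksum, 16) == zlib.crc32(payload.encode("utf-8"))
            except ValueError:
                intact = False
            if not intact:
                break
            docs.append(json.loads(payload))
    return docs


def demo():
    """Write two records, simulate a kill halfway through a third, then recover."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy.log")
        append_record(path, {"_id": "a", "value": 1})
        append_record(path, {"_id": "b", "value": 2})
        with open(path, "a", encoding="utf-8") as f:
            f.write('deadbeef {"_id": "c", "val')  # partial record, no newline
        return recover(path)
```

Because earlier records are never overwritten, a kill at any point can cost at most the record that was being written; startup after a crash and startup after a clean stop run the same recovery code.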
“The replication stuff is the killer feature of CouchDB,” said Katz. “A lot of databases have some sort of replication capability, almost always master/slave, so reads can be spread across servers to reduce the read load. CouchDB uses peer-based replication, so any update can happen on any node and automatically replicate out. We have the ability to take a database offline so it’s not connected to its other replicas, individually query it, then push that back to the replicas. It’s fairly unique in the database world.”
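The peer-based model Katz describes can be sketched with a toy in-memory simulation: every node accepts writes, and a two-way sync brings both sides to the same state, even after one of them has been offline. This is not CouchDB’s real protocol, which runs over HTTP and tracks revision histories; the Node class, the revision counter and the name-based tie-break below are simplifying assumptions for illustration only.

```python
class Node:
    """A toy replica: any node accepts writes (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.docs = {}  # doc id -> (revision number, writer name, body)

    def write(self, doc_id, body):
        """Store a new revision of a document locally."""
        previous_rev = self.docs.get(doc_id, (0, "", None))[0]
        self.docs[doc_id] = (previous_rev + 1, self.name, body)

    def replicate_to(self, target):
        """Push every document the target lacks or holds in an older revision."""
        for doc_id, version in self.docs.items():
            current = target.docs.get(doc_id)
            # Compare (revision, writer) so ties resolve the same way everywhere.
            if current is None or version[:2] > current[:2]:
                target.docs[doc_id] = version


def sync(a, b):
    """Two-way replication between peers; neither side is the master."""
    a.replicate_to(b)
    b.replicate_to(a)


def demo():
    server, laptop = Node("server"), Node("laptop")
    server.write("listing-1", {"price": 100})
    sync(server, laptop)

    # The laptop goes offline; both sides keep accepting writes.
    laptop.write("listing-2", {"price": 200})
    server.write("listing-3", {"price": 300})

    # Back online: one sync and the replicas converge.
    sync(laptop, server)
    return server, laptop
```

After demo() runs, both nodes hold all three listings, including the one written on the laptop while it was disconnected.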
Going without schemas
Katz has since founded Couchio, a company based in downtown Oakland that is building office productivity software on top of CouchDB. Along the way, he’s seen the project used in a number of interesting new ways.
“There are lots of businesses building on top of CouchDB,” said Katz. “Some use it for analytics, some use it for large content management.”
No matter what the database is used for, the fact that it does not use a schema saves developers time and energy, he said.
“We’re schema-less. We have the ability to add new types of data into an existing database without having to change the schema,” said Katz.
That helped when a user decided to build a real estate application on top of CouchDB. Because each real estate territory is different, the information held on each listing is also different. Florida homes might list the status of a house’s air conditioning or boat docks, while a house in Kansas might be listed with its distance to the nearest tornado shelter.
In a traditional schema-based database, each separate item would require its own definition and column in the database. That means more work for the DBA every time a new region’s information is poured into the database. But CouchDB, said Katz, can handle new data and new data types at any time.
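The real estate example might look something like the sketch below, in which two listings with different fields sit side by side in one store and nothing has to be declared in advance. A plain Python dictionary stands in for the database here, and the listing field names and helper functions are invented for illustration; only the “_id” key follows CouchDB’s convention for document identifiers.

```python
def save(store, doc):
    """Store a document under its _id; no schema is consulted."""
    store[doc["_id"]] = doc


def with_field(store, field):
    """Return the ids of documents that happen to carry a given field."""
    return sorted(doc_id for doc_id, doc in store.items() if field in doc)


def demo():
    store = {}
    # A Florida listing with region-specific fields (hypothetical names).
    save(store, {
        "_id": "fl-001",
        "state": "FL",
        "price": 450000,
        "air_conditioning": "central",
        "boat_dock": True,
    })
    # A Kansas listing adds a field no earlier document had.
    save(store, {
        "_id": "ks-001",
        "state": "KS",
        "price": 180000,
        "tornado_shelter_miles": 0.4,
    })
    return with_field(store, "boat_dock"), with_field(store, "price")
```

Adding the Kansas listing required no change to the Florida one and no migration step; queries simply skip documents that lack the field they ask about.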
Looking ahead, Katz’s Couchio and the contributors at the Apache Software Foundation are pushing CouchDB onto mobile devices. Katz said there is already a compiler for the Android platform, and that the iPhone will be next, followed by BlackBerry.
As for the project’s status at the Apache Software Foundation, Katz said that the non-profit is a key part of the overall CouchDB plan. Specifically, the foundation serves as a buffer against patent trolls, he said.
“Apache is fantastic. In general, one of the biggest benefits Apache gives us is legal protection,” said Katz.
“There have been lawsuits from a company called Visto, which has sued some other mobile companies about similar sync and replication technology. But being a part of Apache, that makes it really hard for us to get sued. The Apache Foundation is really about making sure these projects are safe.”