When Damien Katz created CouchDB in 2005, he was trying to build a database that could support the next generation of Lotus Notes. Along the way, he and CouchDB became part of the swelling tide of NoSQL databases, like surfers catching a wave. But after six years of work on the project, Katz announced that he would be stepping away from CouchDB to work on the fork currently used in Couchbase.
“I’ve been working on it for so long. I really like working and solving problems and building useful software. It’s what I’ve always done and will continue to do,” said Katz.
As 2011 began, Katz’s CouchDB startup Couchio folded into Memcached-based NoSQL startup Membase, forming the newly named Couchbase company and project. The resulting database combines the upfront caching layer of Memcached with the cold storage-style replicated back end of CouchDB.
Going forward, Katz said that his own focus will now be on Couchbase, and the rewriting of some portions of that database in C and C++. He originally created CouchDB in Erlang. His coworkers at Couchbase, however, will continue to contribute their changes back to Apache, he said.
And that’s a good thing, because much of CouchDB is still written by Couchbase employees. But they are not the only ones with skin in the Apache CouchDB game. Cloudant has bet heavily on Apache CouchDB, and is preparing to contribute its scalability and global replication suite, dubbed BigCouch, to the Apache Software Foundation.
Adam Kocoloski is the founder and CTO of Cloudant. He’s also one of Apache CouchDB’s biggest committers and proponents. And while Couchbase is focused on services and consulting for its titular NoSQL database, Kocoloski’s company offers a hosted CouchDB cloud database service.
Kocoloski said that Katz’s views on Couchbase are mostly in line with Apache CouchDB’s, but that Katz had already been moving away from the Apache side of the project for some time. “Damien did talk about the need to move things to C. If you look at the developers of CouchDB over the last year, you see the same things happen: One of the things that happened in the Erlang community—I think in release R14—they made it cleaner and easier to interact with C,” he said.
“In the performance of any big piece of Erlang software, there are going to be areas that are better written in C. We can disagree about how much of the networking stack needs to be in C, and to the best of my knowledge, Membase still uses Erlang extensively for that… I don’t think Apache CouchDB is all that different.”
Katz echoed that sentiment: “I think Erlang is a fantastic language. I learned a lot from it. I found it extremely useful, and it’s going to play a huge role in our product. But there are certain things it’s not great at, and for as long as I’ve been coding, C has been the king of speed and low-level control. When you really need the performance and the low-level control of machines, and the dollars customers are spending on the hardware or cloud computing matter, you need to do your performance-sensitive stuff in something like C.”
So it would seem that the work to revise CouchDB will now be taking place in two projects: Couchbase and Apache CouchDB. Katz is optimistic about this fork, which had already been in the making for months.
“I’d say forks are good things,” he said. “Linux is a huge ecosystem, and there are many, many forks. I see absolutely no problem with that. I hope other people fork what we work on, and I see that as a sign of a growing ecosystem.”
Kocoloski was less optimistic about the fork. “I worry about it. The Apache license allows for these sorts of things to happen. People can take the code and do with it as they please, and it certainly enabled us to go and hack on things privately for a while without any faults.
“But I do think reducing the number of forks and providing a more unified message about what Apache CouchDB is is a good thing. Even in the community there’s confusion about what’s what. People in IRC come in all the time confused. They say they had this version of CouchDB, but even though it’s the same version number, they actually have Couchbase, not CouchDB. It’s very, very obvious that there are differences in the products, and that the distributions that have sprung up have caused confusion in the community. I am looking forward to pruning off some of those forks.”