Jay Kreps, the creator of Kafka, gave the keynote yesterday at the first Kafka Summit in San Francisco, where attendees learned about this Apache real-time stream processing platform and how fast it is growing.
Originally built by Kreps at LinkedIn, Kafka has grown to become one of the most popular stream processing platforms. The project currently has more than 170 contributors and is used by companies such as Comcast, Goldman Sachs and Uber.
Kreps, cofounder of Kafka company Confluent, said that the project has been riding a wave that’s brought stream processing to mainstream computer science. “I think the big takeaway for me has been that this area of stream processing, which has been almost kind of academic and advanced, has really gone mainstream across a bunch of different industries,” he said.
“What that means for Kafka is making sure it continues to fulfill the kind of requirements for a system like that from security to resilience [and] scalability.”
One topic of discussion at the summit was the quick pace of innovation within Kafka. The project was created in 2011, but it’s still maturing, attendees noted. Kreps admitted as much, but said that the velocity of the project has not come at the expense of compatibility.
“I think that’s a fair point,” he said. “This area is moving fast, technically, as a whole community of people figure out how to do this right. One of the things I think we have done right is being very strict about compatibility. The stuff you built three years ago…you don’t have to change it. Even though we may add new features and it may be hard to track what’s happening, things that were already built continue to work.”
With stream processing quickly becoming a popular new buzzword, Kreps said he feels that this might be the new wave in Big Data. “It’s certainly very similar to the Big Data dynamic. People realize there’s something happening and people realize it’s broadly applicable, and a bunch of people jump in.
“Most of those things work with Kafka if you go through and take the set of things that are supposed to be real-time technologies; Kafka is [an] enabling technology that gets data around to systems. In a lot of ways, it’s the base of this streaming platform. We’re at this tipping point where companies have become these digital entities. They are going back and taking all the systems they have and turning them into some sensible architecture. The data architecture makes sense if it’s built around data streams.”