Dunning talked about the interplay between the “great leaps” indicative of modern open-source development and the slow-and-steady progress of taking continuous steps to improve a piece of software. In getting an open-source project ready for enterprise adoption, he stressed rigorous adherence to standards and licenses.
“Part of Apache’s core mission is making software that’s safe for restricted business environments, which means we as an organization pay huge attention to licensing hygiene,” Dunning said. “That’s often the furthest thing from an excited developer’s mind. It’s really important for building a community around commercial adoption. One of the key risks of open source is not knowing where that code came from. With Apache there’s traceability of every line of code and whose responsibility that piece of code was, and it’s a concern that needs to be met for projects that want to be in the big time.”
Incubating Big Data’s future
Dunning said his role in the incubator is one of facilitation, not control. He stressed that the ASF serves primarily as an open-source charity rather than a corporation. The ASF organizational structure rotates as well, so he knows his position heading up the incubator is not a permanent gig.
“It’s an opportunity to contribute in a new kind of way,” said Dunning. “It’s been a long and interesting ride, and it’s exciting to see how [open source] has progressed. Admittedly it’s much, much easier now because of acceptance and the communication we have now to work on open source.”
Going forward, Dunning has many ideas about where the ASF can expand and grow through the Apache Incubator. One of his most active goals is extending the open-source Big Data communities further into Europe and Asia.
“Kylin is a project that started in China’s eBay facilities and it’s now in Apache. There’s a cultural gap to be sure, but there’s huge enthusiasm around embracing open source,” said Dunning.
“The SINGA project originally out of Singapore University that deals with neural networks and deep learning was pushed into Apache and is now a very competitive machine learning project. Tajo is another project out of an Asian development group showing the same trend. It makes the world a lot bigger.”
Dunning also drew attention to the Incubator’s growing focus on integration-oriented projects such as Apache Zeppelin, which he said is breaking ground in providing visualization across different modes of computation. Finally, he mentioned a collection of science-related U.S. government projects including research from NASA’s Jet Propulsion Laboratory (JPL) coming into Apache as open-source projects.
As Dunning explained his vision for the different paths the open-source community around Apache projects might take, that sense of wide-eyed excitement rang throughout. Dunning said his scientific fascination with patterns led him into Big Data, and it’s what motivates him to keep up.
“There are always funny contradictions in life,” said Dunning. “Because things change so quickly, no one really has more than five years of applicable experience in the Big Data world. But on the other hand some old fogeys, which I try not to be, complain about reinvention. Supercomputer guys complain about microprocessor people reinventing optimization techniques. Database people complain about Big Data people. I don’t see it so much as reinvention as this fascinating and joyful realization of patterns occurring over time and across domains. I find the fact that the world does exhibit order, exhibit patterns, as just wondrous.”