This week’s QCon conference took on a range of computer science topics, such as machine learning, parsing and testing. Keynote speakers included security expert Bruce Schneier, LaTeX and Paxos developer Leslie Lamport, and Melody Meckfessel, a Google engineering director.
Jerome Petazzoni, senior engineer at Docker, gave a talk yesterday that laid out the benefits of Linux containers. In the talk, titled “Containerization: More than the new virtualization,” Petazzoni explained that those benefits extend beyond the ability to host multiple applications on a single server.
“Containers are just processes isolated from each other,” he said. “When I start a container, I am starting a normal process, but it has a stamp that says it belongs to a container. That extra stamp is very similar to the User ID you have in a process. It can belong to root, or UID 1000.”
Petazzoni added that sharing information between containers is much simpler than sharing it between virtual machines, which often have to communicate over the network even when they are hosted on the same machine.
Because containers use fine-grained namespaces, Petazzoni said that individual containers can be isolated, or they can share the same namespace as other containers, and thus be allowed to communicate across the namespace.
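The “stamp” Petazzoni describes can be inspected directly on a Linux machine. The following is a minimal sketch, not something shown in the talk, and it assumes the stamp corresponds to the kernel namespace identifiers exposed under /proc/<pid>/ns. The helper names `namespace_ids` and `same_namespace` are made up for illustration.

```python
import os


def namespace_ids(pid="self"):
    """Return {namespace_kind: identifier} for one process.

    Reads the symlinks under /proc/<pid>/ns, which exist only on Linux.
    Returns an empty dict where that directory is unavailable.
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        kinds = os.listdir(ns_dir)
    except OSError:
        return ids
    for kind in kinds:
        try:
            # Each entry is a symlink whose target names the namespace.
            ids[kind] = os.readlink(os.path.join(ns_dir, kind))
        except OSError:
            continue  # Entry not readable for this process; skip it.
    return ids


def same_namespace(pid_a, pid_b, kind):
    """True if both processes report the same identifier for `kind`."""
    a = namespace_ids(pid_a).get(kind)
    b = namespace_ids(pid_b).get(kind)
    return a is not None and a == b


if __name__ == "__main__":
    for kind, ident in sorted(namespace_ids().items()):
        print(kind, ident)
```

Two processes that report the same identifier for a given kind (for example `net`) share that namespace, while a process started in an isolated container would typically report different identifiers for the kinds it does not share.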
Iteration is the key to machine learning
Daniel Tunkelang, director of engineering for search quality at LinkedIn, laid out a data science approach to developing machine-learning algorithms. In his talk, he advocated building machine learning with traceable and explainable building blocks first, then optimizing later.
As with traditional software development, building machine learning algorithms is an iterative process, said Tunkelang. He added that building a complex system from the start will leave you with only one way to measure its effectiveness: the accuracy of the data that comes out.
“Accuracy gives you a very coarse way of evaluating an algorithm,” he said. “It’s very much like debugging code. I’ve gotten a lot of value from linear regression and decision trees. The nice thing about these is that they very clearly favor explainability. The most valuable thing about explainability is that you don’t have to entirely trust your training data if you can debug in this way. But if you have a black box approach, the only indicator you get is that it’s not as accurate as you’d like.”
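Tunkelang's point about explainability can be illustrated with a small sketch. The code below is not from the talk; it fits a one-variable linear regression in plain Python on made-up data, then uses the fitted line to flag the training point it explains worst. The function names and the data are assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Difference between each observed y and the line's prediction."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Made-up training data: y is roughly 2 * x, except for one
# suspicious point at x = 4 that looks like a labeling error.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 4.1, 5.9, 20.0, 10.1, 12.0]

slope, intercept = fit_line(xs, ys)
res = residuals(xs, ys, slope, intercept)

# The model is two readable numbers, and the largest residual
# points at the training example worth a second look.
worst = max(range(len(xs)), key=lambda i: abs(res[i]))
print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"largest residual at x={xs[worst]}, y={ys[worst]}")
```

A black-box model trained on the same data would report only a lower accuracy figure. Here the coefficients and the per-point residuals give a place to start debugging, which is the trade-off the quote describes.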
Tunkelang compared designing machine learning algorithms to kissing frogs in search of a prince: the solution is to kiss lots of frogs at a much faster rate, not to keep kissing the same frog in new and interesting ways.
Keeping parsing simple
Terence Parr, professor of computer science at the University of San Francisco, enlightened QCon attendees about the power of his parsing research. Parr has devoted his research to increasing the strength of the simple and efficient top-down LL parsers, culminating in the powerful ALL(*) strategy of ANTLR 4.
“I’ve made this as powerful as possible, and I’m trading that last bit of generality for performance,” he said.
Parr’s ANTLR parser optimizes itself after a successful parse, and given the same file a second time, it can parse it significantly faster than more generalized parsers. Parr said he has finally completed his parsing research after 25 years, and he’s not quite sure what he’ll be working on next.
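For readers unfamiliar with the top-down LL style that Parr's work builds on, here is a minimal hand-written sketch. It is not ANTLR and does not implement ALL(*); it is a plain recursive-descent parser for arithmetic that decides what to do by looking at one token ahead. The grammar, class, and function names are assumptions for illustration.

```python
import re

# A token is an integer or one of the operator/parenthesis characters.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule, each reading left to right."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # expr: term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term: factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        # factor: INT | '(' expr ')'
        token = self.take()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return value
```

Calling `evaluate("2 + 3 * (4 - 1)")` walks the rules from `expr` down to `factor` and back up. A fixed one-token lookahead like this is the kind of restriction that more powerful LL strategies aim to lift.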
Testing at Amazon
Tim Rath, principal engineer at Amazon (where he is part of the technical leadership team for Amazon Web Services), discussed testing strategies used at the company. He echoed keynote speaker Leslie Lamport by admonishing developers in the audience to write specifications for their code.
“Every class I turn in has to have some specification to it,” he said. “You should be writing things that you know about in the code: How does this interact with other parts of the system? What is its purpose in life? I will put the interaction diagrams in there. Ultimately, I am looking for those truths that I can test and describe, and describe them well.”