To cheat or not to cheat. This is the central question for developers faced with an ever-growing number of processor cores in their servers. For the absolute best results, it’s long been understood that development managers and software architects needed to change practices and policies from the ground up. But virtualization makes possible another avenue, one that some would deem to be “cheating.”
Matt Lavallee, director of technology at Multiple Listing Service Property Information Network, said that virtualization allows applications that scale across a data center to also scale across a multicore server’s processors.
“Virtualization lets you cheat,” he said. “Instead of having to learn how to scale, you can run multiple nodes in parallel. That’s the approach we took with our Web servers. Instead of adapting and rewriting, we just run 10 times as many servers on one box.”
Cheating or not, it’s a quick fix for a difficult problem, and determining which approach is best suited to your application could be an even tougher proposition.
Behrooz Zahiri, product marketing manager at Coverity, said that choosing between multithreading and distributed computing is an application-specific affair.
“They both are forms of multitasking, but they are quite different,” he said. “Running on virtual machines is easier, but it usually provides suboptimal results. Sometimes an application is a good fit for the divide-and-conquer approach. If it doesn’t have a lot of dependencies, you can cut it into pieces and do some merging at the end.”
But Irv Badr, senior product manager at IBM Software’s Rational division, said that smart developers know when to let the operating system handle multicore, and when to take hold of threads themselves.
“It is advisable to defer the core affinity and runtime scheduling for multicore to the OS. In order to maximize gains from this revolutionary technology, the architect and systems engineers should own the top-level multicore architecture,” he said. “This allows them to be closer to the stakeholder and customers and control their multicore experiences.”
No more free rides
Bruce Wright, director of operations at search engine company Kosmix.com, said that the ever-increasing number of cores in server processors forces developers to stop relying on Moore’s law to improve performance.
“The clock-speed war we had in the last decade is over. It masked a lot of poor programming,” he said. “No matter what you programmed, it would get faster as clock speeds increased.
“If [your code] will take advantage of multiple cores, sure, you have to worry about locks (for concurrency control), but you end up getting a lot more work done” per computer.
Usman Muzaffar, vice president of product management at Electric Cloud, said that “there’s no shortcut around the fact that to take full advantage of a multicore architecture, you’ve got to be designed from the ground up to be that way.”
He added that there are two ways to approach multicore applications: “With an event model, or a thread model. Events are nice and simple, but they don’t take advantage of multiple cores.” The thread model, on the other hand, is somewhat more complicated to implement, he said.
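To make the contrast concrete, here is a minimal sketch of the event model in Java; the class and its names are illustrative assumptions rather than anything drawn from the companies quoted here. One dispatch thread handles every event in order, which keeps the code simple but leaves the remaining cores idle.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative single-threaded event loop: easy to reason about, but all
// events are handled on one thread, so additional cores go unused.
public class EventLoop {
    private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();

    public void post(Runnable event) {
        events.add(event);                  // producers enqueue work
    }

    public void run() throws InterruptedException {
        while (true) {
            Runnable event = events.take(); // blocks until an event arrives
            event.run();                    // handled sequentially, in order
        }
    }
}
```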
“One architecture that is easy to use to deal with threads is a pool-based architecture,” said Muzaffar. “This is where you have a bunch of worker threads, each of which tries to share as little state as possible. Each has to do its own thing. That’s the architecture I’ve seen most. But that’s a little like saying the solution to traffic problems is to drive safely: It’s easier said than done.”
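The pool-based architecture Muzaffar describes maps naturally onto Java’s standard ExecutorService. The sketch below is an illustration under simple assumptions (squaring numbers stands in for real work), not anyone’s production code: each task owns its own state, and results are merged only at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WorkerPoolSketch {
    public static void main(String[] args) throws Exception {
        // One worker thread per core.
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);

        List<Future<Integer>> results = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            final int workItem = i;
            // Each worker "does its own thing": no shared mutable state,
            // so no locks are needed inside the task.
            results.add(pool.submit(() -> workItem * workItem));
        }

        int total = 0;
        for (Future<Integer> result : results) {
            total += result.get();          // merge on the submitting thread
        }
        pool.shutdown();
        System.out.println("total = " + total);
    }
}
```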
Badr said that multicore development shines a light on many new concerns for developers.
“Currently, most platforms are either architected for sequential processing, or realize concurrency through multitasking, rather than true parallelism provided by a multicore processor. In both these scenarios, common concerns, such as core assignment (affinity), memory sharing, and cache management are not directly addressed.
“Consequently, the application is burdened with implementing the aforementioned features, which significantly increases its complexity, especially in the lower application layers that commonly interact with the multicore platform. This includes, to name a few, device managers, device drivers and input/output systems.”
Tough tests
No matter what approach you choose, Muzaffar said that multithreaded programming will always make things difficult when it comes to testing. “It’s a given that your software is going to have bugs. What makes multithreaded programming a nightmare is that the system is not deterministic anymore,” he said.
But there is a solution. Muzaffar said the team at Electric Cloud solved the problem by writing additional testing tools. “We wanted to come up with a framework, so once we identified a test case or a bug, we could replicate that problem. What you have to come up with is a way to force your threads to synchronize out of order. We wrote a little helper server; applications talk to that server and say, ‘Thread one, go here, now wait. Thread two, go here, wait; now thread one, do what you’re not supposed to do and crash.’”
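Electric Cloud’s helper server is its own tool, but the idea of forcing threads to synchronize out of order can be sketched with standard Java primitives. Everything below is an illustrative assumption, not their implementation: a CountDownLatch acts as the checkpoint that holds thread one until thread two has run, so a lost update that would otherwise appear intermittently is reproduced on every run.

```java
import java.util.concurrent.CountDownLatch;

// Illustrative only: a checkpoint forces a specific interleaving so the
// bug shows up deterministically instead of once in a thousand runs.
public class InterleavingTest {
    private static final CountDownLatch threadTwoDone = new CountDownLatch(1);
    private static int sharedValue = 0;     // intentionally unsynchronized

    public static void main(String[] args) throws InterruptedException {
        Thread one = new Thread(() -> {
            int local = sharedValue;        // "thread one, go here..."
            awaitQuietly(threadTwoDone);    // "...now wait"
            sharedValue = local + 1;        // stale write: the bug under test
        });

        Thread two = new Thread(() -> {
            sharedValue = 42;               // "thread two, go here"
            threadTwoDone.countDown();      // release thread one
        });

        one.start();
        two.start();
        one.join();
        two.join();
        // Thread two's update is lost every time: prints 1, not 43.
        System.out.println("sharedValue = " + sharedValue);
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```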
Coverity’s Zahiri said that moving to multicore doesn’t mean completely rewriting your application. It just means minding your threads and managing your resources properly.
“You do not need to completely rewrite your applications,” he said. “Java provides a lot of help already. There are a lot of safe, secure practices people use. It has proper access controls, and the order of operations is configured to avoid problems. It’s more about taking the extra step and making sure you synchronize those events and make them safe, as opposed to completely rearchitecting your application.”
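As a small illustration of that “extra step” (the classes here are hypothetical), the first counter below loses increments when two threads race on it; swapping its field for an AtomicLong from java.util.concurrent makes it safe without any rearchitecting.

```java
import java.util.concurrent.atomic.AtomicLong;

// Unsafe: count++ is a read-modify-write, so concurrent increments can be lost.
class UnsafeCounter {
    private long count = 0;
    void increment() { count++; }
    long value()     { return count; }
}

// The "extra step": same interface, but safe under concurrent use.
class SafeCounter {
    private final AtomicLong count = new AtomicLong();
    void increment() { count.incrementAndGet(); }
    long value()     { return count.get(); }
}
```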
Zahiri added that most of the problems he sees in his customers’ applications are related, specifically, to poor resource allocation.
He said the three biggest problems are “race conditions, deadlocks and resource leaks. Deadlocks are not that common. Race conditions are more common. But the biggest problem people run into, by far, is resource leakage. The result is typically performance degradation over time, making the program useless.”
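A generic illustration of that leak pattern, not drawn from any customer’s code: if readLine() throws in the first method, the file handle is never released, and leaked handles accumulate until throughput sags and the process eventually fails. Java’s try-with-resources closes the handle on every exit path.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class LeakExample {
    // Leaky: close() is skipped whenever readLine() throws.
    static String firstLineLeaky(String path) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(path));
        String line = reader.readLine();
        reader.close();
        return line;
    }

    // Fixed: the reader is closed on every exit path, normal or exceptional.
    static String firstLineSafe(String path) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            return reader.readLine();
        }
    }
}
```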
Badr said that there is help out there, typically from the operating system. “Enterprise systems commonly defer multicore-related choices to the operating system. In Windows environments, the OS performs many of the chores, such as load balancing, inter-process communication and memory management, freeing the application from using core-affinity, cache-management and communications APIs specific to multicore systems. It is when applications lose both control and monitoring of a multicore application that negative gains and the resulting anxiety occur.”
Coverity’s scans bear out where the real trouble lies, Zahiri said. “People are doing a good job of making threads safe, but when we scan programs, we find there are issues with the way resources are being allocated and used. That ultimately results in the program being useless, but it starts with performance degradation.”
Hardcore future
Badr advises companies to look to model-driven development (MDD) to push their applications into a multicore world. “Model-driven development not only allows for refactoring legacy components into the final multicore application, but it also allows the architect to perform extensive tradeoff studies in different multicore configurations, converging on an optimum solution for the final deployment,” he said.
“Through MDD, the development team can easily refactor and redeploy a single-core legacy application onto multicore systems. At the same time, MDD allows the architect to perform tradeoff studies and what-if scenarios at the model level, hiding the underlying platform complexity and reducing much of the intimidation factor common with adopting new technologies.”
Muzaffar said that he expects multicore programming to get easier over time. The industry, he said, will have to build abstraction layers, and those layers will simplify the creation of parallel solutions.
“There was a time, not too long ago, when programmers had to be responsible for every little thing in the operating system. In time, those were abstracted away. Half of the innovation of Java was not having to deal with memory allocation,” said Muzaffar. He hopes similar advances will take thread management out of developers’ hands, as well.
Those helpers, such as Intel’s Threading Building Blocks library, will be even more useful as multicore processors spread from servers and desktops into mobile devices and embedded appliances.
“Handheld devices will start embracing multicore architecture soon. I think we will have multicore phones in our hands by next year,” said Zahiri.