Integration Watch: The intractability of parallel programming

Andrew Binstock
September 1, 2010
It’s been eight years since simultaneous multithreading first appeared in popular processors, when Intel shipped it under the name of Hyper-Threading Technology. Since then, numerous companies have been trying to figure out how to get developers to leverage multiple threads on the desktop, whether in a single CPU or in the now-common multicore processor.

If you have paid no attention to this area during these intervening eight years, you have missed almost nothing—a rare assessment of any technology sector’s progress. Numerous attempts to make multiple threads easier for developers to tame met little interest and even less traction. As a result, only general directions for future advances have come into focus.

There is wide agreement on the limitations of the traditional mainline approach of manual thread management, which relies on a panoply of burdensome techniques: individual threads, mutual exclusion, locks, signals and so forth. These tools work (they remain the primary toolbox), but they create programs that are difficult to develop and, at times, nearly impossible to debug.

One of the problems in debugging is the inability to recreate the exact state of the machine necessary to duplicate a bug. As a result, it’s also difficult to do exhaustive testing of threaded code to ensure that no undesirable interactions among threads are occurring.

Intel is probably the vendor that has worked hardest in this area, delivering a set of tools that can examine code, look for places where threads might interact incorrectly, and flag them for the developer. It also sells a C++ toolset that reduces some of the hard manual effort of parallel coding. In addition, Intel offers a thread-safe C++ template library, Intel Threading Building Blocks, which provides high-level functions for thread management.

Earlier, Intel was an active supporter of OpenMP, a multivendor-sponsored library with C, C++ and Fortran interfaces that very effectively hide thread management. In its principal usage model (although other approaches are widely used), developers mark off parts of their code to be run in parallel, and OpenMP executes them in parallel in the background without further involvement by the developer’s code. While OpenMP takes away some of the headaches, its benefit occurs in programs that alternate between serial and parallel sections. Programs that are mostly parallel, or that are decomposed along functional lines, get less lift from OpenMP.



09/01/2010 09:04:00 AM EST

You failed to consider Software Transactional Memory as an alternative to tame the parallelism beast! I understand that MS has thrown in the towel on this route and that not much progress has come out of either the Stanford Pervasive Parallelism Lab or the U.C. Berkeley Parallel Computing Lab efforts. But, come on now, the actor model is quite limited and really only addresses the problem from a tactical perspective. Sandy

Sandy Klausner, United States

09/01/2010 05:52:41 PM EST

Interesting article. By now, it should be obvious to all that threads are not the answer to the parallel programming crisis. In fact, they are the problem. Threads are not only non-deterministic and prone to timing and conflict errors, but the human mind has serious trouble making sense of multiple threads running in parallel. The hard truth is that we will have to change radically to a new software model sooner or later, and a major part of the solution is to do away with threads altogether. The problem is that the computer as we know it will not survive the next revolution. The future is not just non-algorithmic; the antiquated Turing machine model that the industry is so enamored with does not have a role to play in it. The baby boomer geeks shot computing in the foot in the last century, and we are now paying a very heavy price in terms of reliability and productivity. Google "How to Solve the Parallel Programming Crisis" if you're interested in the future of computing.

Louis Savain, United States

09/04/2010 04:46:11 PM EST

I was wondering if you have looked at Erlang. It provides a good mixture of functional programming and the actor-based model. If you want to talk about mainstream programming, Scala is the one: it has taken the concepts from Erlang and implemented them on the JVM.

Yogish Baliga, United States

09/08/2010 01:02:31 PM EST

I think your description is correct, though I see things a bit differently from your implied prescription. I'm not sure that there will be a monolithic, one-size-fits-all "solution" to the parallel programming problem. Rather, I suspect that the industry will gradually come to understand approximate (and probably overlapping) categories for which a given toolset or approach is appropriate. Oh, I'm sure we'll have "parallel programming toolset wars" just like the browser wars, the OS wars and the language wars, but we'll reach some collective understanding of where each technology is most appropriate; practitioners of the art will need some knowledge of several approaches. --Kerry Jones, CTO, Postulate5

Kerry Jones, United States

09/15/2010 11:46:34 AM EST

The actor model has distinct similarities with data-flow architectures and data-flow analysis theory dating from the 1970s onward. See C.A.R. Hoare's "communicating (asynchronous) sequential processes" paper and Tom DeMarco's book on design by dataflow. In particular, the independent "receive/compute/send" model, the lack of need for explicit low-level synchronisation, and automatic concurrency (i.e., what the article's author appears to be lusting after) are provided by most data-flow architectures since the early 1980s.

For an automatically (from the programmer's perspective) auto-synchronised, data-flow-driven application framework, with fan-out and fan-in, see the POSIX/UNIX *process* I/O model (not the more recent POSIX *threads* model). I know there are times when sharing an address space (i.e., threads), in whole or in part, is more time-efficient than piping data packets between processes (which each have a distinct address space), but in many cases the implied "copy" can be partially (UNIX kernel) or fully elided by the runtime system. Also, the cost of inter-address-space context switches is higher than for intra-address-space context switches, and might be intolerable on 200-core machines with a shared globally-coherent memory bus, but for the present this is not an issue for most applications. In future, globally-coherent memory-bus hardware architectures are likely to be in competition with explicit message-passing hardware architectures, for that and many other reasons.

Summary: multi-threading is not always the best way to get concurrency; multi-process with auto-synchronised data-flow is often easier and more reliable (and provides more reliable recovery from partial failure modes), and is considerably easier to distribute across a network. A huge number of developers have (sometimes unwittingly!) been using such a concurrency model for decades already.

Mike Spooner, United Kingdom
