It’s been eight years since simultaneous multithreading first appeared in popular processors, when Intel shipped it under the name of Hyper-Threading Technology. Since then, numerous companies have been trying to figure out how to get developers to leverage multiple threads on the desktop, whether in a single CPU or in the now-common multicore processor.

If you have paid no attention to this area during these intervening eight years, you have missed almost nothing—a rare assessment of any technology sector’s progress. Numerous attempts to make multiple threads easier for developers to tame met little interest and even less traction. As a result, only general directions for future advances have come into focus.

There is wide agreement on the limitations of the traditional mainline approach of manual thread management, which relies on a panoply of burdensome techniques: individual threads, mutual exclusion, locks, signals, and so forth. These tools work (they are still the primary toolbox), but they create situations that are difficult to develop for and, at times, nearly impossible to debug.
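To make the pain concrete, here is a minimal sketch of the manual style in C++ (the names and counts are invented for illustration): the programmer creates the threads and must remember to take the lock at every access to shared state.

    #include <iostream>
    #include <mutex>
    #include <thread>
    #include <vector>

    // Shared state guarded by an explicit lock. The developer, not the
    // language or the runtime, is responsible for locking correctly at
    // every point where the counter is touched.
    static long counter = 0;
    static std::mutex counter_mutex;

    void worker(int iterations) {
        for (int i = 0; i < iterations; ++i) {
            std::lock_guard<std::mutex> guard(counter_mutex); // omit this and the bug may appear only under load
            ++counter;
        }
    }

    int main() {
        std::vector<std::thread> threads;
        for (int t = 0; t < 4; ++t)
            threads.emplace_back(worker, 100000);
        for (auto& th : threads)
            th.join();
        std::cout << counter << '\n'; // 400000 only if every access was locked
    }

Omit the lock in just one place and the program will still run, and usually still produce the right answer, which is precisely what makes the resulting defect so hard to pin down.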

One of the problems in debugging is the inability to recreate the exact state of the machine necessary to duplicate a bug. As a result, it’s also difficult to do exhaustive testing of threaded code to ensure that no undesirable interactions among threads are occurring.

Intel is probably the vendor that has done the most work in this area, delivering a set of tools that can examine code, look for places where threads might interact incorrectly, and flag them for the developer. It also sells a C++ platform that reduces some of the hard manual effort of coding with threads. In addition, Intel offers a thread-safe C++ template library, Intel Threading Building Blocks (TBB), which provides high-level constructs for thread management.
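As a rough illustration of the style TBB encourages (a sketch, not a tour of the library), parallel_for hands a loop's index range to the library, which splits it into chunks and schedules the chunks across its own worker threads:

    #include <tbb/blocked_range.h>
    #include <tbb/parallel_for.h>
    #include <vector>

    int main() {
        std::vector<double> data(1000000, 1.0);

        // TBB divides the index range into chunks and runs the chunks on its
        // own pool of worker threads; the programmer never creates a thread.
        tbb::parallel_for(tbb::blocked_range<size_t>(0, data.size()),
            [&](const tbb::blocked_range<size_t>& r) {
                for (size_t i = r.begin(); i != r.end(); ++i)
                    data[i] *= 2.0;
            });
        return 0;
    }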

Earlier, Intel was an active supporter of OpenMP, a multivendor-sponsored API with C, C++, and Fortran bindings that very effectively hides thread management. In its principal usage model (although other approaches are widely used), developers mark off parts of their code to be run in parallel, and the OpenMP runtime executes those regions across multiple threads without further involvement from the developer's code. While OpenMP takes away some of the headaches, its benefit is greatest in programs that alternate between serial and parallel sections. Programs that are mostly parallel, or that are decomposed along functional lines, get less lift from OpenMP.
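In that principal usage model, the marking takes the form of a compiler directive. A minimal C++ example (the loop itself is arbitrary):

    #include <cstdio>
    #include <vector>

    int main() {
        const int n = 1000000;
        std::vector<double> a(n, 1.0), b(n, 2.0), c(n);

        // The pragma marks the loop as a parallel region; the OpenMP runtime
        // divides the iterations among its threads. Compiled without OpenMP
        // support, the pragma is ignored and the same loop runs serially.
        #pragma omp parallel for
        for (int i = 0; i < n; ++i)
            c[i] = a[i] + b[i];

        std::printf("c[0] = %f\n", c[0]);
        return 0;
    }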

An entirely different approach (using actors), which I discussed earlier this year, might be a solution. Its core concept is that individual functions are wrapped in an actor, which can perform only a few actions: receive a message, send a message, or create another actor. Hence, data is shipped to the function for work, and the results are shipped to another actor. The messages flowing in and out of actors are all immutable. To change data is to recreate it. (Think of Java strings as an example of this immutability in mainstream programming.)

Moreover, actors are designed so that their actions cannot affect another actor except by sending messages. Because actors never interfere with each other, this model avoids much of the hassle of multithreading, but it requires a different way of thinking. The model has yet to win over the bulk of programmers, who are not used to thinking of data as immutable, nor of data moving through a program via messages.
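A bare-bones sketch of the idea in C++ (the class and message names are invented here; real actor frameworks supply this plumbing for you): each actor owns a private mailbox drained by its own thread, and the rest of the program talks to it only by posting immutable messages.

    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>

    // An immutable message: once built, it is never modified.
    struct Message {
        const std::string text;
    };

    // A minimal "actor": a private mailbox drained by its own thread.
    // Other code interacts with it only by posting messages; it shares
    // no mutable state with the rest of the program.
    class PrinterActor {
    public:
        PrinterActor() : worker(&PrinterActor::run, this) {}

        ~PrinterActor() {
            post(Message{""});            // empty text doubles as a stop signal
            worker.join();
        }

        void post(Message msg) {
            {   // the lock lives inside the mailbox plumbing, not in application code
                std::lock_guard<std::mutex> lock(mtx);
                mailbox.push(std::move(msg));
            }
            ready.notify_one();
        }

    private:
        void run() {
            for (;;) {
                std::unique_lock<std::mutex> lock(mtx);
                ready.wait(lock, [this] { return !mailbox.empty(); });
                Message msg = mailbox.front();
                mailbox.pop();
                lock.unlock();
                if (msg.text.empty()) return;       // stop signal
                std::cout << msg.text << '\n';      // the actor's "work"
            }
        }

        std::queue<Message> mailbox;
        std::mutex mtx;
        std::condition_variable ready;
        std::thread worker;                         // started last, after the mailbox exists
    };

    int main() {
        PrinterActor printer;
        printer.post(Message{"hello"});
        printer.post(Message{"world"});
    }   // the destructor posts the stop message and joins the thread

The mutex and queue are hidden inside the mailbox; application code never takes a lock and never shares mutable data, which is the point of the model.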

The final alternative is one that has fervent admirers: functional programming. In the functional approach (stripped of its academic terminology), parts of the actor model are pursued to their logical end. Most functions have no side effects, functions can be passed around as objects, and most variables are actually constants, and so are immutable. Functional languages strain to create a level of purity in execution that limits the scope of computation in known ways. The approach is well suited to parallel programming, but it is unlikely to gain traction. Functional programming is currently so overloaded with academic overlays, and the body of functional code written for business purposes is so small, that it's hard to see how this area, which has been active for decades, would suddenly break out into the mainstream.
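Even without a functional language, the flavor can be approximated. In this hedged C++ sketch (it assumes a compiler and standard library that support C++17 parallel execution policies), the function has no side effects and its input is never mutated, so the library is free to run the calls in any order, or in parallel:

    #include <algorithm>
    #include <execution>
    #include <iostream>
    #include <vector>

    // A pure function: its result depends only on its argument and it
    // touches no outside state, so calls can run concurrently without locks.
    double squared(double x) { return x * x; }

    int main() {
        const std::vector<double> input = {1.0, 2.0, 3.0, 4.0};  // treated as immutable
        std::vector<double> output(input.size());

        // The parallel policy is safe precisely because squared() has no
        // side effects; the order of the calls cannot matter.
        std::transform(std::execution::par, input.begin(), input.end(),
                       output.begin(), squared);

        for (double v : output) std::cout << v << ' ';
        std::cout << '\n';
    }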

Clearly, no single approach is going to win over developers of client-side and small-scale server applications. Personally, I think the actor model is an excellent option, because threading can be added incrementally, and the core concepts are easy to learn and apply. However, the ultimate solution will come when developers learn in school to treat all problems as multithreaded, and to use that paradigm as the starting point of their design and implementation. In other words, parallel programming is unequivocally going to require a mind shift if it is ever to enjoy wide adoption.

Until then, only software from ISVs will be able to soak up the four or more cores available on today’s client devices.

Andrew Binstock is the principal analyst at Pacific Data Works. Read his blog at binstock.blogspot.com.