While there is no solution to concurrency in sight, three programming abstractions have become near-universal building blocks for highly parallel systems: data-parallel loops, Actors, and Futures.
Data-parallel loops are the simplest parallelism that could possibly work. Essentially, they just farm out the loop’s iterations to separate threads. This is an expensive operation compared to ordinary looping, but if the loop’s body is a performance bottleneck, the speedup gained by distributing it across cores or processors can be close to linear (there will always be a little overhead). It’s important that each iteration be independent of the others, which may entail some refactoring of loop-carried dependencies, but, in general, there’s nothing not to like about this approach. Data-parallel looping constructs are widely available in mainstream languages.
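As a minimal sketch of the idea, here is a sequential loop next to its data-parallel counterpart, written in Python with the standard concurrent.futures module. The names body, sequential_loop and parallel_loop are hypothetical, and the squaring function is only a stand-in for an expensive loop body with no loop-carried dependencies.

```python
from concurrent.futures import ThreadPoolExecutor


def body(x):
    # Stand-in for an expensive loop body (hypothetical). It depends only on
    # its own input, so iterations can run in any order.
    return x * x


def sequential_loop(items):
    # The ordinary loop: one iteration after another.
    return [body(x) for x in items]


def parallel_loop(items, workers=4):
    # The same loop farmed out to a pool of worker threads.
    # Executor.map returns results in input order, whatever order they finish in.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(body, items))


squares = parallel_loop(range(8))
```

One caveat about this particular sketch: in standard CPython builds the global interpreter lock keeps threads from executing Python bytecode in parallel, so a CPU-bound body would typically be handed to ProcessPoolExecutor instead. The shape of the code stays the same.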
Actors are objects that, conceptually, run in their own thread (conceptually because system-level threads are typically rather expensive to create and limited in availability, so Actors are usually implemented on top of a fairly complex thread pool provided by the language runtime). A normal Object embodies the only functions that can access that object’s non-public data, and that makes reasoning about the data’s meaning and validity easier than otherwise. An Actor embodies the only thread that can access that object’s non-public members, with similar benefits to reasoning about concurrency. This is so logical that when I first learned OOP, I was surprised to find out it wasn’t the norm.
The Actor has a function that loops, reacting to lightweight incoming messages and posting messages of its own to the outside world. This is almost always referred to as a mailbox metaphor, but I like to call it a WinMain, just to provoke a reaction (unfortunately, it’s getting to the point where not everyone gets the reference). Deep thinkers argue about subtleties of the mailbox (particularly, whether its guarantees about message delivery and ordering are wise or foolish), but if you’ve programmed Windows in C (or a similar architecture), you know the pattern. You also know how it can break down, with hard-to-debug situations where the troublesome program state is encoded somewhere out there in a gazillion message queues.
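The message loop can be sketched in a few lines of Python. This is an illustration rather than a real Actor runtime: CounterActor, its message names and the reply queue are all hypothetical, and it spends one system thread per Actor, which is exactly what production runtimes avoid.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: only its own thread ever touches _count."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message, reply_to=None):
        # The outside world never calls into the actor; it only posts messages.
        self._mailbox.put((message, reply_to))

    def _run(self):
        # The message loop: take one message at a time and react to it.
        while True:
            message, reply_to = self._mailbox.get()
            if message == "stop":
                break
            if message == "increment":
                self._count += 1
            elif message == "report" and reply_to is not None:
                # Replies are themselves messages posted to another queue.
                reply_to.put(self._count)

    def join(self):
        self._thread.join()


actor = CounterActor()
for _ in range(3):
    actor.send("increment")

replies = queue.Queue()
actor.send("report", reply_to=replies)
total = replies.get(timeout=5)

actor.send("stop")
actor.join()
```

Because the mailbox here is a first-in, first-out queue with a single sender, the report is handled after the three increments. Mailboxes with weaker ordering guarantees would not promise that.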
Actors are a high-level solution to problems of consistency and correctness, but they do not solve the horrifying behavioral problems that arise in parallel programming: deadlocks, livelocks, priority inversion, etc. Erlang is the poster boy for the Actor model, and it has certainly garnered a lot of fans. I’ve voiced my skepticism about Actors to some Erlang programmers and, while they acknowledge the theoretical validity of these concerns, I hear from them the exact same “But these don’t seem to be much of a problem in practice” phrase that fans of dynamic languages use regarding type-safety.
The third building block that has clearly established itself is the Future concept. A Future is a placeholder for a calculation: Instead of returning the results of an expensive operation, the function can quickly return a placeholder that says, “When you need this value, I promise to provide it, but for now, here’s my marker.” A newly created Future’s value is indeterminate, but there is the simple guarantee that a call to get() the Future’s value blocks until the value is available and returned. If there are a number of such Futures floating about, there is a good opportunity for a smart runtime to distribute the calculations efficiently. (Or not, as I imagined in my April Fools column for 2010.)
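Here is a small sketch of that contract using Python’s standard concurrent.futures module, where the blocking call is named result() rather than get(). The expensive function and its short sleep are hypothetical stand-ins for a costly calculation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def expensive(n):
    # Hypothetical stand-in for a costly calculation.
    time.sleep(0.2)
    return n * 2


pool = ThreadPoolExecutor(max_workers=2)

# submit() hands back the placeholder right away; the work runs in the pool.
future = pool.submit(expensive, 21)

# ... the caller is free to do other things here ...

# result() blocks until the value is available, then returns it.
value = future.result()

pool.shutdown()
```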
For instance, one variation of the Future concept is called pipelined Promises. Imagine that you have some resource that has a lot of access overhead (a database connection, perhaps). If your database queries are wrapped in Futures (or Promises), the runtime can do the costly infrastructure work of opening the connection and running all the pending Futures through the connection before closing it. (This is just an illustration of the concept. Of course, sharing database connections is already routine practice.)
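The database illustration might be sketched as follows. Everything in it is hypothetical: FakeConnection stands in for the costly resource and merely counts how many times it is opened, and BatchingRunner is a toy that resolves pending Futures only when flush() is called. It shows the batching idea described above and makes no claim about how any real runtime implements pipelined Promises.

```python
from concurrent.futures import Future


class FakeConnection:
    """Hypothetical costly resource; counts how often it is opened."""

    opens = 0

    def __enter__(self):
        FakeConnection.opens += 1
        return self

    def __exit__(self, *exc_info):
        return False

    def run(self, query):
        # Stand-in for real query execution.
        return f"rows for {query}"


class BatchingRunner:
    """Hypothetical runner: collects queries, resolves them in one pass."""

    def __init__(self):
        self._pending = []

    def submit(self, query):
        # Return a placeholder immediately; no connection is opened yet.
        # (Futures are created directly here for illustration only; normally
        # an Executor creates them.)
        future = Future()
        self._pending.append((query, future))
        return future

    def flush(self):
        pending, self._pending = self._pending, []
        if not pending:
            return
        # Pay the connection overhead once for every pending Future.
        with FakeConnection() as conn:
            for query, future in pending:
                future.set_result(conn.run(query))


runner = BatchingRunner()
first = runner.submit("q1")
second = runner.submit("q2")
pending_before_flush = not first.done() and not second.done()
runner.flush()
results = [first.result(), second.result()]
```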
Futures are a simple concept that is very straightforward to program against, and because a Future is essentially just a two-state state machine, it can be implemented as a generic class that wraps existing classes. Futures are also monadic, which makes them easy to reason about at a formal level; that helps when struggling with the notoriously slippery logic of guarantees in a concurrent context.
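To make the two-state machine concrete, here is a minimal generic Future written from scratch in Python. SimpleFuture and its methods are hypothetical names, not a library API, and error handling is left out for brevity (a computation that raises would leave the Future pending forever). The unit and flat_map methods are the two operations that the monadic view rests on.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class SimpleFuture(Generic[T]):
    """Hypothetical two-state placeholder: pending, then resolved."""

    def __init__(self) -> None:
        self._resolved = threading.Event()  # unset = pending, set = resolved
        self._value: Optional[T] = None

    @classmethod
    def unit(cls, value: T) -> "SimpleFuture[T]":
        # Wrap an already-known value in a resolved Future.
        future = cls()
        future.resolve(value)
        return future

    @classmethod
    def start(cls, compute: Callable[[], T]) -> "SimpleFuture[T]":
        # Run compute on its own thread and resolve the Future with its result.
        future = cls()
        worker = threading.Thread(
            target=lambda: future.resolve(compute()), daemon=True
        )
        worker.start()
        return future

    def resolve(self, value: T) -> None:
        # The only state transition: pending -> resolved.
        self._value = value
        self._resolved.set()

    def get(self) -> T:
        # Block until the value is available, then return it.
        self._resolved.wait()
        return self._value

    def flat_map(self, f: Callable[[T], "SimpleFuture[U]"]) -> "SimpleFuture[U]":
        # Chain a Future-returning step without blocking the caller.
        return SimpleFuture.start(lambda: f(self.get()).get())


doubled = SimpleFuture.start(lambda: 21).flat_map(
    lambda x: SimpleFuture.unit(x * 2)
)
answer = doubled.get()
```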
Even in combination, Futures, Actors and data-parallel loops cannot guarantee the absence of perverse bugs. There is no real solution for concurrency yet. But these building blocks can make programming concurrent systems much easier, and they are clearly advances over the locks-and-shared-memory model of concurrency that most of us have used in the past.
Larry O’Brien is a technology consultant, analyst and writer. Read his blog at www.knowing.net.