Transactional memory may be the most understandable of the potential “silver bullets” for the highly parallel manycore world we’ve entered. Everyone is familiar with the concept of database transactions, at least at the basic level of “if there’s a conflict, roll back to the boundary.” Many SD Times readers will be aware of how much more complex things can get with nested transactions, retries and guarantees, but nonetheless, the transaction concept is about as straightforward an idea as you have in the area of concurrency. “Transactional memory” applies the idea to the memory conflicts that arise in parallel programming.
The reason this has “silver bullet” potential is that, by far, the single biggest problem in parallel programming is shared, mutable state. Shared mutable state is to concurrency as raw pointers are to memory management, a capability that, yes, has performance benefits and that, yes, can be tamed if the context is controlled very tightly.
As with manual memory management, the problem is that context isn’t very tightly controlled in the real world, and libraries and hastily patched code are sometimes not airtight. And just as pointer bugs can be notoriously difficult to track down, appearing and disappearing with little predictability, so too with concurrency bugs. (If anything, concurrency bugs can be even more fickle and intermittent.)
It’s worth emphasizing that the problem with shared mutable state requires both conditions: a value that never changes that can be shared across threads in perfect safety once it’s been assigned, and everyone being familiar with variables that, within a single thread, change their values over time. (Functional programmers will argue that even within a single thread, one should prefer immutable values to variables, but that’s a discussion for another time.)
However, with shared mutable state, if you do not have higher-level constructs to enforce rules, you can never rely on a variable having the value you expect. Perhaps another thread came along and modified it, even if you just assigned the variable in the previous line of code. This is the same problem whether you’re talking about a screen coordinate in memory or a bank balance in a database.
The answer, in databases, is to add a higher-level construct, the transaction that delimits a logical work unit: “Begin here, end here, and guarantee that the variables I use in between are consistent.” Those with gray hairs will remember databases where such commands would lock the entire row or table; indeed, the performance and deadlock problems that arose are how many of us acquired some of those gray hairs. Today, it’s much more common for databases to use “optimistic locking.” With optimistic locking, the shared values that are to be read or written are tracked over the course of the transaction. If they have not been modified, the transaction succeeds.
The upside is much faster in the normal, conflict-free case. But there are two downsides: First, tracking variables is more complex and involves more overhead than a simple lock; and second, when conflicts arise, they are expensive (since you only discover them at the end, and to resolve them, you have to redo the transaction from the start). In practice, optimistic locking is a big win in most situations.
So, can this model that is successful with databases be applied to memory? Enter “Software Transactional Memory” (STM). It’s very appealing logically, but it’s smart to question whether what works in the world of databases will work equally well in the nanosecond world of a processor.
Just as STM seemed to be gaining considerable traction, a devastating paper cast doubt on whether STM could ever be more than “a research toy.” The authors, IBM’s Calin Cascaval and others, discussed an IBM implementation of STM that failed to perform well. They concluded that “the road ahead for STM is quite challenging. Lowering the overhead of STM to a point where it is generally appealing is a difficult task, and significantly better results have to be demonstrated… We observed that the TM programming model itself, whether implemented in hardware or software, introduces complexities that limit the expected productivity gains.”
In academic circles, that counts as a brutal smackdown, and it chilled the enthusiasm for STM that had been so palpable.
In early February of 2012, Intel announced that its 2013-era Haswell architecture will include hardware support for Transactional Memory. While the so-called TSX extensions are a long way from the high-productivity languages of mainstream development, Intel’s development of them is a strong vote of confidence in the concept of Transactional Memory. Most people following the topic have long acknowledged that the “S” in “Software Transactional Memory” would need hardware help, and Haswell will certainly spur more tries at high-performing implementations.
Whether Cascaval’s criticisms of the semantics of TM can be adequately addressed is still unknown, but I am hopeful. It seems to me that familiarity is half the battle when it comes to programming concepts. Debugging code that rolls back to a transaction boundary is going to be confusing at first, but so too are stack frames. It’s certainly a lot better than dealing with inconsistent state.
Larry O’Brien is a technology consultant, analyst and writer. Read his blog at www.knowing.net.