Code Watch: Cohesion: The forgotten virtue
March 20, 2012
Coupling and cohesion are paired concepts of software quality. The basic ideas were laid out by Larry Constantine as long ago as 1968 and gained a broad audience with the rise of structured programming in the late 1970s. Today, "loosely coupled" is a ubiquitous virtue in conversation: "Why is dependency injection good?" "Loose coupling!" "Why is Model-View-Controller good?" "Loose coupling!" "Where do you want to go for lunch?" "Loose coupling!" Meanwhile, cohesion is a largely forgotten virtue.
Cohesion is the degree to which a software module's behaviors form a meaningful whole. Higher internal cohesion is good: the things you need to perform a task are localized. In mainstream programming languages, classes and namespaces should be highly cohesive.
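To make the definition concrete, here is a minimal sketch in Python contrasting a grab-bag class with a cohesive one. The names `MiscUtils` and `Invoice` are hypothetical, invented purely for illustration; the point is only that every method of the second class concerns the same task.

```python
class MiscUtils:
    """Low cohesion: the methods share a class but not a purpose."""

    @staticmethod
    def format_currency(amount):
        # Presentation concern.
        return f"${amount:,.2f}"

    @staticmethod
    def is_palindrome(text):
        # Unrelated string puzzle that happens to live here too.
        cleaned = text.lower()
        return cleaned == cleaned[::-1]


class Invoice:
    """Higher cohesion: every method works on the same line items."""

    def __init__(self):
        self._items = []  # list of (description, unit_price, quantity)

    def add_item(self, description, unit_price, quantity=1):
        self._items.append((description, unit_price, quantity))

    def subtotal(self):
        return sum(price * qty for _, price, qty in self._items)

    def tax(self, rate):
        return self.subtotal() * rate

    def total(self, rate):
        return self.subtotal() + self.tax(rate)
```

Everything needed to answer "what does this invoice cost?" is localized in `Invoice`; nothing in `MiscUtils` helps you predict what else might be in it.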
Loose coupling lives in a certain tension with cohesion. Cohesion is great for the task at hand, but loose coupling, the degree to which dependencies between code modules are minimized, is important for extension. Cohesion is a way to judge the surface area of a module; low coupling is something you should look for in the way your module's surface interacts with others. As you go up and down in abstraction, you have to shift your level of attention: When you're working within a module, you don't want to repeat code or behavior for the sake of avoiding a function call or a useful abstraction.
At the higher level of modules, I see too much code that is overworked in service to some mythical day when the underlying implementation will be "swapped out." It's a virtuous ideal, but when you start digging into the module structure, you find that actually getting a computation done requires five different modules. One's hands are tied when it comes to "swapping out" just one of the implementations because of the lack of cohesion. If you have some rework goal, such as parallelization or performance, it turns out that the dependencies between the modules are "loose" only in the trivial sense that they are determined at runtime.
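A small Python sketch of that trivial kind of looseness follows. The registry, the step names and the shared `ctx` dictionary are all assumptions made up for this illustration, not a pattern taken from any particular framework. The steps are resolved by name at runtime, yet each one silently depends on keys written by the others.

```python
# Hypothetical runtime registry of "swappable" steps.
REGISTRY = {}


def register(name):
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap


@register("parse")
def parse(ctx):
    ctx["values"] = [float(x) for x in ctx["raw"].split(",")]
    return ctx


@register("total")
def total(ctx):
    # Hidden dependency: relies on the "values" key that parse() writes.
    ctx["total"] = sum(ctx["values"])
    return ctx


@register("mean")
def mean(ctx):
    # Hidden dependency on both earlier steps.
    ctx["mean"] = ctx["total"] / len(ctx["values"])
    return ctx


def run(step_names, raw):
    """Look up each step by name at runtime and thread ctx through them."""
    ctx = {"raw": raw}
    for name in step_names:
        ctx = REGISTRY[name](ctx)
    return ctx


def mean_of_csv(raw):
    """Cohesive alternative: the whole computation in one place."""
    values = [float(x) for x in raw.split(",")]
    return sum(values) / len(values)
```

Calling `run(["parse", "total", "mean"], "1,2,3")` and `mean_of_csv("1,2,3")` produce the same mean, but replacing or omitting any single registered step breaks the others with a `KeyError`. The coupling is still there; it has merely moved out of sight.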
When working at the level of namespaces or libraries, I think of cohesion in terms of tabs: How many tabs do I have to have open to comprehend what's going on in the code or what my options are? One of object orientation's great benefits is that it provides a model that promotes "informational cohesion": a set of functions on a single data structure. One naturally expects the function for string length in the String class, and scanning a class' methods should give a sense of operations on that data.
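As a rough illustration of informational cohesion, consider the following Python sketch. `Playlist` and the `public_methods` helper are hypothetical names chosen for this example; the class wraps a single data structure (a list of track durations), and scanning its methods tells you what you can do with that data.

```python
class Playlist:
    """All operations act on one data structure: a list of durations."""

    def __init__(self):
        self._durations = []  # seconds per track

    def add(self, seconds):
        self._durations.append(seconds)

    def total_seconds(self):
        return sum(self._durations)

    def longest(self):
        # default=0 keeps an empty playlist from raising ValueError.
        return max(self._durations, default=0)


def public_methods(cls):
    """List the callables defined directly on cls, skipping underscored names."""
    return sorted(
        name
        for name, member in vars(cls).items()
        if callable(member) and not name.startswith("_")
    )
```

Here `public_methods(Playlist)` yields `add`, `longest` and `total_seconds`: one tab open, and the available operations on the data are in front of you.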