It’s April 15, 1996. The Internal Revenue Service (the tax-collecting arm of the United States government) is trying to turn around a huge software disaster. The agency has spent more than US$3.4 billion on the so-called Tax System Modernization project. TSM, however, is an abject failure.
That’s not the first large systems failure, and it won’t be the last. See IEEE Spectrum’s “Why Software Fails.” The article, by Robert Charette, describes 31 large IT project failures from 1992 through 2005. There are lots more, of course.
What brings this up is the epic fail of the U.S. government’s new website intended to help consumers sign up for healthcare exchanges under the Affordable Care Act.
Given the political climate in America, the HealthCare.gov site’s collapse (timeouts, bad query results and worse) rapidly became politicized. Let’s stay away from that and focus on what we know on the technical side, which is that the site appears to be unable to scale. Is it hardware failure? Software bugs? Insufficient bandwidth? Misconfiguration of the app server? Bad SQL queries? Perhaps all of the above.
We don’t even know if the site’s failures are deep-rooted in the application architecture, or might be resolved by throwing a couple of DDR3 RAM modules into some server in the White House basement.
Okay, it’s probably not that simple. However, this does point to five weaknesses endemic to huge projects like this, particularly those run by government agencies (of any government).
1. It is difficult to create complex software with complex requirements. With the number of entities wanting to exert their influence on the Affordable Care Act, the complexity of the law itself, and the vast number of potential users of the software, the requirements for the HealthCare.gov site must be incredibly convoluted. How can anyone validate that the requirements were appropriate, and then test that the systems complied with those requirements? Clearly, that was not possible.
2. It’s nearly impossible to go from zero to 1.0 in one shot. Due to the nature of laws and deadlines, there was no option to create limited functionality, or to do a partial rollout. Like Athena from the head of Zeus, the site had to appear fully formed all at once. Yeah. Good luck with that.
3. It’s hard to write big software while under public scrutiny. Politicians, lawyers and the media are scrutinizing e-mails between government officials and the contractors, looking to point fingers. That’s the culture of government projects, and everyone knows it. That means that during the project’s design and implementation phases, everyone was cagy, and presumably most were careful about what they were saying. Open and honest communications are impossible in that world. Without open and honest communications, success is impossible.
4. If you can’t do large-scale betas, your big software is doomed. Huge corporations, like Apple or Microsoft, spend a lot of time doing betas of their biggest software releases, and even then, they are embarrassed by software failures (think Apple Maps) and defects. Given the nature of the HealthCare.gov project, a big open beta was out of the question. How could anyone expect success?
5. Just because you set an ambitious deadline doesn’t mean you can meet it. The HealthCare.gov site needed lots more development and testing—perhaps months more testing. With fixed deadlines, that didn’t happen. Brooks’s Law says adding more developers to a late project makes it later. What if the project’s deadlines are inflexible? There’s no way out. The only winners, alas, are the politicians.
Alan Zeichick, founding editor of SD Times, is principal analyst of Camden Associates.