Malicious agents can crash a website by implementing a DDoS—a Distributed Denial of Service Attack—against a server. So can sloppy programmers.
Take, for example, the National Weather Service’s website, which is operated by the United States National Oceanic and Atmospheric Administration, or NOAA. On August 29, the service went down, hard, as single rogue Android app overwhelmed the NOAA’s servers.
As far as anyone knows, there was nothing deliberately malicious about the Android app, and of course there is nothing specific to Android in this situation. However, the app in question was making service requests of the NOAA server’s public APIs every few milliseconds. With hundreds, thousands or tens of thousands of instances of that app running simultaneously, the NOAA system collapsed.
There is plenty of blame to go around. Let’s start with the app developer.
Certainly the app developer was sloppy, sloppy, sloppy. I can imagine that the app worked great in testing, when only one or two instances of the app were running at any one time on a simulator or on actual devices. Scale it up—boom! This is a case where manual code reviews may have found the problem. Maybe not.
Alternatively, the app developer could have checked to see if the public APIs it required (such as NOAA’s weather API) could handle the anticipated load. However, if the coders didn’t write the software correctly, load testing may not have sufficed. For example, say that the design of the app was to pull data every 10 seconds. If the programmers accidentally set up the data retrieval to pull the data every 10 milliseconds, the load would be 1,000x greater than anticipated. Every 10 seconds, no problem. Every 10 milliseconds, big problem. Boom!
This is a nasty bug, to be sure. Compilers, libraries, test systems, all would verify that the software ran correctly, because it did run correctly. In the scenario I’ve painted, it simply wasn’t coded to meet the design. The bug might have been spotted if someone noticed a very high number of external API calls, or again, perhaps during a manual code review. Otherwise, it’s not hard to see how it would slip through the crack.
Let’s talk about NOAA now. In 2004, the weather service beefed up its Internet loads in anticipation of Hurricane Charley, contracting with Akamai to host some of its busiest Web pages, using distributed edge caching to reduce the load. This worked well, and Akamai continued to work with NOAA. It’s unclear if Akamai also fronted public API calls; my guess is that those were passed straight through to the National Weather Service servers.