Testing is risky business

Published: January 29th, 2014

Any discussion of risk-based software testing invariably leads to HealthCare.gov, the healthcare exchange website that’s the most prominent face of Obamacare. The site is almost certainly the most obvious software-related failure in recent memory. Its botched rollout last fall generated cringe-worthy stories that continued into 2014.

It doesn’t take any sort of software genius to declare HealthCare.gov a total flop, at least at launch. Talk to those who advocate risk-based testing, though, and you’re likely to hear the same two conclusions about the coding disaster that helped to knock President Obama’s approval rating down to the lowest point in his presidency. First, never think of the cost of a faulty subroutine just in terms of the technical headache it will create. Second, always insist on honest discussions about risk right up the management chain. And by the way, expect both bits of wisdom to be routinely subverted by fellow programmers, higher-ups and broader trends in technology, including a vastly accelerated schedule of code drops.

“There’s no one in the world today that’s developing product who has enough time or other resources to do enough quality assurance,” says Bryce Day, founder and CEO of Catch Software, based in New Zealand.

Focus first on what matters most
Given these constraints, it follows that testers need to make a prioritized list of potential bugs, based on risk. This indeed is the hallmark of risk-based testing. But how do you generate this list? The most common way is to do some sort of mathematical analysis where risk is calculated as the frequency of a given operation multiplied by the impact if that operation fails. (Day elaborated on this definition in a 2012 blog post, “Risk: A four-letter word for quality management?”)

Sounds easy enough, but Day said many software types make a mistake when they apply this sort of thinking just to their blocks of code. Risk-based testing is not about thinking of a path through a piece of code, he insisted, but rather about the path through your business.

To illustrate, Day used the example of software for an ATM. The frequency of users entering their PINs is high, probably nearly 100%. And given that ATMs are basically unusable if an entered PIN isn’t recognized, that means the risk associated with a faulty PIN process is quite high as well. So if you’re writing code, make sure the process of entering PINs is tested six ways to Sunday.

Day’s point is that if you focus on your overall business operation instead of just the workings of your application, it’s easier to arrive at reasonable guesses as to the impact of potential problems. ATMs that botch handling of PINs cost their owners lots of money. ATMs that get buggy when users try to switch to the Portuguese or Polish interface cost their owners less. Neither outcome is good. But obviously testing should focus on any problems with PINs first.
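The frequency-times-impact calculation can be sketched in a few lines of Python. This is only an illustration of the idea, not Day's method: the `Operation` type, the `risk` and `prioritize` helpers, and every frequency and dollar figure below are assumptions invented for the ATM example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    name: str
    frequency: float  # assumed share of sessions that use the operation (0.0 to 1.0)
    impact: float     # assumed business cost in dollars if the operation fails


def risk(op: Operation) -> float:
    """Risk as frequency multiplied by impact."""
    return op.frequency * op.impact


def prioritize(ops: list[Operation]) -> list[Operation]:
    """Return operations ordered from highest to lowest risk."""
    return sorted(ops, key=risk, reverse=True)


# Hypothetical figures, for illustration only.
atm_operations = [
    Operation("switch interface language", frequency=0.02, impact=5_000),
    Operation("enter PIN", frequency=0.99, impact=100_000),
    Operation("print receipt", frequency=0.40, impact=1_000),
]

for op in prioritize(atm_operations):
    print(f"{op.name}: {risk(op):,.0f}")
```

With these made-up numbers, PIN entry lands at the top of the list and the language switch at the bottom, which matches the ordering Day describes. The impact column is expressed as a business cost rather than a code-level severity, in keeping with his point about following the path through the business.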

Like many who preach the merits of risk-based testing, Day said doing some basic calculating of expected costs is a good starting point. However, what’s more essential is insisting on open, direct conversations about risk that go right up the management chain. It’s relatively easy for front-line developers to dial up or down their testing efforts based on the consensus about the appropriate level of risk. The hard part is reaching that consensus in the first place.

One reason agreement is difficult is that feelings about risk naturally differ depending on where you sit in your organization’s hierarchy. Most individual developers and their immediate supervisors are highly risk-averse. That makes sense: the closer you are to the technology, the better you understand how many ways things can go wrong. Move up the chain, though, and you will soon get to management types motivated, at least in the private sector, to shrink time to market and boost sales. And in the public sphere there are the related concerns of gaining advantage in public perception and endless political jockeying.

“In government… the currency by which we measure return on investment is politics,” wrote former NASA CIO Linda Cureton in a Jan. 6 InformationWeek commentary.

A lack of understanding
Another problem is that understanding of risk and complexity among business and software engineering types alike hasn’t necessarily kept up with advances in software generally. The rise of pattern-based languages and frameworks (think Ruby and Ruby on Rails as just one example) makes it easier for relative novices to quickly build Web apps. And the stories of those youthful self-taught programmers who hit it big get set in tech lore like insects in amber.

Self-taught college dropout Jack Dorsey, who claimed programming to be “an art form” and famously took drawing and fashion-design classes even while serving as Twitter CEO, became a billionaire with Twitter’s 2013 IPO. It’s reasonable to ask: If this would-be dressmaker can hit it this big, how hard can anything to do with programming—including testing—be?

Programmers advocating a rigorous approach are under siege from within their own ranks, too. Day pointed his finger at the most zealous advocates of agile programming as the biggest culprits. Like zealots of any persuasion, those advocating iterative and incremental development see their way as the answer to nearly any question. He said the agile attitude seems to be “Don’t worry about bugs in the release; we can fix any that sneak through in two weeks, tops.”

Unfortunately, two weeks may be too much time, and not just for high-profile projects like HealthCare.gov or Windows 8.1, which suffered its own rocky rollout last fall. A series of blog posts from Appurify, a Google Ventures-backed startup in San Francisco working on mobile test automation, described how buggy code can sink user ratings, and thus discoverability and rankings, on the Apple App Store. This is relevant even for popular, well-established apps since, as Appurify’s chief data scientist Krishna Ramamurthi wrote in one post, “The ratings for ‘Current Version’ are featured more prominently on search engine results, and arguably matter more to the average consumer than ratings for ‘All Versions.’ ”

Nischal Varun is director of testing services at IT services firm Mindtree, which earns about a third of its US$430 million in annual revenue from testing. He agreed with most of what Day said about software testing trends, though he said there’s another factor worth considering: increased government regulation, particularly in healthcare.

In September, the FDA issued new guidelines for its oversight of medical apps. Apps subject to regulation—those intended to be used as an accessory to a regulated medical device, or to transform a mobile platform into a regulated medical device—now have to navigate the FDA’s own tailored risk-based approach.

The future, Varun believes, is to move away from having to assess risk from scratch for each project. Mindtree, which focuses on business verticals such as insurance (including regulatory risk and compliance), is exploring how it might offer customers prepackaged risk frameworks, tailored to specific industries or types of applications. An example: Mindtree might approach a bank and its software vendor and say “Look, here are the typical 100 requirements for this type of project, and here are the typical risks associated with each requirement.”

Keep reading up on risk and you’ll eventually stumble on a basic truism, one that applies far beyond the world of software. Human beings are generally lousy at assessing and pricing risks, especially those risks that are very unlikely though catastrophic if they do happen.

Here’s an example. Pick one game to play; you have five seconds to choose. Your first choice: Flip a coin and give me $100 if it comes up heads. Your second choice: Shuffle a deck of cards, flip the top one over, and give me $3,000 if it comes up queen of hearts.

Most people instinctively think something like “well, it’s 50-50 that I’ll be out $100 if I flip a coin and there’s almost no chance that I’ll pick the queen of hearts at random, so give me the deck of cards.” In fact, the card game is the more expensive option. Your expected cost in playing is nearly $58, or $3,000 divided by 52, the number of cards in the deck; the cost of the coin toss is just $50.
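The arithmetic behind the two games is a plain expected-value calculation: the probability of losing multiplied by the amount lost. A minimal sketch follows; the `expected_cost` helper is a name introduced here for illustration, and exact fractions are used so that no rounding creeps in before the final print.

```python
from fractions import Fraction


def expected_cost(probability: Fraction, loss: int) -> Fraction:
    """Expected cost of a one-shot bet: chance of losing times the amount lost."""
    return probability * loss


# Coin toss: 1-in-2 chance of paying $100.
coin = expected_cost(Fraction(1, 2), 100)
# Card draw: 1-in-52 chance of paying $3,000.
card = expected_cost(Fraction(1, 52), 3000)

print(f"coin toss: ${float(coin):.2f}")  # coin toss: $50.00
print(f"card draw: ${float(card):.2f}")  # card draw: $57.69
```

The card game looks safer because the loss is rare, yet its expected cost is higher by a little under $8 per play.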

There are loads of examples of this in the wider tech world, where systems get much more complicated than a random deck of playing cards. And because we’re so bad at pricing risks, James Bach looks askance at most approaches to risk-based testing that come with an ostensible veneer of mathematical rigor.

A more moral testing regime
Any discussion of risk-based software testing eventually leads to Bach, whose 1999 article “Heuristic Risk-Based Testing” still appears near the top of any Google query on the topic. Bach, himself a high school dropout, has the kind of career that’s only possible in the world that software helped to create. He travels the world teaching and consulting on software testing, though when he’s home on Orcas Island, Wash., he’ll offer free coaching for anyone who contacts him via Skype. (His user name: satisfice.)

Bach doesn’t hold back when asked about the usefulness of doing basic calculation of risk when building a test plan: “A lot of people do risk-based testing by coming up with [fake] numbers and multiplying them together in [stupid] ways. That’s not mathematical. It is ritualistic, unscientific and unnecessary.”

He bases his complaint on the fact that assigning actual values of either likelihood or cost of a given bug is astonishingly difficult, which causes most teams to use arbitrary and highly subjective scales for each. One common example is using a 1-10 scale to rank both probability and impact of a list of possible bugs. Multiplying the two numbers together gives a prioritized list, but given the garbage-in/garbage-out nature of the data, the ordering is invariably inaccurate or even meaningless.
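One way to see the fragility of such scales is to watch how easily the ordering flips. The sketch below is a hypothetical illustration, not anything Bach published: the bug names, the scores and the `rank` helper are all invented. It ranks two bugs by probability score times impact score, then nudges a single score by one point, the kind of difference two estimators could easily have.

```python
def rank(scores: dict[str, tuple[int, int]]) -> list[str]:
    """Order bugs by probability score times impact score, highest first."""
    return sorted(
        scores,
        key=lambda name: scores[name][0] * scores[name][1],
        reverse=True,
    )


# Hypothetical 1-10 scores, given as (probability, impact).
first_guess = {"login timeout": (3, 7), "report export": (4, 5)}
# Same bugs, with one impact score raised by a single point.
second_guess = {"login timeout": (3, 7), "report export": (4, 6)}

print(rank(first_guess))   # products are 21 and 20
print(rank(second_guess))  # products are 21 and 24
```

A one-point change in one subjective guess is enough to swap the top priority, which is the garbage-in/garbage-out concern in miniature.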

Instead of faux calculations, Bach urged his clients to interrogate their code by just thinking through all the ways things might go wrong, more like a prosecutor than a programmer. There are many methodologies to pick from, but basically it comes down to systematically working through a series of open-ended questions, much as you would with your teenage daughter who wants to go to a dance with a boy you’ve never met. And whether it’s the prom or release date that’s looming, clear-eyed honesty may be the most important criterion of any test methodology.

Honesty brings us full circle to HealthCare.gov, a topic about which Bach, unsurprisingly, has strong opinions. He expressed many of these in a Nov. 13 blog post, “Healthcare.gov and the tyranny of the innocents,” that lambasts everyone in the management chain who was both clueless about code and unwilling to listen to complaints from their more technical subordinates.

However, Bach doesn’t spare these front-line technologists either, especially as it’s come out that the testing that was done before the site’s launch confirmed that it was nowhere near ready: “Why didn’t you go public? Why didn’t you resign? You like money that much? Your integrity matters that little to you?” he wrote.

Is risk-based testing really as much about morals as methodology? Maybe, though even Bach, who seems to have built his reputation on radical candor, admitted that it’s sometimes hard to do the right thing. And these lapses come with a cost of their own, one that may be hardest of all to put a price tag on.

“I don’t always live up to my highest ideals for my own behavior,” he said, “and when that happens, I feel shame.”

G. Arnold Koch is a writer in Portland, Ore. His last article, “The release management tug of war,” appeared on SDTimes.com in August 2013.
