Before testers try out new methods that may expose software to many risks, their companies are asking themselves a big question: to test or not to test in production?

As companies move to implement development processes such as DevOps or Continuous Integration and Delivery, testing in production can become an important piece of the equation.

To put it simply, testing in production (TiP) means performing various types of software tests in the production environment, where the software is live and accessible to the end user. It’s rare to find a test environment that completely replicates a production environment, so the scale is not the same, and a test environment won’t expose your software to the same performance-affecting variables as “real life.”

So, as with most processes in software development, experts in the field suggest considering the risks, the rewards and best practices before you test in production.

What’s the point of TiP?
Despite the risks, there are several benefits, which is why a tester should consider testing in production. One major benefit is that testing in production allows you to see how your application works in the live environment in which it runs.

“It would be like testing in development and then saying, ‘Well it works in development, so we don’t have to test in QA,’” said Brad Stoner, senior performance engineer at test company Neotys.

Testing in production can become a necessity when a service becomes more complex, as it becomes difficult to mimic a production environment and know how the service will perform or be utilized by users. The staging environment and the real environment are never the same thing. According to Stoner, some organizations might not have the budget to build an environment that closely resembles the production environment, which is why companies and consultants switch to testing in production.

Stoner also said that testing in production gives a tester confidence in the test results because “You are actually testing against the environment that will take users versus testing in the lower environment.”

How to approach the risks
The process of testing in production depends on the application itself and what is being tested. Several items need to be in check in order to mitigate risks, no matter the testing environment. Businesses could experience a loss of transactions or commingling of test data with production data, and experts agree that a major risk of testing in production is the business risk. A bad user experience, security issues or system crashes could all lead to a loss in profit or tarnish a brand.

Among the risks are:
• Exposing potential vulnerabilities to the public
• Loss of data
• Poor user experience that damages the reputation of the organization
• Relying on users to report defects and vulnerabilities. Many will simply leave the app and not use it again, rather than filing a bug report with the organization

“The best way to mitigate the risks is by taking the right approach and by thinking about the outcomes,” said Margo Visitacion, vice president and principal analyst at Forrester Research. “It’s not about building the features and not just did you deliver on the user end, but did you consider performance and data integrity and test processes?”

Neotys’ Stoner said that measuring an application’s performance in the production environment alone can be a risk, and before launching a test all consequences should be considered.

“There’s always a right way and wrong way to get started,” he said. “Taking what you are doing in lower environments, be it pre-production or staging, and trying to execute that in production can be problematic.”

Before figuring out how to get started, professional testers need to begin by asking this question: “What do we mean by production?” as Rex Black, president of RBCS, states, because there are various processes for multiple production environments.

“You might be testing in production in one production environment—usually a smaller-scale environment—prior to deploying to the (typically larger) other production environments,” he said. His clients also use this kind of testing in production for their beta tests.

Black paraphrased the cons of testing in production with a family-friendly line from “Body Heat,” said by Mickey Rourke’s arsonist character: “There are dozens of ways that testing in production can go wrong, and if you think of half of those ways in advance, you’re a genius.”

TiP in an agile environment
Testing has long belonged to testers and QA. With agile, that has changed: Developers are now often asked to perform more of the testing, and testing can occur on a weekly, daily or hourly basis. In an agile environment, companies could test every code change, so the software is tested each time it evolves, according to Alon Girmonsky, CEO of performance management company BlazeMeter.

Development in today’s agile world results in smaller, more frequent releases. According to Black, the smaller size of the release would mitigate some risk, but the increased frequency of releasing raises the risk of vulnerable code being put out live.

To help with testing in production, companies that adopt an agile workflow need to adopt an agile mindset, according to Forrester’s Visitacion. Doing so will cut down on the possible risks of testing in production.

“You have to have the right product and the right attitude of product owners and Scrum masters, and a development team where they are really thinking about test-driven development,” said Visitacion. “So they are thinking about what is going to be the functional performance of the features that are being delivered, what is the risk in security, elements or the features that are going to be delivered in the application as a whole, and really planning for that in order to be able to identify what needs to be tested and what needs to be automated.”

She said that in an agile environment, it’s important for an agile team to discuss and figure out what are acceptable risks and what are not, because in this environment, “You’re going to be accepting that, in order to get things done faster, you may not be getting everything in terms of functionality.”

Unintentional testing in production
Some say testing in production is akin to skipping testing altogether, since it is pre-deployment testing that catches errors before they reach the application’s users. However, testing in production should not be done as a way to spend less money or do less testing.

“When that (latter) scenario does occur, it’s not referred to as testing in production,” said Rex Black, president of RBCS. “It’s just the very risky way software is put into production.”

Anytime software has been released without being tested, or it’s released thinking adequate testing was done, there is a chance that business-impacting incidents can occur. Black said that when this happens, and the company finds a bug in production, they were “unintentionally” testing in production.

One real-life example he cited of problems occurring after release in software that wasn’t thoroughly tested was T-Mobile’s loss or corruption of data, which led to a permanent loss of customer photos.

An even bigger issue was the memorable launch of HealthCare.gov, where serious security, reliability and performance problems occurred in production because the software was not tested thoroughly.

Tips for TiP
Load and performance testing bring along many challenges, and one of the biggest is the testing environment itself, because most companies do not have the budget to recreate an environment that reflects their production environment. Given the challenges, Brad Stoner, senior performance engineer at Neotys, and his colleagues (namely Tim Hinds, product marketing manager, and Henrik Rexed, performance testing specialist) have broken down some of the confusing components. They’ve given some tips that you should consider before testing in production.

Timing of tests: Stoner said studies show that there is a link between user experience and profit, depending on the application. Running a load test during business hours increases the chances of a bad user experience, so he recommends running tests on the production environment during times that impact fewer real users. This includes during the night (or non-business hours), after deploying a new release, and during maintenance hours. It’s a short window for the performance engineer, but these are the timeslots that will have a lower impact on users.
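The timing advice can be turned into a simple guard that refuses to start a load test outside an agreed low-traffic window. The following is a minimal sketch, not something the Neotys team described: the function name `in_off_peak_window` and the 22:00 to 05:00 window are assumptions chosen for illustration, and real windows depend on the application and its users.

```python
from datetime import datetime, time

# Hypothetical off-peak window (an assumption, not a figure from the
# article): 22:00 to 05:00 local time. The window wraps past midnight.
OFF_PEAK_START = time(22, 0)
OFF_PEAK_END = time(5, 0)


def in_off_peak_window(now: datetime,
                       start: time = OFF_PEAK_START,
                       end: time = OFF_PEAK_END) -> bool:
    """Return True if `now` falls inside the low-traffic window."""
    current = now.time()
    if start <= end:
        # Window sits within a single day, e.g. 01:00 to 04:00.
        return start <= current < end
    # Window wraps past midnight, e.g. 22:00 to 05:00.
    return current >= start or current < end


if __name__ == "__main__":
    if in_off_peak_window(datetime.now()):
        print("Inside the off-peak window: load test may start.")
    else:
        print("Outside the off-peak window: load test postponed.")
```

A check like this only covers the clock. Maintenance windows and post-release slots would still need to be agreed on with the project leader.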

Know the challenges and environment: Never launch a load test on a production environment without knowing anything about the released application’s level of performance. Stoner considers this a big risk, a risk that could even get you fired. Testing environments can differ in ways like the number of servers, types of network equipment, integration with third-party tools, or utilization of a content-delivery network and bandwidth by other users or applications. Even when testing the application in a single environment, they recommend considering several resources (in addition to the performance engineer, who will be ready to react should something impactful occur):
• Operations: They can unplug the current production environment for the test.
• Architect or technical leader of the project: The people who look at logs to identify any potential issues.
• DBA: They manage stability, and will find “blocking points” on the database and will “replug” the production database.
• Project leader: The person who will define timeslots for load testing, and inform users of potential disruptions.

Monitoring: In large organizations, testers might not have access to the operations team’s monitoring tools, so getting visibility of what is going on in production is a challenge. Whether it is functional or performance monitoring, the Neotys group said you need monitoring to see what is happening on the servers or in the databases, and if there is a problem, you can figure out why. “If I didn’t have production monitoring, I would never test in production,” said Stoner. You could say that good monitoring is a requirement to test in production successfully.
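One way to picture why monitoring is treated as a requirement: a production test can be wired to stop as soon as the monitored numbers cross agreed limits. The sketch below is illustrative only. The `Snapshot` fields, the threshold values and the function names are all assumptions, and the samples are passed in as plain data instead of being read from any real monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One monitoring sample; the fields are illustrative assumptions."""
    error_rate: float       # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float   # 95th percentile response time in ms


# Hypothetical limits; real values depend on the application.
MAX_ERROR_RATE = 0.02
MAX_P95_LATENCY_MS = 1500.0


def should_abort(sample: Snapshot) -> bool:
    """Return True when a sample breaches either limit."""
    return (sample.error_rate > MAX_ERROR_RATE
            or sample.p95_latency_ms > MAX_P95_LATENCY_MS)


def first_breach(samples):
    """Return the index of the first sample that breaches a limit,
    or None if every sample stays within the limits."""
    for index, sample in enumerate(samples):
        if should_abort(sample):
            return index
    return None


if __name__ == "__main__":
    observed = [
        Snapshot(error_rate=0.00, p95_latency_ms=300.0),
        Snapshot(error_rate=0.01, p95_latency_ms=900.0),
        Snapshot(error_rate=0.05, p95_latency_ms=1000.0),
    ]
    breach = first_breach(observed)
    if breach is None:
        print("All samples within limits.")
    else:
        print(f"Abort the test: sample {breach} breached a limit.")
```

Without the samples there is nothing to compare against the limits, which is the practical meaning of Stoner's remark.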

Cause chaos on purpose: Causing chaos on purpose is a method of ensuring that production is working as expected, but Stoner says this method is something to be wary of. A “chaos monkey” is code that introduces failures on purpose or kills a Web server in production to cause problems in an environment. The point of it is, if you’re designing a highly resilient application, and the chaos monkey kills one of the tiers, you must be able to maintain levels of service.
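The idea can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions, not a real chaos tool: the `Tier` class, its methods and the instance counts are hypothetical, and the "kill" removes a single simulated instance from a redundant tier where a real chaos monkey would terminate a live server.

```python
import random


class Tier:
    """A tier of identical, redundant instances (simulated in memory)."""

    def __init__(self, name, instance_count):
        self.name = name
        self.alive = {f"{name}-{i}" for i in range(instance_count)}

    def kill_random_instance(self, rng):
        """Chaos step: remove one live instance chosen at random.

        Returns the name of the removed instance, or None if the
        tier has no live instances left.
        """
        if not self.alive:
            return None
        victim = rng.choice(sorted(self.alive))
        self.alive.remove(victim)
        return victim

    def can_serve(self):
        """A tier can serve traffic while at least one instance lives."""
        return len(self.alive) > 0


def service_available(tiers):
    """The service is up only if every tier can still serve traffic."""
    return all(tier.can_serve() for tier in tiers)


if __name__ == "__main__":
    rng = random.Random()
    tiers = [Tier("web", 3), Tier("app", 2), Tier("db", 2)]
    target = rng.choice(tiers)
    victim = target.kill_random_instance(rng)
    print(f"Killed {victim}; service available: {service_available(tiers)}")
```

With at least two instances per tier, losing any single instance leaves the service available. A tier with only one instance would fail the same check, which is the kind of weakness the exercise is meant to surface.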