Organizations continue to modernize their software development and delivery practices to minimize the impact of business disruption and stay competitive. Many of them have adopted continuous integration (CI) and continuous delivery (CD), but continuous testing (CT) tends to be missing. When CT is absent, software delivery speed and code quality suffer. In short, CT is the missing link required to achieve an end-to-end continuous process.

“Most companies say they are Agile or want to be Agile, so they’re doing development at the speed of whatever Agile practice they’re using, but QA [gets] in the way,” said Manish Mathuria, CTO and founder of digital strategy and services company Infostretch. “If you want to test within the sprint’s boundary, certifying it and managing the business around it, continuous testing is the only way to go.”

Part of the problem has to do with the process-related designations the software industry has chosen, according to Nancy Kastl, executive director of testing at digital transformation agency SPR.

“DevOps should have been called DevTestOps [and] CI/CD should have been called CI/CT/CD,” said Kastl. “In order to achieve that accelerated theme, you need to do continuous testing in parallel with coding. Otherwise, you’re going to be deploying poor quality software faster.”

Although many types of testing have shifted left, CT has not yet become an integral part of an end-to-end, continuous process. When CT is added to CI and CD, companies see another step-change improvement in speed and quality.

Automation Is Key

Test automation is necessary for CT; however, the two are not synonymous. In fact, organizations should update their automated testing strategy to accommodate the end-to-end nature of continuous processes.

“Automated testing is a part of the CI/CD pipeline,” said Vishnu Nallani Chekravarthula, VP and head of Innovation at software development and quality assurance consultancy Qentelli. “All the quality check gates should be automated to ensure there is no manual intervention for code promotion between [the] different stages.”
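As a rough illustration of that “no manual intervention” idea, a quality gate can simply be a script the pipeline runs between stages, where any failing check returns a non-zero exit code and the CI server blocks promotion on its own. The sketch below is in Python; the gate names, test paths, and commands are illustrative assumptions, not drawn from any particular tool.

```python
# Hypothetical sketch of automated quality gates between pipeline stages.
# Gate names, test paths, and commands are illustrative only.
import subprocess
import sys

QUALITY_GATES = [
    ("unit-tests", ["pytest", "tests/unit", "--quiet"]),
    ("api-tests", ["pytest", "tests/api", "--quiet"]),
    ("static-analysis", ["flake8", "src"]),
]

def run_quality_gates() -> bool:
    """Run each gate in order; any failure blocks promotion to the next stage."""
    for name, command in QUALITY_GATES:
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"Gate '{name}' failed -- build is not promoted.")
            return False
        print(f"Gate '{name}' passed.")
    return True

if __name__ == "__main__":
    # A non-zero exit code tells the CI server to stop the pipeline automatically.
    sys.exit(0 if run_quality_gates() else 1)
```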

That’s not to say that manual testing is dead. The critical question is whether manual testing or automated testing is more efficient based on the situation.

“You have to do a cost/benefit analysis [because] the automation has to pay for itself versus ongoing manual execution of the script,” said Neil Price-Jones, president of software testing and quality assurance consultancy NVP Testing. “The best thing I heard was someone who said he could automate a test in twice the time it would take you to write it.”

CT is not about automating everything all the time because that’s impractical.

“Automated testing is often misunderstood as CT, but the key difference is that automated testing is not in-sprint most times [and] CT is always done in-sprint,” Qentelli’s Chekravarthula said.

Deciding what to automate and what not to automate depends on what’s critical and what’s not. Mush Honda, vice president of testing at KMS Technology, considers the business priority issues first.

“From a strategy perspective, you have to consider the ROI,” said Honda. “[During a] project I worked on, we realized that one area of the system was used only 2 percent of the total time, so a business decision was made that only three cases in that 2 percent can be the descriptors, so we automated those.”

In addition to doing a cost/benefit analysis, it’s also important to consider the risks, such as the business impact if a given functionality or capability were to fail.

“The whole philosophy between CI and CD is you want to have the code coming from multiple developers that will get integrated. You want to make sure that code works itself as well as its interdependencies,” said SPR’s Kastl.

Tests That Should Be Included

CT involves shifting many types of tests left, including integration, API, performance and security, in addition to unit tests. All of those tests also apply to microservices.

“For modern technology architectures such as microservices, automated API testing and contract testing are important to ensure that services are able to communicate with each other in a much quicker way than integration tests,” said Chekravarthula. “At various stages in the CI/CD pipeline, different types of tests are executed to ensure the code passes the quality check gates.”
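A consumer-side contract test can be as simple as asserting that a provider’s response still satisfies the fields the consumer depends on. The minimal Python sketch below assumes the requests and jsonschema libraries and an invented orders endpoint; dedicated contract-testing tools such as Pact add broker and versioning workflows on top of this basic idea.

```python
# Minimal consumer-side contract test sketch (hypothetical endpoint and schema).
import requests
from jsonschema import validate, ValidationError

# The "contract": the fields this consumer relies on from the orders service.
ORDER_CONTRACT = {
    "type": "object",
    "required": ["id", "status", "total"],
    "properties": {
        "id": {"type": "string"},
        "status": {"type": "string"},
        "total": {"type": "number"},
    },
}

def test_order_service_honours_contract():
    # Hypothetical internal URL for illustration only.
    response = requests.get("http://orders.internal.example/api/orders/123", timeout=5)
    assert response.status_code == 200
    try:
        validate(instance=response.json(), schema=ORDER_CONTRACT)
    except ValidationError as err:
        raise AssertionError(f"Provider broke the consumer contract: {err.message}")
```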

Importantly, CT occurs throughout the entire SDLC as orchestrated by a test strategy and tools that facilitate the progression of code.

“The tests executed should ensure that the features that are part of a particular commit/release are covered,” said Chekravarthula. “Apart from the in-sprint tests, regression tests and non-functional tests should be executed to ensure that the application that goes into production does not cause failures. At each stage of the CI/CD pipeline, the quality check gates should also reach a specified pass percentage threshold for the build to qualify for promotion to the next stage.”
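To make the pass-percentage gate concrete, here is a hedged sketch; the stage names, thresholds, and numbers are invented for illustration only.

```python
# Illustrative sketch: compute a pass percentage from test results and compare it
# to a per-stage threshold before promoting a build.
STAGE_THRESHOLDS = {"commit": 100.0, "integration": 98.0, "staging": 95.0}

def pass_percentage(passed: int, failed: int) -> float:
    executed = passed + failed  # skipped tests are left out of the percentage here
    return 100.0 if executed == 0 else 100.0 * passed / executed

def qualifies_for_promotion(stage: str, passed: int, failed: int) -> bool:
    rate = pass_percentage(passed, failed)
    threshold = STAGE_THRESHOLDS[stage]
    print(f"{stage}: {rate:.1f}% passed (threshold {threshold}%)")
    return rate >= threshold

# Example: 196 of 200 integration tests passed -> 98.0%, which meets the 98% gate.
assert qualifies_for_promotion("integration", passed=196, failed=4)
```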

Since a microservices application is decomposed into discrete services, each can be tested and certified at the unit level, with integration tests then created above them.

“Functional, integration, end-to-end, security and performance tests [are] incorporated in your continuous testing plan,” said Infostretch’s Mathuria. “In addition, if your microservices-based architecture is to be deployed in a public or private cloud, that brings different or unique nuances of testing for identity management, testing for security. Your static testing could be different, and your deployment testing is definitely different so those are the new aspects that microservices-based architecture has to reflect.”

KMS Technology’s Honda breaks testing into two categories — pre-production and production. In pre-production, he emphasizes contract testing, which includes looking at all the API documentation for the various APIs, verifying that the APIs perform as advertised, and checking to see if there is any linguistic contract behavior within the documentation or otherwise that the microservices leverage. He also considers the different active endpoints of all the APIs, including their security aspects and how they perform.

Then, he considers how to get the testing processes and the automated test scripts in the build and deployment processes facilitated by the tool he’s using. Other pre-production tests include how an API behaves when the data has mutated and how APIs behave from a performance perspective alone and when interacting with other APIs. He also does considerable component integration testing.

“I call out pre-prod and prod buckets specifically instead of calling out dev, QA, staging, UAT, all of those because of the nature of continuous delivery and deployment,” said Honda. “It’s so much simpler to keep them at these two high levels.”

His test automation emphasizes business-critical workflows, including functions whose failure would be chaotic or catastrophic for the application, which ties in with performance and load testing.

“A lot of times what happens is performance and load are kept almost in their own silo, even with microservices and things of that nature. I would encourage the inclusion of load and performance testing to be done just like you do functional validation,” said Honda. “Have [those tests] as a core [element] along with security to make sure that all of the components within the microservice itself are capable and functioning as expected.”

On the production side, he stressed the need for testers to have access to data such as for monitoring and profiling.

“With the shift in the role that testing plays, testers and the team in general should get access to that information [because it] allows us to align with the true business cases that are being applied to the application when it’s live,” said Honda. “How or what areas of the system are being hit based on a particular event in the particular domain or socially? [That] also gives us the ability to get a good understanding of the user profile and what a typically useful style of architecture is. Using all of that data is really good for test scenarios and [for setting] our test strategy.”

He also recommends exploratory testing in production such as A/B testing and canary deployments to ensure the code is stable.

“The key thing that makes all of this testing possible is definitely test data,” said Honda. “When you test with APIs, the three key sorts of test data that I keep in mind are how do you successfully leverage stubs when there’s almost canned responses coming back with[out] more logic necessarily built into those? How do you leverage fakes whereby you are simulating APIs and leveraging anything that’s exposed by the owner of any dependent services? Also, creating mocks and virtualization where we need to make sure that any of the mocks are invoked in a certain manner [which allows] you to focus on component interaction between services or microservices.”
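To ground the distinction Honda draws, here is a minimal Python sketch of the three kinds of test doubles; the service names are invented, and the mock example relies on the standard library’s unittest.mock.

```python
# Illustrative sketches of the three kinds of API test doubles (names invented).
from unittest.mock import Mock

# 1. Stub: a canned response with no real logic behind it.
def stub_get_customer(customer_id):
    return {"id": customer_id, "name": "Test Customer", "status": "active"}

# 2. Fake: a simplified working implementation that simulates the dependent service.
class FakeCustomerService:
    def __init__(self):
        self._store = {}

    def add(self, customer):
        self._store[customer["id"]] = customer

    def get(self, customer_id):
        return self._store.get(customer_id)

# 3. Mock: records how it was called so the test can verify the interaction itself.
def test_order_flow_looks_up_customer_once():
    customer_api = Mock()
    customer_api.get.return_value = {"id": "42", "status": "active"}

    # Imaginary code under test would call the customer service here; we simulate one call.
    customer_api.get("42")

    customer_api.get.assert_called_once_with("42")
```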

In addition to API testing, some testing can be done via APIs, which SPR’s Kastl leverages.

“If I test a service that is there to add a new customer into a database, I can test that the new customer can be added to the service without having a UI screen to collect all the data and add that customer,” said Kastl. “The API services-level testing winds up being much more efficient and with automation, it’s a lot more stable [because] the API services are not going to change as quickly as your UI is changing.”
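A services-level test in that spirit might look like the following Python sketch, which posts a new customer straight to a hypothetical API and reads it back, with no UI screen involved; the endpoint and payload are assumptions for illustration.

```python
# Hypothetical services-level test: add a customer through the API, no UI required.
import requests

BASE_URL = "http://customer-service.test.example/api"  # invented test environment URL

def test_new_customer_can_be_added_via_service():
    payload = {"name": "Ada Lovelace", "email": "ada@example.com"}

    create = requests.post(f"{BASE_URL}/customers", json=payload, timeout=5)
    assert create.status_code == 201
    customer_id = create.json()["id"]

    # Read the record back through the same service layer to confirm it exists.
    fetch = requests.get(f"{BASE_URL}/customers/{customer_id}", timeout=5)
    assert fetch.status_code == 200
    assert fetch.json()["email"] == payload["email"]
```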

Jim Scheibmeir, associate principal analyst at Gartner, underscored the need for authentication and entitlements.

“[Those are] really important when we talk about services: who can get to the data, how much they can get to, is there a threshold to usage or availability,” said Scheibmeir. “Performance is also key, and because this is a complex environment, I want to test integrations, even degrading them, to understand how my composite application works when dependencies are down, or how fast my self-healing infrastructure really is.”

Ensuring the Right Combination of Tests

One obvious test failure is the one that wasn’t foreseen by whoever is testing the code. When deciding which tests to run, KMS Technology’s Honda considers three things: business priority, the data, and test execution time.

Business priority triages test cases based on the business objective. Honda recommends getting business buy-in and developer buy-in when explaining which tests will be included as part of the automation effort, since the business may have thought of something the testers didn’t consider.

Second, the data collected by monitoring tools and other capabilities of the operations team provide insight into application usage, the target, production defects, how the production defects are being classified, how critical they are from a business severity perspective, and the typical workflows that are being used. He also pays attention to execution speed since dependencies can result in unwanted delays.

“There should be some agreed-upon execution SLA and some agreed-upon independent factor that says our test cases are not necessarily depending on one another. It increases the concept of good scalability as your system matures,” said Honda. “One of the routine mistakes I’ve seen a lot of teams make is saying I’m going to execute six test cases, all of which are interdependent. [For example,] I can’t execute script number three unless tests one and two have been executed. [Test independence] becomes critical as your system matures and you have to execute 500 or 50,000 test cases in the same amount of time.”
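One common way to get that independence is for every test to create, and tear down, its own preconditions rather than relying on an earlier test having run. A minimal pytest sketch, with invented data:

```python
# Hedged sketch of test independence: each test builds its own preconditions through a
# fixture, so no test depends on another having executed first, and the suite can be
# reordered or parallelized as it grows.
import pytest

@pytest.fixture
def saved_customer(tmp_path):
    # Illustrative setup: create the record this test needs, independent of other tests.
    record = tmp_path / "customer.json"
    record.write_text('{"id": "42", "status": "active"}')
    yield record
    record.unlink()  # teardown leaves no state behind for later tests to rely on

def test_customer_can_be_deactivated(saved_customer):
    assert saved_customer.exists()  # placeholder for the real assertion

def test_customer_can_be_renamed(saved_customer):
    assert saved_customer.exists()  # runs even if the test above never did
```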

Given that change is a constant, test suites and test strategies should be revisited over time to ensure they’re still relevant.

“It’s always a good idea to constantly look at what has been automated and what outcomes have been achieved,” said Honda. “Most of the time, people realize that it’s an opportunity to trim the fat [when] you get the tests that are no longer relevant.”

As his teams write tests, they include metadata such as the importance of the test, its priority, the associated user story, the associated release, and the specific module, component, or code file the test is executed against. That way, using the metadata, it is possible to choose tests based on goals.

“The type of test and the test’s purpose both have to be incorporated in your decision-making about what tests to run,” said Honda. “You don’t write your tests thinking, ‘I want to test this,’ you write your test and incorporate a lot of metadata in it, then you can use the tests for a specific goal, like putting quality into your CD process or doing a sanity test so you can promote it to the next stage. Then you can mix and match the metadata to arrive at the right suite of tests at runtime without hardcoding that suite.”
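As one hedged illustration of assembling suites from metadata at runtime, pytest markers support this kind of mix-and-match selection; the marker names below are invented and would be registered in the project’s pytest configuration to avoid warnings.

```python
# Illustrative sketch of metadata-driven test selection with pytest markers.
import pytest

@pytest.mark.critical
@pytest.mark.checkout
@pytest.mark.release_2024_06   # hypothetical release tag
def test_order_total_includes_tax():
    assert round(100.00 * 1.08, 2) == 108.00

@pytest.mark.low_priority
@pytest.mark.reporting
def test_monthly_report_renders():
    assert True  # placeholder assertion

# The suite is assembled from metadata at runtime instead of being hardcoded, e.g.:
#   pytest -m "critical and checkout"      # sanity gate before promoting a build
#   pytest -m "release_2024_06"            # everything tied to one release
```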

Performance and security testing should be an integral part of CT given user experience expectations and the growing need for code-related risk management.

“What goes into the right mix of CT tests are those that are going to test the functionality you’re about ready to deploy, as well as critical business functionality that could be impacted by your change plus some performance and security tests,” said SPR’s Kastl. “Taking a risk-based approach is what we do normally in testing and it’s even more important when it comes to continuous testing.”

Past application issues are also an indicator of which tests should be prioritized. While test coverage is always important, it’s not just about test coverage percentages per se. Acceptable test coverage depends on risk, value, cost effectiveness and the impact on a problem-prone area.

Gartner’s Scheibmeir said another way to ensure the right mix of tests is being used is to benchmark against oneself over time, measuring such things as lead time, Net Promoter Score, or the business value of software and releases.