AI and machine learning (ML) are finding their way into more applications and use cases. The software testing vendors are increasingly offering “autonomous” capabilities to help customers become yet more efficient. Those capabilities are especially important for Agile and DevOps teams that need to deliver quality at speed. However, autonomous testing capabilities are relatively new, so they’re not perfect or uniformly capable in all areas. Also, the “autonomous” designation does not mean the tools are in fact fully autonomous, they’re merely assistive.

“Currently, AI/ML works great for testing server-side glitches and, if implemented correctly, it can greatly enhance the accuracy and quantity of testing over time,” said Nate Nead, CEO of custom software development services company “Unfortunately, where AI/ML currently fails is in connecting to the full stack, including UX/UI interfaces with database testing. While that is improving, humans are still best at telling a DevOps engineer what looks best, performs best and feels best.”

What to look for in a web and mobile test automation tool
Continuous testing isn’t optional anymore
Forrester’s recommendations for building a successful continuous testing capability has tried solutions from and BMC, and attempted some custom internal processes, but the true “intelligence” is not where imaginations might lead yet, Nead said.

It’s early days
Gartner Senior Director Analyst Thomas Murphy said autonomous testing is “still on the left-hand side of the Gartner Hype cycle.” (That’s the early adopter stage characterized by inflated expectations.)

The good news is there are lots of places to go for help including industry research firms, consulting firms, and vendors’ services teams. Forrester VP and Principal Analyst Diego Lo Giudice created a five-level maturity model inspired by SAE International’s “Levels of Driving Automation” model. Level 5 (the most advanced level) of Lo Giudice’s model, explained in a report, is fully autonomous, but that won’t be possible anytime soon, he said. Levels one through four represent increasing levels of human augmentation, from minimal to maximum. 

The most recent Gartner Magic Quadrant for Software Test Automation included a section about emerging autonomous testing tools. The topic will be covered more in the future, Murphy said.

“We feel at this point in time that the current market is relatively mature, so we’ve retired that Magic Quadrant and our intent is to start writing more about autonomous capabilities and potentially launch a new market next year,” said Murphy. “But first, we’re trying to get the pieces down to talk about the space and how it works.”

Forrester’s Lo Giudice said AI was included in most of the criteria covered in this year’s Continuous Functional Test Automation Wave.

“There was always the question of, tell me if you’re using AI, what for and what are the benefits,” said Lo Giudice. “Most of the tools in the Wave are using AI, machine learning and automation at varying levels of degree, so it’s becoming mainstream of who’s using AI and machine learning.”

How AI and ML are being used in testing
AI and ML are available for use at different points in the SDLC and for different types of testing. The most popular and mature area is UI testing. 

“Applitools allows you to create a baseline of how tolerant you want to be on the differences. If something moved from the upper right-hand corner to the lower left-hand corner, is that a mistake or are you OK with accepting that as the tests should pass?” said Forrester’s Lo Giudice.  

There’s also log file analysis that can identify patterns and outliers. Gartner’s Murphy said some vendors are using log files and/or a web crawling technique to understand an application and how it’s used.

“I’ll look at the UI and just start exercising it and then figure out all the paths just like you used to have in the early days of web applications, so it’s just recursively building a map by talking through the applications,” said Murphy. “It’s useful when you have a very dynamic application that’s content-oriented [like] ecommerce catalogs, news and feeds.”

If the tool understands the most frequently used features of an application it may also be capable of comparing its findings with the tests that have been run.

“What’s the intersection between the use of the features and the test case that you’ve generated? If that intersection is empty, then you have a concern,” said Forrester’s Lo Giudice. “Am I designing and automating tests for the right features? If there’s a change in that space I want to create tests for those applications. This is an optimization strategy, starting from production.”

Natural language processing (NLP) is another AI technique that’s used in some of the testing tools, albeit to bring autonomous testing capabilities to less technical testers. For example, the Gherkin domain specific language (DSL) for Cucumber has a relatively simple syntax :”Given, When, Then,” but natural language is even easier to use.

“There’s a [free and open source] tool called Gauge created by ThoughtWorks [that] combines NLP together with the concept of BDD so now we can start to say you can write requirements using a relatively normal language and from that the tool can figure out what tests you need, when you met the requirement,” said Gartner’s Murphy. “[T]hen, they connect that up to a couple of different tools that create those [tests] for you and run them.”

Parasoft uses AI to simplify API testing by allowing a user to run the record-and-play tool and from that it generates APIs.

“It would tell you which APIs you need to test if you want to go beyond the UI,” said Forrester’s Lo Giudice. 

Some tools claim to be “self-healing,” such as noticing that a path changed based on a UI change. Instead of making the entire test fail, the tool may recognize that although a field moved, the URL is the same and that the test should pass instead of fail.

“Very often when you’re doing Selenium tests you get a bug, [but] you don’t know whether it’s a real bug of the UI or if it’s just the test that fails because of the locator,” said Lo Giudice. “AI and machine learning can help them get over those sorts of things.”

AI and ML can also be used to identify similar tests that have been created over time so the unnecessary tests can be eliminated. uses AI and ML to find and fix runtime errors faster.

“The speed improvements of AI/ML allow for runtime errors to be navigated more quickly, typically by binding and rebinding elements in real time, and moving on to later errors that may surface in a particular batch of code,” said’s Nead. “Currently, the machine augmentation typically occurs in the binding of the elements, real-time alerts and restarts of testing tools without typically long lags between test runtime.”

Do autonomous testing tools require special skills?
The target audience for autonomous software testing products are technical testers, business testers and developers, generally speaking. While it’s never a bad idea to understand the basics of AI and ML, one does not have to be a data scientist to use the products because the vendor is responsible for ensuring the ongoing accuracy of the algorithms and models used in their products. 

“In most cases, you’re not writing the algorithm, you’re just utilizing it. Being able to understand where it might go wrong and what the strengths or weaknesses of that style are can be useful. It’s not like you have to learn to write in Python,” said Gartner’s Murphy.’s Nead said his QA testing leads and DevOps managers are the ones using autonomous testing tools and that the use of the tools differs based on the role and the project in which the person is engaged.

If you want to build your own autonomous testing capabilities, then data scientists and testers should work together. For example, Capgemini explained in a webinar with Forrester that it had developed an ML model for optimizing Dell server testing. Before Dell introduces a new server, it tests all the possible hardware and software configurations, which exceed over one trillion tests.

“They said the 1.3 trillion possible test cases would take a year to test, so they sat down with smart testers and built a machine learning model that looked at the most frequent business configurations used in the last 3, 4, 5 years,” said Forrester’s Lo Giudice. “They used that data and basically leveraging that data, they identified the test cases they had to test for maximum coverage with a machine learning model that tells you this is the minimum number of test cases [you need to run].”

Instead of needing a year to run 1.3 trillion tests, they were able to run a subset of tests in 15 days. 

The Dell example and the use cases outlined above show that autonomous testing can save time and money.

“Speed comes in two ways.  One is how quickly can I create tests? The other is how quickly can I maintain those tests?” said Gartner’s Murphy. “One of the issues people run into when they build automation is that they get swamped with maintenance. I’ve created tons of tests and now how do I run them in the amount of time I have to run them?”

For example, if a DevOps organization completes three builds per hour but testing a build takes an hour, the choices are to wait for the tests to run in sequence or run them in parallel.

“One of the things in CI is don’t break the build. If you start one build, you shouldn’t start another build until you know you have a good build, so if the tests [for three builds] are running [in parallel] I’m breaking the way DevOps works. If we’ve got to wait, then people are laying around before they can test their changes. So if you can say based on the changes you need, you don’t need to run 10,000 tests, just run these 500, that means I can get through a build much faster,” said Murphy.

Similarly, it may be that only 20 tests need to be created instead of 100. Creating fewer tests takes less time and a smaller number of tests takes less time to automate and execute. The savings also extend out to cloud resource usage and testing services.

“The more you can shift the use of AI to the left, the greater your benefits will be,” said Forrester’s Lo Giudice. 

The use of AI and ML in testing is relatively new, with a lot of progress being made in the last 12 to 18 months. However, there is always room for improvement, expansion and innovation.

Perhaps the biggest limitation has to do with the tools themselves. While there’s a tendency to think of AI in general terms, there is no general AI one can apply to everything. Instead, the most successful applications of AI and ML are narrow, since artificial narrow intelligence (ANI) is the state of the art. So, no one tool will handle all types of tests on code regardless of how it was built.

“It’s not just the fact that it’s web or not. It’s this tool works on these frameworks or it works for Node.js but it doesn’t work for the website you built in Java, so we’re focused on JavaScript or PHP or Python,” said Gartner’s Murphy. “Worksoft is focused on traditional legacy things, but the way the tool works, I couldn’t just drop it in and test a generic website.”’s Nead considers a human in the loop a limitation.

“Fixes still require an understanding of the underlying code, [because one needs to] react and make notes when errors appear. The biggest boons to testing are the speed improvements offered over existing systems. It may not be huge yet as much of the testing still requires restarting and review from a DevOps engineer, but taken in the aggregate, the savings do go up over time,” said Nead.

Autonomous testing will continue to become more commonplace because it helps testers do a better job of testing faster and cheaper than they have done in the past. The best way to understand how the tools can help is to experiment with them to determine how they fit with existing processes and technologies.

Over time, some teams may find themselves adopting autonomous testing solutions by default, because their favorite tools have simply evolved.