AI regulations are coming: Here’s how to build and implement the best strategy

Published: August 15th, 2024

In April 2024, the National Institute of Standards and Technology released a draft publication aimed to provide guidance around secure software development practices for generative AI systems. In light of these requirements, software development teams should begin implementing a robust testing strategy to ensure they adhere to these new guidelines.

Testing is a cornerstone of AI-driven development as it validates the integrity, reliability, and soundness of AI-based tools. It also safeguards against security risks and ensures high-quality and optimal performance.

Testing is particularly important within AI because the system under test is far less transparent than a coded or constructed algorithm. AI has new failure modes and failure types, such as tone of voice, implicit biases, inaccurate or misleading responses, regulatory failures, and more. Even after completing development, dev teams may not be able to confidently assess the reliability of the system under different conditions. Because of this uncertainty, quality assurance (QA) professionals must step up and become true quality advocates. This designation means not simply adhering to a strict set of requirements, but exploring to determine edge cases, participating in red teaming to try to force the app to provide improper responses, and exposing undetected biases and failure modes in the system. Thorough and inquisitive testing is the caretaker of well-implemented AI initiatives.

Some AI providers, such as Microsoft, require test reports to provide legal protections against copyright infringement. The regulation of safe and confident AI uses these reports as core assets, and they make frequent appearances in both the October 2023 Executive Order by U.S. President Joe Biden on safe and trustworthy AI and the EU AI Act. Thorough testing of AI systems is no longer only a recommendation to ensure a smooth and consistent user experience, it is a responsibility.

What Makes a Good Testing Strategy?

There are several key elements that should be included in any testing strategy:

Risk assessment – Software development teams must first assess any potential risks associated with their AI system. This process includes considering how users interact with a system’s functionality, and the severity and likelihood of failures. AI introduces a new set of risks that need to be addressed. These risks include legal risks (agents making erroneous recommendations on behalf of the company), complex-quality risks (dealing with nondeterministic systems, implicit biases, pseudorandom results, etc.), performance risks (AI is computationally intense and cloud AI endpoints have limitations), operational and cost risks (measuring the cost of running your AI system), novel security risks (prompt hijacking, context extraction, prompt injection, adversarial data attacks) and reputational risks.

An understanding of limitations – AI is only as good as the information it is given. Software development teams need to be aware of the boundaries of its learning capacity and novel failure modes unique to their AI, such as lack of logical reasoning, hallucinations, and information synthesis issues.

Education and training – As AI usage grows, ensuring teams are educated on its intricacies – including training methods, data science basics, generative AI, and classical AI – is essential for identifying potential issues, understanding the system’s behavior, and to gain the most value from using AI.

Red team testing – Red team AI testing (red teaming) provides a structured effort that identifies vulnerabilities and flaws in an AI system. This style of testing often involves simulating real-world attacks and exercising techniques that persistent threat actors might use to uncover specific vulnerabilities and identify priorities for risk mitigation. This deliberate probing of an AI model is critical to testing the limits of its capabilities and ensuring an AI system is safe, secure, and ready to anticipate real-world scenarios. Red teaming reports are also becoming a mandatory standard of customers, similar to SOC 2 for AI.

Continuous reviews – AI systems evolve and so should testing strategies. Organizations must regularly review and update their testing approaches to adapt to new developments and requirements in AI technology as well as emerging threats.

Documentation and compliance – Software development teams must ensure that all testing procedures and results are well documented for compliance and auditing purposes, such as aligning with the new Executive Order requirements.

Transparency and communication – It is important to be transparent about AI’s capabilities, its reliability, and its limitations with stakeholders and users.

While these considerations are key in developing robust AI testing strategies that align with evolving regulatory standards, it’s important to remember that as AI technology evolves, our approaches to testing and QA must evolve as well.

Improved Testing, Improved AI

AI will only become bigger, better, and more widely adopted across software development in the coming years. As a result, more rigorous testing will be needed to address the changing risks and challenges that will come along with more advanced systems and data sets. Testing will continue to serve as a critical safeguard to ensure that AI tools are reliable, accurate and responsible for public use.

Software development teams must develop robust testing strategies that not only meet regulatory standards, but also ensure AI technologies are responsible, trustworthy, and accessible.

With AI’s increased use across industries and technologies, and its role at the forefront of relevant federal standards and guidelines, in the U.S. and globally, this is the opportune time to develop transformative software solutions. The developer community should see itself as a central player in this effort, by developing efficient testing strategies and providing safe and secure user experience rooted in trust and reliability.

You may also like…

The impact of AI regulation on R&D

EU passes AI Act, a comprehensive risk-based approach to AI regulation

Article Tags

AI-driven development, NIST, red team testing

About David Colwell

David Colwell is VP, Artificial Intelligence and Machine Learning, at Tricentis

View all posts by David Colwell

Cookie	Duration	Description
cf_use_ob	past	Cloudflare sets this cookie to improve page load times and to disallow any security restrictions based on the visitor's IP address.
cookielawinfo-checkbox-advertisement	1 year	Set by the GDPR Cookie Consent plugin, this cookie is used to record the user consent for the cookies in the "Advertisement" category .
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
CookieLawInfoConsent	1 year	Records the default button state of the corresponding category & the status of CCPA. It works only in coordination with the primary cookie.
JSESSIONID	session	The JSESSIONID cookie is used by New Relic to store a session identifier so that New Relic can monitor session counts for an application.
PHPSESSID	session	This cookie is native to PHP applications. The cookie is used to store and identify a users' unique session ID for the purpose of managing user session on the website. The cookie is a session cookies and is deleted when all the browser windows are closed.
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Cookie	Duration	Description
__atuvc	1 year 1 month	AddThis sets this cookie to ensure that the updated count is seen when one shares a page and returns to it, before the share count cache is updated.
__atuvs	30 minutes	AddThis sets this cookie to ensure that the updated count is seen when one shares a page and returns to it, before the share count cache is updated.
__cf_bm	30 minutes	This cookie, set by Cloudflare, is used to support Cloudflare Bot Management.

Cookie	Duration	Description
__gads	1 year 24 days	The __gads cookie, set by Google, is stored under DoubleClick domain and tracks the number of times users see an advert, measures the success of the campaign and calculates its revenue. This cookie can only be read from the domain they are set on and will not track any data while browsing through other sites.
_ga	2 years	The _ga cookie, installed by Google Analytics, calculates visitor, session and campaign data and also keeps track of site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognize unique visitors.
_ga_S6PB8V57DG	2 years	This cookie is installed by Google Analytics.
_gat_gtag_UA_846073_1	1 minute	Set by Google to distinguish users.
_gid	1 day	Installed by Google Analytics, _gid cookie stores information on how visitors use a website, while also creating an analytics report of the website's performance. Some of the data that are collected include the number of visitors, their source, and the pages they visit anonymously.
_jsuid	1 year	This cookie contains random number which is generated when a visitor visits the website for the first time. This cookie is used to identify the new visitors to the website.
at-rand	never	AddThis sets this cookie to track page visits, sources of traffic and share counts.
CONSENT	2 years	YouTube sets this cookie via embedded youtube-videos and registers anonymous statistical data.
iutk	5 months 27 days	This cookie is used by Issuu analytic system to gather information regarding visitor activity on Issuu products.
uvc	1 year 1 month	Set by addthis.com to determine the usage of addthis.com service.
vuid	2 years	Vimeo installs this cookie to collect tracking information by setting a unique ID to embed videos to the website.
WMF-Last-Access	1 month 14 hours 26 minutes	This cookie is used to calculate unique devices accessing the website.

Cookie	Duration	Description
__Host-GAPS	2 years	This cookie allows the website to identify a user and provide enhanced functionality and personalisation.
_pxhd	session	Used by Zoominfo to enhance customer data.
IDE	1 year 24 days	Google DoubleClick IDE cookies are used to store information about how the user uses the website to present them with relevant ads and according to the user profile.
loc	1 year 1 month	AddThis sets this geolocation cookie to help understand the location of users who share the information.
mc	1 year 1 month	Quantserve sets the mc cookie to anonymously track user behaviour on the website.
test_cookie	15 minutes	The test_cookie is set by doubleclick.net and is used to determine if the user's browser supports cookies.
VISITOR_INFO1_LIVE	5 months 27 days	A cookie set by YouTube to measure bandwidth that determines whether the user gets the new or old player interface.
YSC	session	YSC cookie is set by Youtube and is used to track the views of embedded videos on Youtube pages.
yt-remote-connected-devices	never	YouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt-remote-device-id	never	YouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt.innertube::nextId	never	This cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen.
yt.innertube::requests	never	This cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen.

Cookie	Duration	Description
__gpi	1 year 24 days	No description
__Secure-YEC	1 year 1 month	No description
_heatmaps_g2g_100754890	10 minutes	No description
_techvalidate_session	session	No description
cf_7166_id	20 years	No description
cf_7166_person_last_update	session	No description
f5avraaaaaaaaaaaaaaaa_session_	session	No description available.
GoogleAdServingTest	session	No description
Gyazo_cfwoker	7 years 2 months 17 days 7 hours	No description
incap_ses_451_2783402	session	No description
incap_ses_769_2783402	session	No description
loglevel	never	No description available.
m	2 years	No description available.
nlbi_2783402	session	No description
prism_252377639	1 month	No description
TS011605d9	session	No description
ustream-guest	session	No description available.
visid_incap_2783402	1 year	No description
xtc	1 year 1 month	No description

AI

AI and Software Development

Observability

Guide to Observability

CI/CD

A guide to CI/CD

Cloud Native

Cloud Native Content

Data

A Guide to Data

Test

Security Testing

Mobile

Mobile Testing

API

Sponsored by Parasoft

Performance

Load & Performance Testing

DevSecOps

A Guide to DevSecOps

Enterprise Security

A Guide to Security

Supply Chain Security

Supply Chain Security

Dev Manager

Dev Managers Content

Agile

A Guide To Agile

Value Stream

A Guide To Value Stream

Productivity

A Guide To Productivity

DevOps

DevOps Content

API

Gravitee.io

AI

AI and Software Development

Value Stream Management

A Guide To Value Stream

AI regulations are coming: Here’s how to build and implement the best strategy

What Makes a Good Testing Strategy?

Improved Testing, Improved AI

Article Tags

Subscribe to SDTimes

About David Colwell

Related Articles

What NIST’s newly approved post-quantum algorithms mean for the future of cryptography

NIST approves three cryptographic algorithms capable of withstanding quantum computers

NIST GenAI program launches to study how to distinguish AI-generated and human-generated content

NIST publishes new draft framework for integrating supply chain security into CI/CD pipelines