No Result
View All Result
SUBMIT YOUR ARTICLES
  • Login
Sunday, May 10, 2026
TheAdviserMagazine.com
  • Home
  • Financial Planning
    • Financial Planning
    • Personal Finance
  • Market Research
    • Business
    • Investing
    • Money
    • Economy
    • Markets
    • Stocks
    • Trading
  • 401k Plans
  • College
  • IRS & Taxes
  • Estate Plans
  • Social Security
  • Medicare
  • Legal
  • Home
  • Financial Planning
    • Financial Planning
    • Personal Finance
  • Market Research
    • Business
    • Investing
    • Money
    • Economy
    • Markets
    • Stocks
    • Trading
  • 401k Plans
  • College
  • IRS & Taxes
  • Estate Plans
  • Social Security
  • Medicare
  • Legal
No Result
View All Result
TheAdviserMagazine.com
No Result
View All Result
Home Market Research Market Analysis

Please Test Your AI Agents — Like, At All

by TheAdviserMagazine
1 month ago
in Market Analysis
Reading Time: 4 mins read
A A
Please Test Your AI Agents — Like, At All
Share on FacebookShare on TwitterShare on LInkedIn


Recently, there’s been some very public (and, frankly, very funny) AI agent and bot failures.

Like Chipotle’s assistant supporting codegen (since patched): “Stop spending money on Claude Code. Chipotle’s support bot is free” (r/ClaudeCode)

And in a surreal fashion, Washington state’s call-center hotline providing Spanish support by speaking English with a Spanish accent: “Washington state hotline callers hear AI voice with Spanish accent” (AP News)

Coinciding with this, other Forrester analysts and I have had a spate of calls where organizations have launched a new AI agent without testing them.

Put simply, please do not do this.

Please test your AI agents before launching them — some options on how to do this are below.

What do we mean by this?

At minimum: Test all your bot’s features (and use cases) yourself.

For any AI agent, or new feature you’re introducing to it, the minimum effort you should invest is to make sure someone has used it as an end user before this goes live.

This can be as simple as someone on the developer team or as involved as a dedicated testing group. But you need to make sure that someone has actively used your solution — and all its features. This should also be done on an ongoing basis so that when new features are launched, they’re tested, too.

This can be time-intensive, but as we see with the public cases, not everything works as expected all the time.

In fact, AI can go wrong in more unexpected ways than before. If you can’t ensure that features are working as intended, then you might end up on the news.

Please note that this is the minimum possible effort. This is not enough to ensure that something won’t go wrong or your application won’t fail — this will only catch the most obvious/embarrassing outcomes. A more robust testing practice is recommended.

For more on how agentic systems fail: Why AI Agents Fail (And How To Fix Them)

Recommended: Practice red teaming.

A good way to prevent this kind of unexpected permutation is with red teaming or intentionally trying to break the bot. We recommend this as a standard practice for your organization.

There are two sides to this: One is traditional or infosec red teaming. This is focused on finding security exploits. The second is behavioral. This is focused on getting the solution or model to behave in an inappropriate or unintended fashion. It is best to have a practice on both.

At the very least, your team should kick the tires for a day and try as many exploits as possible. Even when you have a governance layer, you must ensure that it’s holding up in the wild or, ideally, even post-launch.

For more on the red team practice: Use AI Red Teaming To Evaluate The Security Posture Of AI-Enabled Applications

For more on standard governance approaches that should be followed: Introducing Forrester’s AEGIS Framework: Agentic AI Enterprise Guardrails For Information Security

For specific common governance failures, see AIUC-1’s page, “The world’s first AI agent standard”

For a fun example of what employee-driven red teaming can look like, check out Anthropic’s write-up, “Project Vend: Can Claude run a small shop? (And why does that matter?)”

Recommended: Test using a testing suite and practice.

Testing an AI agent system that has agentic capabilities is still an emerging field, but rapid progress is being made. To supplement your testing programs (people whose job is to test your AI tools, applications, and agents), testing suites provide additional integrated support. There are two ways to think of testing suites today: synthetic and ongoing agentic.

Synthetic tests are simple — they test your AI agent against a sample of precreated prompts and ideal answers to act as a “golden set” to test against. This allows you to perform a regression test over time to validate the question, “Does our AI agent provide the correct responses?”

But synthetic regression tests are often only performed for an AI agent after some noteworthy change, such as switching out the model used or introducing a number of new use cases. Increasingly, larger testing suites are looking to test automatically and continuously. Other techniques like large language model-as-a-judge can provide supplementary runtime supervision.

(Further work is coming from Forrester on synthetic testing.)

Please note that if you do not have a formal testing program for AI systems, please either hire people for this or hire a testing services company.

For more on building tests, see Anthropic’s, “Demystifying evals for AI agents”

For more on autonomous testing: The Forrester Wave™: Autonomous Testing Platforms, Q4 2025

For how you can make continuous testing work: It’s Time To Get Really Serious About Testing Your AI: Part Two

Recommended: Test with a representative sample.

The ultimate test of your agents, however, will come from your users. They alone determine if you pass or fail. It is in your best interests to make them happy.

The question is: How do we test with real users before production? The answer is a user champion group (or similar convention). These are users who have either volunteered themselves or been selected by you to test what your agent is capable of.

This is easier in internal-facing use cases, as employee groups are more straightforward to assemble, but many customer-facing organizations can achieve the same thing through voluntary test sign-ups.

The risk is that you have users who are an overeager group who don’t make up a representative sample of your user base. In other words, they don’t necessarily represent your average user. This can be avoided through careful group design or, at least, asking users to take on a persona when conducting the test.

If this isn’t possible, you could use a canary test/conditional rollout that can serve as this testbed (though it’s better when it’s voluntary).

For more on building this user champion group internally: Best Practices For Internal Conversational AI Adoption



Source link

Tags: Agentstest
ShareTweetShare
Previous Post

Accel-backed Rentomojo files for India IPO

Next Post

The Spring Market Gets Off to a Rocky Start as the Fed Holds on Rate Cuts

Related Posts

edit post
1 Stock to Buy, 1 Stock to Sell This Week: Applied Materials, Alibaba

1 Stock to Buy, 1 Stock to Sell This Week: Applied Materials, Alibaba

by TheAdviserMagazine
May 10, 2026
0

U.S. inflation data, retail sales, U.S.-Iran developments and the Trump-Xi summit could dominate the coming week. Applied Materials stands out...

edit post
Managed Data Services for Channel Operations: The 2026 Guide to Decision-Grade Intelligence

Managed Data Services for Channel Operations: The 2026 Guide to Decision-Grade Intelligence

by TheAdviserMagazine
May 9, 2026
0

With 94% of midsize organizations now utilizing managed service providers, the shift away from internal data management isn’t just a...

edit post
Enabling Partners with Marketing Assets: The 2026 Strategic Guide

Enabling Partners with Marketing Assets: The 2026 Strategic Guide

by TheAdviserMagazine
May 8, 2026
0

While partner-involved deals are currently 32% larger and close 46% faster than direct sales, most organizations fail to capture this...

edit post
Amazon Opens Its Supply Chain Empire To All — But Is It A Fit For Your Business?

Amazon Opens Its Supply Chain Empire To All — But Is It A Fit For Your Business?

by TheAdviserMagazine
May 8, 2026
0

Amazon’s AWS Playbook: Now Applied To Supply Chain Logistics Per ShipMatrix, in 2025, Amazon surpassed the US Postal Service, FedEx,...

edit post
The Most Expensive Customer Complaint Is The One You Ignore

The Most Expensive Customer Complaint Is The One You Ignore

by TheAdviserMagazine
May 8, 2026
0

In April 2026, a JetBlue customer posted a public complaint about a sudden $230 fare increase. JetBlue replied with a...

edit post
3 Defensive Dividend Stocks to Weather Market Uncertainty

3 Defensive Dividend Stocks to Weather Market Uncertainty

by TheAdviserMagazine
May 8, 2026
0

Amid renewed market turbulence, investors are turning to time-tested defensive names. These three stocks offer resilient dividends and essential products....

Next Post
edit post
The Spring Market Gets Off to a Rocky Start as the Fed Holds on Rate Cuts

The Spring Market Gets Off to a Rocky Start as the Fed Holds on Rate Cuts

edit post
NYC is Handing Out Money to Homeowners Who Want to Build ADUs

NYC is Handing Out Money to Homeowners Who Want to Build ADUs

  • Trending
  • Comments
  • Latest
edit post
Gavin Newsom issues ‘final warning’ amid California’s dire housing crisis — what’s at stake for millions of residents

Gavin Newsom issues ‘final warning’ amid California’s dire housing crisis — what’s at stake for millions of residents

May 3, 2026
edit post
Florida Warning: With Senior SNAP Benefits Averaging 8/Month, Thousands Risk Losing Assistance in 2026

Florida Warning: With Senior SNAP Benefits Averaging $188/Month, Thousands Risk Losing Assistance in 2026

April 27, 2026
edit post
Minnesota Wealth Tax | Intangible Personal Property Tax

Minnesota Wealth Tax | Intangible Personal Property Tax

May 6, 2026
edit post
10 Cheapest High Dividend Stocks With P/E Ratios Under 10

10 Cheapest High Dividend Stocks With P/E Ratios Under 10

April 13, 2026
edit post
Exclusive: America’s largest Black-owned bank launches podcast with mission to unlock hidden shame holding back generational wealth

Exclusive: America’s largest Black-owned bank launches podcast with mission to unlock hidden shame holding back generational wealth

April 29, 2026
edit post
NYC Mayor Mamdani knocked Ken Griffin in pied-a-terre tax promo. His firm calls the move ‘shameful’

NYC Mayor Mamdani knocked Ken Griffin in pied-a-terre tax promo. His firm calls the move ‘shameful’

April 23, 2026
edit post
Economists’ Greatest Fear Is Almost Here

Economists’ Greatest Fear Is Almost Here

0
edit post
Global Market Today: Tech boost lifts Asian stocks as Iran risks push oil higher

Global Market Today: Tech boost lifts Asian stocks as Iran risks push oil higher

0
edit post
How to buy a major Dow component, at a discount

How to buy a major Dow component, at a discount

0
edit post
*HOT* Vivitar Digital Camera Binoculars with Night Vision only .99 shipped (5 value) {Today Only!}

*HOT* Vivitar Digital Camera Binoculars with Night Vision only $79.99 shipped ($435 value) {Today Only!}

0
edit post
Psychology suggests that the loneliest moment in midlife isn’t a holiday or an anniversary — it’s a regular Wednesday afternoon when you realize you don’t actually know who in your life would notice if you went quiet for a week, and the realization arrives so calmly that it takes another few weeks to admit it counts as something worth grieving

Psychology suggests that the loneliest moment in midlife isn’t a holiday or an anniversary — it’s a regular Wednesday afternoon when you realize you don’t actually know who in your life would notice if you went quiet for a week, and the realization arrives so calmly that it takes another few weeks to admit it counts as something worth grieving

0
edit post
Guide to Form 990 and Its Schedules: Understanding Nonprofit Tax Filing

Guide to Form 990 and Its Schedules: Understanding Nonprofit Tax Filing

0
edit post
Global Market Today: Tech boost lifts Asian stocks as Iran risks push oil higher

Global Market Today: Tech boost lifts Asian stocks as Iran risks push oil higher

May 10, 2026
edit post
Economists’ Greatest Fear Is Almost Here

Economists’ Greatest Fear Is Almost Here

May 10, 2026
edit post
The 4% Rule Worked in the Past. Will It Fail the Next Generation of Retirees?

The 4% Rule Worked in the Past. Will It Fail the Next Generation of Retirees?

May 10, 2026
edit post
Markets dip as US-Iran ceasefire goes nowhere, leaving Trump with a military option to reopen Hormuz

Markets dip as US-Iran ceasefire goes nowhere, leaving Trump with a military option to reopen Hormuz

May 10, 2026
edit post
Energy Secretary Says Trump ‘Open’ to Pausing Gas Tax Amid Climbing Prices

Energy Secretary Says Trump ‘Open’ to Pausing Gas Tax Amid Climbing Prices

May 10, 2026
edit post
Mah Sing sees natural ‘spillovers’ from Malaysia’s strong growth

Mah Sing sees natural ‘spillovers’ from Malaysia’s strong growth

May 10, 2026
The Adviser Magazine

The first and only national digital and print magazine that connects individuals, families, and businesses to Fee-Only financial advisers, accountants, attorneys and college guidance counselors.

CATEGORIES

  • 401k Plans
  • Business
  • College
  • Cryptocurrency
  • Economy
  • Estate Plans
  • Financial Planning
  • Investing
  • IRS & Taxes
  • Legal
  • Market Analysis
  • Markets
  • Medicare
  • Money
  • Personal Finance
  • Social Security
  • Startups
  • Stock Market
  • Trading

LATEST UPDATES

  • Global Market Today: Tech boost lifts Asian stocks as Iran risks push oil higher
  • Economists’ Greatest Fear Is Almost Here
  • The 4% Rule Worked in the Past. Will It Fail the Next Generation of Retirees?
  • Our Great Privacy Policy
  • Terms of Use, Legal Notices & Disclosures
  • Contact us
  • About Us

© Copyright 2024 All Rights Reserved
See articles for original source and related links to external sites.

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In
No Result
View All Result
  • Home
  • Financial Planning
    • Financial Planning
    • Personal Finance
  • Market Research
    • Business
    • Investing
    • Money
    • Economy
    • Markets
    • Stocks
    • Trading
  • 401k Plans
  • College
  • IRS & Taxes
  • Estate Plans
  • Social Security
  • Medicare
  • Legal

© Copyright 2024 All Rights Reserved
See articles for original source and related links to external sites.