AI Red Teaming in Practice: Scores, guardrails, auto-remediation
AI in production isn’t just another feature to ship. It’s a non-deterministic system that can be socially engineered, fuzzed, and pushed into failure states you won’t find with traditional testing. Recorded live in Las Vegas at F5’s AppWorld 2026, this episode of Pop Goes the Stack brings Joel Moses together with Jimmy White, F5’s VP of AI Security (via the CalypsoAI acquisition), for a practical look at what AI red teaming actually is and how it works when the attacker is an agent.
Jimmy reframes genAI security as a permutation problem: if there are countless prompt combinations that could unlock sensitive data or trigger unsafe actions, you need genAI-powered red team agents to explore those paths at scale. The discussion covers custom intents, agentic “fingerprints” that reveal not just what was compromised but how it happened, and why that “how” is the key to building protections you can trust.
You’ll also hear how scoring and reporting translate into guardrails, how auto-remediation can be validated with positive and negative test cases before a human publishes changes, and why relying on models to internalize safety isn’t a realistic plan. The conversation closes on agentic AI risk, where tools and permissions matter more than the model’s reasoning, and introduces “thought injection” as a way to redirect unsafe actions without breaking the agent loop.
If you’re building AI apps, deploying MCP-connected systems, or worrying about agents becoming tomorrow’s service accounts, this episode gives you a sharper playbook for testing, governance, and resilience.
Creators and Guests
Host
Joel Moses
Distinguished Engineer and VP, Strategic Engineer at F5, Joel has over 30 years of industry experience in cybersecurity and networking fields. He holds several US patents related to encryption technique.
Producer
Tabitha R.R. Powell
Technical Thought Leadership Evangelist producing content that makes complex ideas clear and engaging.
