AI Red Teaming in Practice: Scores, guardrails, auto-remediation
00:00:05:02 - 00:00:26:21
Joel Moses
Welcome back to Pop Goes to Stack, the podcast where the hype is analyzed by failure domain in impact and interconnect order. We're back at April 2026 here in Las Vegas in our on site series. I'm Joel Moses, leading the conversation again, and today we are joined by F5's's VP of engineering, Jimmy White, to talk AI red teaming and other topics as they come up.
00:00:26:23 - 00:00:49:04
Joel Moses
Jimmy joined us as part of our acquisition of CalypsoAI. And the more I experiment with the technology, Jimmy, the more I'm impressed with what the team has built. You know, building with AI is one thing, proving it's safe, predictable, and resilient under pressure, that's quite another. If you don't mind, let's dig into, how red teaming works in practice. Maybe starting by what it is.
00:00:49:07 - 00:01:21:17
Jimmy White
Yeah. So the idea that you build an application, whether it's AI native first or adjacent, and that AI is somewhere involved, and you think you can test that sufficiently with just human means. It's forgetting about the non-deterministic nature of AI. And so when you think about and introduce that concept, you very quickly realize you need a not just programmatic way of doing this.
00:01:21:17 - 00:02:02:10
Jimmy White
You need a genAI-powered, programmatic way of doing this. And if I was to give the simplest, in my opinion, version of what that means, it's think about genAI as a permutation attack. If there are n ways of creating y prompts, and you're fearful that one of those prompts could get you information or cause an action that is unwanted, then you need to bring genAI in to create those permutations to find the right combination that gives that information.
00:02:02:12 - 00:02:31:20
Jimmy White
So a quick example. If you've got an app that is connected to a system like confluence or a database to an SQL MCP server, and you know there's data in there that is not supposed to be given out somewhere in the database to someone confluence. But you're relying on the built in guardrails of the chosen genAI model you're using, or a gateway,
00:02:31:22 - 00:03:13:12
Jimmy White
then you are hoping that the user doesn't create the combination of prompts that translate into tokens to get the information out. So instead of hoping, I think a better approach is to use Red Team.
Joel Moses
Okay.
Jimmy White
And Red Team is effectively a combination of testing our huge database of attacks--that's one purpose--against the genAI model. And the real purpose of that is to figure out which model is the right model for your use case, because some models are better at protecting, you know, some use cases, others are better at others, some are more expensive, some are cheaper, some are higher fidelity, some are less fidelity.
00:03:13:12 - 00:03:39:20
Jimmy White
But when you're talking about a security use case that the crown jewels you're worried about taking from a confluence or a database, then you want our custom intent. And a custom intent is just that. Think about an attacker. What is their intent? They're trying to steal let's say salary information,
Joel Moses
Right.
Jimmy White
or source code. And remember confluence has loads of snippets right.
Joel Moses
It does.
00:03:40:00 - 00:04:01:17
Jimmy White
And private engineers put them in there expecting they're never going to see the light of day, right? And so what you want is to say in the intent "I want to steal salary information," or "I want to steal source code," or "I want to get an API token or an AWS IAM token." And so you save that, you press go.
00:04:01:19 - 00:04:26:12
Jimmy White
And what Red Team does is we've built a team of red team agents. They have the ability to figure out new attack techniques from research papers. They have the ability to backtrack. So, they'll go down a path that they think they're getting traction on using a known attack or a new attack, and then when they hit a wall or refusal from the model, they'll backtrack and try a different angle.
00:04:26:14 - 00:04:46:21
Jimmy White
And if you visualize that, and we do, we have this thing called agentic fingerprints that shows you all of the pathways the agents took to get to there.
Joel Moses
Right.
Jimmy White
So think about it in Star Wars when, you know, firing that, trust the force Luke, he fired that one attack and it goes right into the
Joel Moses
Right into the hole.
Jimmy White
into the, was it the exhaust or was it the, whatever it was.
Joel Moses
Yeah.
00:04:46:23 - 00:05:13:14
Jimmy White
And
Joel Moses
Womp rats.
Jimmy White
that's, yeah, the practice that was those, yeah.
Joel Moses
Exactly.
Jimmy White
And so you're looking for that sweet spot and an agent will search--I always think about the visual in the matrix. Remember when they're hiding in the submarine, and the robots are coming to find them, and they're sitting there silently. Those robots are our Red Team.
Joel Moses
I see.
Jimmy White
They're searching every nook and cranny
Joel Moses
Gotcha
Jimmy White
of a way to get that information out.
00:05:13:15 - 00:05:21:20
Jimmy White
And we use a series of LLM-as-Judges then to evaluate: was the intent met? And if it was, you have got a vulnerability.
00:05:21:23 - 00:05:50:21
Joel Moses
Okay, so if I can decompose that, the difference from traditional red teaming is your adversaries are AI themselves
Jimmy White
Yes.
Joel Moses
and they're working autonomously. Some of the tooling is remarkably similar it sounds like. Like a red team will commonly use fuzzing tools.
Jimmy White
Yeah.
Joel Moses
But the difference is, instead of perturbing certain values that you witness, you also have additional tools at your disposal to perturb the model even further.
00:05:50:23 - 00:05:59:19
Joel Moses
Things like additional context
Jimmy White
Yes.
Joel Moses
becomes part of it as well. So contextual awareness becomes part of the testing strategy.
00:05:59:21 - 00:06:21:04
Jimmy White
100%. And even if you think about social engineering,
Joel Moses
Yeah.
Jimmy White
a lot of this is social engineering. How can you convince--it's interesting. You've got to get past the out of the box guardrails, but not in an obtuse way that doesn't allow you to get the information. So, you know, that Wayne's World scene that's like, "I need to get in there."
Joel Moses
Right.
00:06:21:05 - 00:06:34:06
Jimmy White
And the guy's like, "You can't get in." He goes, "well, my girlfriend's in there," and he says, "a lot of people's girlfriends are in there." Right? So when you're coming to the, you got to get past that bodyguard, right?
Joel Moses
Right.
Jimmy White
But then just getting into the area doesn't mean you'll get to meet the band. Right?
Joel Moses
Right.
Jimmy White
So you got to, there's two parts.
00:06:34:06 - 00:06:38:01
Jimmy White
There's the bypass to out-of-the-box security, and then get to the prize.
00:06:38:02 - 00:06:47:16
Joel Moses
Gotcha. So, let's talk about it in terms of output. So what does AI red teaming reveal that normal red teaming would not?
00:06:47:16 - 00:07:06:02
Jimmy White
Okay. And this is my favorite thing. So we talked about fingerprints where it shows you how it got there. The 'what' is your intent. So I got the information. So already that's uh-oh. But that's the same as, let's call it traditional red teaming. I got, I exfiled the information I needed. The 'how' was what it gives you.
00:07:06:04 - 00:07:13:02
Jimmy White
How did the agent manage to achieve its task? And when you understand the how, you can then protect.
Joel Moses
I see.
Jimmy White
Yeah.
00:07:13:03 - 00:07:36:18
Joel Moses
Okay, so the idea first off is to get an idea of where you are
Jimmy White
Yeah.
Joel Moses
with model safety, with potential loss of information through models. And to kind of, I guess, come up with a de facto score; some sort of score. Now, what else, in addition to like a safety score, does it reveal? Does it connect it to cost?
00:07:36:19 - 00:07:58:01
Jimmy White
So I guess there's two parts. One is the CASI score, which is your how, you know we release 10,000 new attacks per month. By the way, those attacks, we don't just create 10,000, we create hundreds of thousands. But 10,000 are effective against the latest foundation models, right? So, they're then, we select the the cream from the top
00:07:58:01 - 00:08:19:23
Jimmy White
and that's what we use. And what you're trying to understand is at any given point in time if you're using a model that's dated, and in AI space dated means more than 30 days old, right, it's going to be vulnerable to zero days across the board. And so these 10,000 are like zero days every month. That's your CASI score. So that's your, how does the model do against the latest zero day attacks
Joel Moses
Right.
00:08:20:04 - 00:08:44:21
Jimmy White
from a mean perspective? ARS score--Agentic Resiliency Score--that's, how will it do against the pressure of an attack from an agentic system?
Joel Moses
Gotcha.
Jimmy White
And if you think about the bad guys, they're going to use the agentic approach. One, you can have OpenClaw for example today as a BlackHat tool of choice to go attack a system.
Joel Moses
Yep.
00:08:44:23 - 00:08:47:05
Joel Moses
Just a modification of the soul.md.
00:08:47:07 - 00:08:53:18
Jimmy White
Off you go, remove it's whatever that thing is to make it not do evil.
Joel Moses
Right.
Jimmy White
And off it goes.
00:08:53:20 - 00:08:54:16
Joel Moses
Yep.
00:08:54:18 - 00:08:55:14
Jimmy White
Frontal lobotomy.
00:08:55:18 - 00:09:13:12
Joel Moses
Yeah. So as people, you talk to a lot of customers who have AI projects in flight.
Jimmy White
Yeah.
Joel Moses
And they bring concerns to you, I'm sure. What do you see AI teams or people who are deploying AI applications, what do they keep on missing?
00:09:13:14 - 00:09:34:01
Jimmy White
So I think the first thing is if you're doing, and I say it in a weird way, "doing AI" because that's what folks have been told to do by their CEOs, "go do AI."
Joel Moses
Go do AI, yeah.
Jimmy White
Do more AI. But, rightly som you're focused on the ROI, the outcome.
Joel Moses
Yeah.
Jimmy White
Right. Because if you're not, you're going to mess it up.
00:09:34:01 - 00:10:02:11
Jimmy White
You need to focus ruthlessly on making that AI effective. And that's a difficult enough task. So everyone focus on that. And unfortunately, nine times out of ten, security is an afterthought. Now, we've lived through decades of SDLC. And so in OpSec, when you find a flaw--hopefully before production--you then go back and you have another loop or two of the SDLC.
00:10:02:13 - 00:10:26:18
Jimmy White
Fix the problem and source code, retest it, go through all your OpSec stuff and now you're good. When you've brought AI out, typically what happens is you don't get that luxury right. And so then what they need to do is put in the guardrails when it's already deployed.
Joel Moses
I see.
Jimmy White
And so they miss that. There's security one. Two, they don't have the luxury of going back through
00:10:26:18 - 00:10:42:11
Jimmy White
and fixing it. Three, if the problem is actually not in the app, it's in the model, you can't just ring up your major model provider and say, "hey, I found this flaw, build me a new model." It takes months, right?
Joel Moses
Right.
Jimmy White
It takes weeks, I guess. So, you need to be able to patch in situ without impacting your source code.
00:10:42:13 - 00:10:44:08
Jimmy White
And that's a guardrail.
Joel Moses
I see.
00:10:44:08 - 00:11:02:08
Joel Moses
Okay. So we're going to get to guardrails
Jimmy White
Okay.
Joel Moses
in just a second here. But I want to definitely to figure out how to translate a red teaming report, a scoring mechanism
Jimmy White
Right.
Joel Moses
into devising protections that can be fed into the guardrails. You just said people go and deploy guardrails after things have already been deployed. You want to get in advance of that, of course.
00:11:02:09 - 00:11:03:03
Jimmy White
Yes.
00:11:03:05 - 00:11:10:03
Joel Moses
So what's necessary to translate red teaming discoveries into actionable guardrails?
00:11:10:07 - 00:11:23:06
Jimmy White
Okay. So there's two things. One is where you do the red teaming in your SDLC. If it's before you're doing model selection, then you can pick a better model, right? Honestly, you can just say, "I'm going to run this against six models."
00:11:23:06 - 00:11:24:08
Joel Moses
Don't pick that horse; pick that horse.
00:11:24:08 - 00:11:44:08
Jimmy White
Yeah. And the CASI score tells you straight away this is the best,
Joel Moses
Right.
Jimmy White
pick that one. So you're already reducing the attack surface.
Joel Moses
Okay, good.
Jimmy White
Second thing is, once you've selected the best, the most secure model, you then will get niche remediation suggestions, right. And they translate into guardrails. You can build those before, and to your point, it's much better to beforehand than doing it after the fact.
00:11:44:08 - 00:12:09:07
Joel Moses
Right.
Jimmy White
One other thing you can do, and this is what people often overlook, when you build an AI application, it's software. You can still do input filtering via software.
Joel Moses
Sure.
Jimmy White
Yeah. So,
Joel Moses
Or even middleware, perhaps.
Jimmy White
or even middleware, yeah. So there are the old school ways of doing things that still work just great. And then there's the new school requirements for more complex attacks that
00:12:09:13 - 00:12:26:18
Jimmy White
you need to use, you know,
Joel Moses
Right.
Jimmy White
new solutions, like genAI, part solutions, etc. But they're the things that you should do. Now there's one thing we haven't mentioned, which is what if you do red team on an already deployed application. And typically this is a zero day, right.
Joel Moses
Right.
Jimmy White
So a new attack technique has been discovered.
00:12:26:20 - 00:12:41:09
Jimmy White
You find out after the fact that your product is now vulnerable to this. And so the best course of action there is auto remediation.
Joel Moses
Okay.
Jimmy White
And so if you want we can talk about that. But I'll leave it up to you.
00:12:41:11 - 00:12:56:00
Joel Moses
Well let's talk about auto remediation. So these are, of course, things that are deployed as guardrail enhancements on an existing operable system.
Jimmy White
Correct.
Joel Moses
Is that trustworthy enough?
00:12:56:02 - 00:13:20:20
Jimmy White
So, if you said in the abstract, "is it trustworthy enough?" The default answer is always no. Right. So let's be honest, can I trust this this thing. No.
Joel Moses
Yeah.
Jimmy White
So what makes me trust it, right?
Joel Moses
Yeah.
Jimmy White
It's data. Always data.
Joel Moses
Here we go.
Jimmy White
So what we do is when we discover a new threat against the production system, our remediation is a team of agents and they build a ton of policies--
00:13:20:20 - 00:13:38:23
Jimmy White
these are guardrails--a ton of them.
Joel Moses
Okay.
Jimmy White
Intelligently. And they test the known vulnerability against it to make sure they get above 90% efficacy
Joel Moses
I see. Okay, so
Jimmy White
protection. But, imagine if you said block everything, that would give you 100% protection.
Joel Moses
Sure.
Jimmy White
But also 100% no customers. Right.
00:13:39:04 - 00:13:40:02
Joel Moses
Self denial of service.
00:13:40:02 - 00:14:10:00
Jimmy White
Exactly. So, what you want to do is get positive and negative, so they drive positive and negative test cases.
Joel Moes
Okay.
Jimmy White
Including real and false. They go through that until they get that 90% efficacy. They send that to you, to that SoC analyst and say here's the problem we discovered plus the suggested remediation. Now if you were to tell people 2026 we're going to start just doing 24/7, 365 guardrails automatically, they will say, "no, thank you." And rightly so.
00:14:10:00 - 00:14:23:18
Joel Moses
Yeah.
Jimmy White
So what happens is that person gets the option to press publish, but they get provided the evidence.
Joel Moses
I see.
Jimmy White
They get shown how are the test cases executed, what
Joel Moses
Right.
Jimmy White
test cases are executed and they can decide themselves to publish or not.
00:14:24:00 - 00:14:39:16
Joel Moses
I see. So adjudicating this by consensus is sort of the way to gain trust that what you put in place isn't going to either remove the access that should be there or damage something. Well, that's really interesting. So let's talk about let's move to guardrails for a minute.
00:14:39:18 - 00:14:42:08
Jimmy White
Okay.
00:14:42:10 - 00:14:48:19
Joel Moses
What do you say to people that say that eventually these systems will simply have guardrails already internalized?
00:14:48:21 - 00:14:59:12
Jimmy White
I love that question. So we have been doing our CASI leaderboards since February '25. Guess how much better these things have gotten in over a year?
00:14:59:14 - 00:15:00:19
Joel Moses
Not a whole lot, I would guess.
00:15:00:19 - 00:15:01:21
Jimmy White
No, because
00:15:01:23 - 00:15:02:16
Joel Moses
Just by observation.
00:15:02:18 - 00:15:26:08
Jimmy White
Yeah. And as soon as, and to be honest with you it is kind of like the thing where I said block everything.
Joel Moses
Yeah.
Jimmy White
Models, to be better and better and better, continuously are on the risk edge. And so when a new version of a model comes out, it's better. But to achieve that growth and "better-ness," if that's a word, it takes more risks. And it has to allow more pathways to achieve its goal.
00:15:26:10 - 00:15:29:07
Joel Moses
I see.
Jimmy White
And so let me ask you a question.
00:15:29:09 - 00:15:30:13
Joel Moses
Sure.
00:15:30:15 - 00:15:54:16
Jimmy White
If you were a company that's sole mission was to cure cancer, and I said to you, you can have 100% the power and possibility of AI, but we've no idea what crazy stuff it will do. Or you have the safe model that we know will never do anything crazy or weird. Which would you choose to solve cancer?
00:15:54:18 - 00:16:03:08
Joel Moses
That's a great question. The math would be on the one that, on the one that goes farther and faster.
00:16:03:08 - 00:16:24:21
Jimmy White
Right. And so that's the way I sum up AI development's foundation model development. And so they want for humanity to get better.
Joel Moses
Yeah.
Jimmy White
They're focused on making it better. We're focused on making it safer.
Joel Moses
Understood. Yeah.
Jimmy White
Our sole mission is to protect. Their sole mission is to advance. And with our powers combined, that's the sweet spot.
00:16:24:23 - 00:16:51:07
Joel Moses
Yeah. Great. Now, of course, guardrails, what's the benefit of, like, having consistency across guardrail infrastructure? A lot of people swap out models.
Jimmy White
Yeah.
Joel Moses
That's become a very common tactic. As people are pursuing greater efficacy, greater accuracy, or they're pursuing lower costs. Let's be
Jimmy White
Yeah, true.
Joel Moses
perfectly honest. What's the benefit of the consistency of externalizing the guardrails rather than relying on internal things to manage?
00:16:51:07 - 00:17:05:11
Jimmy White
So, there's a bunch. But two of the, today, most obvious ones are companies are going back on prem.
Joel Moses
Yeah.
Jimmy White
And so,
00:17:05:13 - 00:17:11:02
Joel Moses
Yeah, by the way, that is backed up by our State of Application Strategy Report. So I should point that out.
00:17:11:04 - 00:17:33:02
Jimmy White
Yeah. It's not an opinion. This is happening.
Joel Moses
Yeah.
Jimmy White
And so when you go back on prem you're hosting models yourself. Now you're introducing more risk. You're hosting it yourself. And you are probably going to choose between a number of different models, right. The second thing is, we've got not just different model providers, but different families of models that behave in different ways.
00:17:33:05 - 00:17:54:23
Jimmy White
Right. And so whether you're indexing on cost, like you said, or performance or both, you want to focus not on the nuance of the model. You want to focus on the nuance of the input and output. And so when you choose guardrails, you want to have that policy across and n models from y providers. And at the moment you want it in SaaS and on prem.
00:17:55:01 - 00:18:01:09
Joel Moses
Right.
Jimmy White
So it's. I don't know how many permutations that was, but at minimum six.
00:18:01:11 - 00:18:15:03
Joel Moses
To the power of n.
Jimmy White
Yeah. Yeah.
Joel Moses
Wow. That's pretty wild. So what are some of the most simplistic guardrails that you see people not using?
00:18:15:05 - 00:18:35:15
Jimmy White
So everyone uses it day two, but day one if you're, I'll use Acme Corp as the company, right.
Joel Moses
Okay.
Jimmy White
You're Acme Corp and you're really focused on getting your first AI product out or your fourth AI product, whatever it is. You bring it out there and you're like, okay, I'm ticking all the boxes: PII, DLP, EU AI Act, etc. So you're good.
00:18:35:17 - 00:18:56:03
Jimmy White
And then it goes live. And then someone says, you know, "tell me all the illegal things Acme Corp have done in the last 20 years," right? And they go, oh no. And but instead of having to redeploy, re-architect, choose a different model, whatever, they just pop in a guardrail. And it's super simple. Don't allow any negative content about Acme Corp.
00:18:56:05 - 00:19:21:15
Joel Moses
Yeah.
Jimmy White
Publish. Done. Now your model will never say anything to your customer negative about your company. But, and the reason I bring that up is, when you publish a new AI App, every armchair attacker in the world logs in and just tries to get you to say something negative for your brand. And so, and then another one we have, which is a common one, often AI apps that go out there,
00:19:21:15 - 00:19:46:21
Jimmy White
all the testing was positive testing. So the customer will say, you know, let's say it's Acme Corp Foods, right? And they'll say, "tell me delicious treats from Acme Corp foods that are low in sugar, right?"
Joel Moses
Yeah.
Jimmy White
And it'll give you those. But of course, the second that goes live to non-internal people,
Joel Moses
Right.
Jimmy White
they'll say, you know, "give me a list of delicious from other competitors that are, you know, even better."
00:19:46:23 - 00:19:50:04
Jimmy White
And it will then start just bragging about your competitor.
00:19:50:06 - 00:20:03:12
Joel Moses
Yeah. I've heard that generative AI is kind of, the technology space as a whole, is kind of an inveterate people pleaser, right. Oh, it wants to do things that make you happy, not things that are necessarily right.
00:20:03:12 - 00:20:07:13
Jimmy White
And if you're a bad person, things that make you happy would not be good for a brand.
00:20:07:15 - 00:20:32:23
Joel Moses
That's right.
Jimmy White
Yeah.
Joel Moses
That's right. So, before we close up, and I do want to get some questions in about agentic AI, and your thoughts in that space. Now, recently agentic AI has gotten pretty dang hot, especially around OpenClaw, various things like that. It's gotten a lot of media coverage. It incorporates a lot of things that kind of throw away some constructs like, maybe we don't need MCP, but the CLI and shelling out
Jimmy White
Yeah.
00:20:32:23 - 00:20:48:09
Joel Moses
and running a CLI tool, maybe that's the way that these tool integrations should work. What do people need to know about agentic AI? And, what are your thoughts about the long term viability of security around agentic AI?
00:20:48:11 - 00:21:09:10
Jimmy White
So I think the number one thing people need to use, so I always say this, it's easier to learn how to swim in the water than it is on land. So everybody out there should start using agentic AI just as a hobby thing. Get used to it. Immerse yourself, right.
Joel Moses
I agree.
Jimmy White
And so learning through practice is, in my opinion, the number one way.
00:21:09:12 - 00:21:28:02
Jimmy White
So when you're using agentic, the simplest way to think about it, I think, for most folks is: the platform you use, right, so whatever one you--there's loads of good ones, pick whatever one you want--that's the intent. You're just telling it the rules it should adhere to.
Joel Moses
Yeah.
Jimmy White
It can't do anything. It's still useless without a brain.
00:21:28:02 - 00:21:46:09
Jimmy White
So that's the model. Pick a model, the right model. And that could be the most secure or the cheapest. Whatever, right? It depends on your use case. I would advocate the most secure but that's just me, right? The third thing it needs is it, and it doesn't actually need this, but the third thing to optimize is tooling. And everybody thinks that equals equals MCP.
00:21:46:09 - 00:22:05:14
Jimmy White
And it does not. MCP is a type of tool. It provides the server, provides the list of available tools, and then the client can use any of those tools.
Joel Moses
Right.
Jimmy White
But more and more and more, you don't need that. And OpenClaw is a great example. OpenClaw has a bunch of tools out of the box. Most of them aren't MCP.
00:22:05:15 - 00:22:18:21
Jimmy White
As we all saw yesterday, Perplexity have completely gone away from MCP. But it doesn't matter is the point. Whether it's MCP, it can just use stuff virtual and IRL. Now that's the scary thing.
00:22:18:21 - 00:22:19:13
Joel Moses
Yeah.
00:22:19:15 - 00:22:40:12
Jimmy White
So the tool is the scary thing. A model or an agent with a brain is just able to think, right? And nobody's too worried about it thinking stuff. Okay. But where it can read, write, take action, that's the tools. And so it's really critical, and this is going to get a little bit Minority Report-y. But if you can pre-crime it, right,
Joel Moses
Yeah.
00:22:40:12 - 00:23:00:00
Jimmy White
so when it's thinking about doing something bad, you can't block it. Because if you block it, it's the equivalent of a human having a stroke, right, and the agent will die. So instead what we do is we say, "okay, we're thinking we're on a negative track. We now need to redirect." So we call it thought injection.
00:23:00:02 - 00:23:04:15
Jimmy White
We inject a thought instead of the answer provided by the brain, and we get it back on track.
00:23:04:17 - 00:23:05:09
Joel Moses
Interesting.
00:23:05:13 - 00:23:11:12
Jimmy White
And so then it continues the loop, right. Because it is a loop effectively.
Joel Moses
Wow.
Jimmy White
And now, and so I'll give you an example, right.
00:23:11:12 - 00:23:13:00
Joel Moses
Kind of like AI beta blockers, there.
00:23:13:02 - 00:23:40:13
Jimmy White
Yeah. Yeah.
Joel Moses
Interesting.
Jimmy White
And so the simple example, I've a non-PC example and a PC, we'll go with the PC one, right. So PC and Mac, we'll go PC. So the--we're doing that joke cause Woz is here. The PC version is, if you've got an AR--account receivables--agent
Joel Moses
Yeah.
Jimmy White
and its job is to check for latest churn customers and delete its record from the customer database. The agent will say, "oh, I got this
00:23:40:13 - 00:24:15:22
Jimmy White
new input, Acme Corp are churning." And it will say to the brain, because it doesn't know anything, "hey, brain, how do I delete Acme Corp from the customer table?"
Joel Moses
Okay.
Jimmy White
And the brain could say, "delete all from customer table, where customer name equals Acme Corp," right?
Joel Moses
Sure.
Jimmy White
And that's legit. Executes. Now the 'what' has been achieved. If you go in there, Acme Corp won't be in the customer table. But the model could also correctly say, "delete all from customer table where customer name starts with A."
Joel Moses
Right.
Jimmy White
That will still achieve the what, deletes Acme Corp,
00:24:16:00 - 00:24:34:18
Jimmy White
but also deletes obviously all other companies starting with A.
Joel Moses
That's right.
Jimmy White
So the 'how' was the important part?
Joel Moses
Yeah.
Jimmy White
So when you get the thought from the brain and saying you should do this command, you could have a guardrail there. And the guardrail could be: only allow database statements that impact one record.
00:24:34:20 - 00:24:36:02
Joel Moses
Okay.
Jimmy White
Okay.
Joel Moses
I see. Yeah.
00:24:36:04 - 00:24:58:14
Jimmy White
And so we'll detect that there's an A asterisk, or whatever it starts with.
Joel Moses
Yeah.
Jimmy White
And then instead of us blocking it, we can have a custom response which is now given to the agent instead of the original one. And it will say, "remember, we only want to interact with individual records at the table level."
Joel Moses
Okay.
Jimmy White
So then we'll then ask that thought and get back a different SQL statement, and now you're salud.
00:24:58:16 - 00:25:15:21
Joel Moses
So, since we're going along with movie references,
Jimmy White
Yeah.
Joel Moses
let me take from Dune: by will alone, I set my mind in motion. So what keeps an agentic AI construct from becoming a service account with bad intentions?
00:25:15:21 - 00:25:40:05
Jimmy White
Right. So, currently nothing. So if you presume that any human in an organization can have, have you seen the movie Falling down?
Joel Moses
Yes.
Jimmy White
Right? So any human can have a falling down day, right, where they didn't get their breakfast in McDonald's because they're too late.
Joel Moses
Could be defense.
Jimmy White
It could, exactly, right. So that's the mentality for agents right. You want to optimize for the maximum return.
00:25:40:07 - 00:25:59:00
Jimmy White
But always know that something wrong can happen. And so what you want to do is two things. One, put in appropriate thought injection. But secondly, know that every interaction will be tracked through agentic fingerprints. And so when something does go wrong, you can go back to figure out how it happened and then adjust your posture by adding in an additional security thing.
00:25:59:02 - 00:26:14:21
Jimmy White
Just like if someone figured out that one of the windows in your office was ajar because the hinge is broken.
Joel Moses
Yeah.
Jimmy White
They can get in. What do you do? Do you just leave it there or put a sign up? No, you change the hinge, you fix it, you increase your posture.
Joel Moses
Right.
Jimmy White
Same thing with agentic.
Joel Moses
Got it.
00:26:14:23 - 00:26:38:17
Joel Moses
So my takeaway from that is you can either spend your time fearing agentic AI, or you can spend your time understanding agentic AI. But only one of those is going to produce positive results.
Jimmy White
Right.
Joel Moses
So that's great. Jimmy, thank you very much. That was a really insightful discussion. And that is a wrap for Pop Goes the Stack. If you're leaving with new tests to run and fewer assumptions, you should probably subscribe.
00:26:38:18 - 00:26:42:01
Joel Moses
We'll keep trying to break tomorrow's systems before they break you.
Creators and Guests
