Model routing isn’t load balancing (And that’s why you’re not ready)

Multi-model AI isn’t a buzzword anymore, it’s how organizations are actually operating. In this episode of Pop Goes the Stack, Lori MacVittie and Joel Moses dig into fresh findings from F5's State of Application Strategy Report showing companies run an average of seven models, and more than half are already orchestrating multiple models together. That’s a big shift, and it changes what “infrastructure readiness” even means.
 
Why do teams chain models in the first place? The answer: cost, capability, and risk. The uncomfortable part? Most infrastructure is still built for deterministic systems, and AI routing is not the same problem as load balancing. Model routing isn’t about spreading traffic evenly. It’s about making a decision on every request: which model is best for this job, what will it cost, what’s the risk, and what’s the fallback when the answer is wrong or low quality.
 
Joel frames it as a category change, from “where should this request go?” to “what should happen as a result of this request?” That shift forces new requirements: policy enforcement across models, identity-aware access, decision justification, and mechanisms to recover when output quality degrades due to drift, configuration changes, or poisoned inputs like compromised RAG data. Lori ties it back to governance, not just availability, and why “AI workloads” expose gaps that traditional tooling can’t cover.
 
While many organizations are operationalizing AI, that doesn’t mean it’s manageable yet. If you want to know how to move forward from here, this is an episode you don't want to miss.

Get your copy of the 2026 State of Applications Strategy Report

Creators and Guests

Joel Moses
Host
Joel Moses
Distinguished Engineer and VP, Strategic Engineer at F5, Joel has over 30 years of industry experience in cybersecurity and networking fields. He holds several US patents related to encryption technique.
Lori MacVittie
Host
Lori MacVittie
Distinguished Engineer and Chief Evangelist at F5, Lori has more than 25 years of industry experience spanning application development, IT architecture, and network and systems' operation. She co-authored the CADD profile for ANSI NCITS 320-1998 and is a prolific author with books spanning security, cloud, and enterprise architecture.
Tabitha R.R. Powell
Producer
Tabitha R.R. Powell
Technical Thought Leadership Evangelist producing content that makes complex ideas clear and engaging.
Model routing isn’t load balancing (And that’s why you’re not ready)
Broadcast by