5.1 — Multi-Agent WhatsApp Teams — Routing & Escalation
A single WhatsApp chatbot works well for simple, linear interactions — ordering food, answering FAQs, capturing basic lead info. But real businesses have complex interactions: a DHA property inquiry needs an agent who knows Karachi Phase 6 pricing, not one who handles Lahore. A hospital's WhatsApp line gets both appointment bookings and medical emergency questions — a bot handling appointments should never attempt to answer a medical query. Multi-Agent WhatsApp systems solve this through intelligent routing: the right conversation gets to the right handler, whether that's a specialized bot, a department, or a specific human agent.
Section 1: The Multi-Agent Architecture
A Multi-Agent WhatsApp system has three layers:
Layer 1 — The Router Bot: The first point of contact. Every incoming message hits the Router first. It identifies: what department does this belong to, what is the urgency level, and does this need a bot or a human? The Router should NOT answer content questions — it only classifies and routes.
Layer 2 — Specialized Agents (Bots or Humans): Each department/function has its own agent:
- Sales Bot: product info, pricing, lead capture
- Support Bot: order status, delivery issues, first-line intake for complaints and refunds before human handoff
- Booking Bot: appointments, scheduling, availability
- Human Agents: complex queries, complaints, high-value leads
Layer 3 — Escalation Handler: When any agent cannot resolve an issue (3 failed attempts, explicit customer request, keyword triggers like "complaint," "refund," "manager"), the Escalation Handler takes over, assigns a ticket, and routes to the appropriate human.
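A minimal sketch of how the three layers could fit together in code. Everything here is hypothetical scaffolding: the `Ticket` class, `handle_message`, the agent replies, and the keyword-based `route` function, which stands in for a real intent classifier. It only illustrates the separation of duties: the router picks a handler, handlers answer, and the escalation layer opens a ticket when a handler gives up.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Callable, Optional

# Hypothetical keyword triggers; a real router would use an intent classifier.
ESCALATION_KEYWORDS = ("complaint", "refund", "manager")
MAX_FAILED_ATTEMPTS = 3

_ticket_ids = count(1)


@dataclass
class Ticket:
    wa_number: str
    reason: str
    ticket_id: int = field(default_factory=lambda: next(_ticket_ids))


def route(message: str) -> str:
    """Layer 1: classify only. The router never answers the customer."""
    text = message.lower()
    if any(word in text for word in ESCALATION_KEYWORDS):
        return "human"
    if "price" in text:
        return "sales"
    if "order" in text:
        return "support"
    if "appointment" in text or "book" in text:
        return "booking"
    return "general"


# Layer 2: each agent returns a reply, or None when it cannot resolve the query.
Agent = Callable[[str], Optional[str]]

AGENTS: dict[str, Agent] = {
    "sales": lambda msg: "Here is our price list.",
    "support": lambda msg: "Let me check your order status.",
    "booking": lambda msg: "Which day suits you?",
    "general": lambda msg: None,  # placeholder bot that cannot answer anything
}


def escalate(wa_number: str, reason: str) -> Ticket:
    """Layer 3: open a ticket so a human can pick the conversation up."""
    return Ticket(wa_number=wa_number, reason=reason)


def handle_message(wa_number: str, message: str, failed_attempts: int = 0):
    """Return a bot reply (str), or a Ticket when the conversation is escalated."""
    department = route(message)
    if department == "human":
        return escalate(wa_number, "keyword trigger")
    reply = AGENTS[department](message)
    if reply is None and failed_attempts + 1 >= MAX_FAILED_ATTEMPTS:
        return escalate(wa_number, "bot could not resolve after 3 attempts")
    return reply or "Sorry, could you rephrase that?"
```

Keeping `route` free of any reply text is the design point: swapping the keyword rules for an AI classifier later changes one function and nothing else.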
Section 2: Building the Routing Logic
Step 1 — Intent Classification: Use a lightweight AI call to classify incoming messages. This runs on every message before routing:
import json
import os

from google import genai


def classify_intent(message: str) -> dict:
    """Classify WhatsApp message intent using Gemini Flash."""
    client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
    prompt = f"""Classify this WhatsApp message from a Pakistani customer.
Return JSON only:
{{
  "intent": "sales|support|booking|complaint|general|emergency",
  "urgency": "high|medium|low",
  "language": "english|urdu|mixed",
  "escalate_to_human": true|false,
  "reason": "one sentence"
}}
Message: "{message}"
Pakistani business context: This is a WhatsApp line for a multi-specialty hospital in Karachi.
"""
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=prompt,
        # Ask for raw JSON so the reply is not wrapped in a markdown code fence
        config={"response_mime_type": "application/json"},
    )
    return json.loads(response.text)
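Model output is not guaranteed to be valid JSON or to stay within the allowed values, so it can help to validate the reply before routing on it. The following is a sketch under assumptions, not part of the lesson's code: `parse_classification`, `ALLOWED`, and `SAFE_FALLBACK` are hypothetical names, and the choice to hand unparseable results to a human is one possible fail-safe policy.

```python
import json

FENCE = "`" * 3  # a markdown code fence, which models sometimes wrap JSON in

ALLOWED = {
    "intent": {"sales", "support", "booking", "complaint", "general", "emergency"},
    "urgency": {"high", "medium", "low"},
    "language": {"english", "urdu", "mixed"},
}

# Assumption: when the output cannot be trusted, hand the chat to a human.
SAFE_FALLBACK = {
    "intent": "general",
    "urgency": "medium",
    "language": "mixed",
    "escalate_to_human": True,
    "reason": "classifier output could not be parsed",
}


def parse_classification(raw_text: str) -> dict:
    """Parse and validate the classifier's JSON reply, failing safe to a human."""
    text = raw_text.strip()
    # Remove a surrounding code fence (optionally labelled "json") if present.
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return dict(SAFE_FALLBACK)
    if not isinstance(data, dict):
        return dict(SAFE_FALLBACK)
    # Every enumerated field must be a string from its allowed set.
    for key, allowed_values in ALLOWED.items():
        value = data.get(key)
        if not isinstance(value, str) or value not in allowed_values:
            return dict(SAFE_FALLBACK)
    if not isinstance(data.get("escalate_to_human"), bool):
        return dict(SAFE_FALLBACK)
    data.setdefault("reason", "")
    return data
```

With this in place, the classifier could return `parse_classification(response.text)` instead of calling `json.loads` directly, so a malformed reply becomes a human handoff rather than an exception.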
Step 2 — WATI Department Routing: WATI supports agent assignment and inbox routing. When your classification engine returns an intent, route accordingly via WATI's API:
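import os

import httpx

# Assumed environment variable names for your WATI credentials
WATI_BASE = os.getenv("WATI_BASE")    # WATI API base URL
WATI_TOKEN = os.getenv("WATI_TOKEN")  # WATI API access token

# send_emergency_alert() and the activate_*() flow helpers are defined elsewhere in your project


async def route_to_department(wa_number: str, intent: dict):
    wati_headers = {"Authorization": f"Bearer {WATI_TOKEN}"}
    # Complaints go to a human too, matching the escalation rules in this lesson
    if intent["escalate_to_human"] or intent["intent"] in ("emergency", "complaint"):
        # Assign to human agent immediately.
        # httpx.post() is synchronous, so async code needs httpx.AsyncClient.
        # Check the endpoint path and payload against WATI's current API reference.
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{WATI_BASE}/api/v1/assignConversation",
                headers=wati_headers,
                json={"whatsappNumber": wa_number, "assignedTo": "on_call_agent"},
            )
            response.raise_for_status()
        # Send priority message to team lead
        await send_emergency_alert(wa_number, intent)
    elif intent["intent"] == "sales":
        await activate_sales_bot_flow(wa_number)
    elif intent["intent"] == "support":
        await activate_support_bot_flow(wa_number)
    elif intent["intent"] == "booking":
        await activate_booking_bot_flow(wa_number)
    else:
        await activate_general_faq_bot(wa_number)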
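As departments are added, the branch chain above can also be written as a lookup table. The sketch below is one possible shape, not WATI functionality: the flow functions are stand-ins that only record which handler was called, `dispatch` and `HUMAN_ONLY_INTENTS` are hypothetical names, and treating "complaint" as a human-only intent is an assumption drawn from this lesson's escalation rules. A missing `escalate_to_human` field is treated as true, so an incomplete classification fails safe to a person.

```python
import asyncio
from typing import Awaitable, Callable

# Stand-ins for the lesson's flow functions; here they only record the call.
calls: list[tuple[str, str]] = []


def _make_flow(name: str) -> Callable[[str], Awaitable[None]]:
    async def flow(wa_number: str) -> None:
        calls.append((name, wa_number))
    return flow


# Adding a department becomes a one-line change to this table.
FLOWS = {
    "sales": _make_flow("sales"),
    "support": _make_flow("support"),
    "booking": _make_flow("booking"),
}
general_faq_flow = _make_flow("general")
assign_to_human = _make_flow("human")

HUMAN_ONLY_INTENTS = {"emergency", "complaint"}  # assumption, see lead-in


async def dispatch(wa_number: str, intent: dict) -> str:
    """Route via a lookup table and return the name of the handler used."""
    name = intent.get("intent")
    # .get() avoids a KeyError when the classifier omitted a field.
    if intent.get("escalate_to_human", True) or name in HUMAN_ONLY_INTENTS:
        await assign_to_human(wa_number)
        return "human"
    flow = FLOWS.get(name)
    if flow is None:  # unknown or "general" intents fall back to the FAQ bot
        await general_faq_flow(wa_number)
        return "general"
    await flow(wa_number)
    return name


if __name__ == "__main__":
    result = asyncio.run(
        dispatch("+923001234567", {"intent": "sales", "escalate_to_human": False})
    )
    print(result)
```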
Step 3 — Escalation Triggers: Build escalation triggers at every bot level. If a bot detects:
- Customer has asked the same question 3+ times with no resolution
- Customer has explicitly said "manager," "complaint," "refund," "not working," "very upset," "koi kaam nahi"
- The conversation has been going for 15+ minutes without completion
- Customer sends an all-caps message (frustration signal)
When any of these triggers fires, immediately: send a handoff message ("Connecting you to our team, just a moment"), assign the conversation to an available human agent in WATI, and notify that agent via WhatsApp with a conversation summary.
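The four triggers listed above can be checked with a small pure function that runs before each bot reply. This is a sketch under assumptions: the `Conversation` class and `escalation_reason` are hypothetical, "same question" is approximated by exact matches after normalising case and whitespace, and the 8-letter minimum for the all-caps check is an arbitrary threshold to avoid flagging short replies such as "OK".

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

ESCALATION_PHRASES = (
    "manager", "complaint", "refund", "not working", "very upset", "koi kaam nahi",
)
MAX_REPEATS = 3
MAX_DURATION = timedelta(minutes=15)
MIN_LETTERS_FOR_CAPS = 8  # assumption: ignore short all-caps replies like "OK"


@dataclass
class Conversation:
    started_at: datetime
    customer_messages: list[str] = field(default_factory=list)


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical messages match."""
    return " ".join(text.lower().split())


def is_shouting(text: str) -> bool:
    letters = [ch for ch in text if ch.isalpha()]
    return len(letters) >= MIN_LETTERS_FOR_CAPS and all(ch.isupper() for ch in letters)


def escalation_reason(convo: Conversation, now: datetime) -> Optional[str]:
    """Return the first trigger that fires, or None if the bot may continue."""
    if not convo.customer_messages:
        return None
    latest = convo.customer_messages[-1]
    normalized = [_normalize(m) for m in convo.customer_messages]
    if normalized.count(_normalize(latest)) >= MAX_REPEATS:
        return "repeated question"
    if any(phrase in _normalize(latest) for phrase in ESCALATION_PHRASES):
        return "escalation keyword"
    if now - convo.started_at >= MAX_DURATION:
        return "conversation too long"
    if is_shouting(latest):
        return "all-caps message"
    return None
```

Returning the reason rather than a bare boolean gives the human agent context in the handoff summary.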
Section 3: Pakistan-Specific Routing Considerations
Language Detection: Pakistani customers mix English, Urdu, and Roman Urdu freely. Your routing must handle:
- "I want to check my order status" (English — route to support bot)
- "mera order kahan hai" (Roman Urdu — route to same support bot, but activate Urdu template)
- "Complaint hai mujhe" (Roman Urdu complaint signal — escalate to human)
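For cases where a cheap pre-check is wanted before the AI call (for example, to pick a reply template), a heuristic along these lines may be enough. It is a rough sketch, not a real language detector: `detect_language` is a hypothetical helper, the marker word list is deliberately tiny and would need to be extended from real chat logs, and Urdu script is recognised only by the basic Arabic Unicode block.

```python
import re

# Hypothetical, deliberately small marker list of common Roman Urdu words.
ROMAN_URDU_MARKERS = {
    "mera", "meri", "mujhe", "hai", "hain", "kahan", "kya", "nahi",
    "chahiye", "karo", "kab", "hogi", "milega", "baat",
}

# Urdu is written in the Arabic script (Unicode block U+0600 to U+06FF).
URDU_SCRIPT = re.compile(r"[\u0600-\u06FF]")


def detect_language(message: str) -> str:
    """Return 'urdu', 'roman_urdu', or 'english' using cheap heuristics."""
    if URDU_SCRIPT.search(message):
        return "urdu"
    words = re.findall(r"[a-z]+", message.lower())
    if any(word in ROMAN_URDU_MARKERS for word in words):
        return "roman_urdu"
    return "english"


if __name__ == "__main__":
    for sample in (
        "I want to check my order status",
        "mera order kahan hai",
        "Complaint hai mujhe",
    ):
        print(sample, "->", detect_language(sample))
```

Note that language and intent are separate decisions: "mera order kahan hai" still routes to the support bot, and only the template changes.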
Business Hours Routing: Pakistani businesses typically operate 10 AM–8 PM Pakistan Standard Time (PKT, UTC+5). Build time-based routing:
- Business hours: Route to bots first, escalate to humans if needed
- After hours: Route all to bots, flag high-priority issues for morning follow-up
- Eid/Ramadan: Adjust greeting templates, notify customers of reduced hours
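The time-based rules above could be sketched as follows. The function names and the returned dictionary shape are hypothetical, the 10 AM to 8 PM window comes from this lesson, and the fixed UTC+5 offset relies on Pakistan not currently observing daylight saving time. Timestamps passed in are assumed to be timezone-aware.

```python
from datetime import datetime, time, timedelta, timezone

# Pakistan Standard Time: fixed UTC+5 offset (no daylight saving at present).
PKT = timezone(timedelta(hours=5), name="PKT")
OPEN, CLOSE = time(10, 0), time(20, 0)  # 10 AM to 8 PM, as in the lesson


def is_business_hours(moment: datetime) -> bool:
    """True if a timezone-aware timestamp falls inside opening hours in PKT."""
    local = moment.astimezone(PKT)
    return OPEN <= local.time() < CLOSE


def route_by_time(moment: datetime, urgency: str) -> dict:
    """Bots answer first at all hours; humans are only reachable when open."""
    if is_business_hours(moment):
        return {"handler": "bot", "human_escalation": True, "flag_for_morning": False}
    # After hours: bot only, and high-urgency chats are queued for the morning.
    return {
        "handler": "bot",
        "human_escalation": False,
        "flag_for_morning": urgency == "high",
    }


if __name__ == "__main__":
    late_evening = datetime(2025, 1, 6, 16, 0, tzinfo=timezone.utc)  # 9 PM in PKT
    print(route_by_time(late_evening, "high"))
```

Reduced Eid or Ramadan hours could be handled by swapping `OPEN` and `CLOSE` for a per-date lookup; that part is left out here.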
Section 4: Agent Routing Decision Table
MULTI-AGENT ROUTING MATRIX (Pakistan Business Context)
| Message type | Route to | Urgency |
|---|---|---|
| "Price for product X" | Sales Bot | Low |
| "My order is late" | Support Bot | Medium |
| "Book appointment" | Booking Bot | Low |
| "Refund chahiye" ("I need a refund") | Human Agent | High |
| "Manager se baat karo" ("Let me talk to the manager") | Human Agent | High |
| "Chest pain" (hospital) | Emergency + Human | Critical |
| "Same-day delivery?" | Sales Bot | Low |
| "Wrong item received" | Support Bot | Medium |
| "Kal delivery hogi?" ("Will it be delivered tomorrow?") | Support Bot | Low |
| "Discount milega?" ("Will I get a discount?") | Sales Bot | Low |
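One way to keep a router honest is to turn the matrix into regression cases and score any routing function against them. The sketch below assumes route labels of `sales`, `support`, `booking`, and `human`; `evaluate_router` and `always_sales` are hypothetical names, and the deliberately naive `always_sales` router exists only to show what a failing score looks like.

```python
from typing import Callable

# (message, expected route) pairs taken from the routing matrix.
ROUTING_CASES = [
    ("Price for product X", "sales"),
    ("My order is late", "support"),
    ("Book appointment", "booking"),
    ("Refund chahiye", "human"),
    ("Manager se baat karo", "human"),
    ("Chest pain", "human"),
    ("Same-day delivery?", "sales"),
    ("Wrong item received", "support"),
    ("Kal delivery hogi?", "support"),
    ("Discount milega?", "sales"),
]


def evaluate_router(router: Callable[[str], str], cases=ROUTING_CASES):
    """Return (accuracy, mismatches) for a function mapping message -> route."""
    mismatches = []
    for message, expected in cases:
        actual = router(message)
        if actual != expected:
            mismatches.append((message, expected, actual))
    accuracy = (len(cases) - len(mismatches)) / len(cases)
    return accuracy, mismatches


def always_sales(message: str) -> str:
    """A deliberately bad router, used only to demonstrate the report."""
    return "sales"


if __name__ == "__main__":
    accuracy, mismatches = evaluate_router(always_sales)
    print(f"accuracy: {accuracy:.0%}")
    for message, expected, actual in mismatches:
        print(f"  {message!r}: expected {expected}, got {actual}")
```

Re-running the same cases after every prompt change shows whether a fix for one message broke the routing of another.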
Practice Lab
Exercise 1: Design the routing logic for a Pakistani e-commerce business (selling clothing on Daraz/Shopify). Create a classification table: list 10 types of incoming messages and how each should be routed (which bot, which human department, what urgency level). This is your routing specification — the blueprint before you build in WATI.
Exercise 2: Build the intent classification function using Gemini Flash. Test it with 10 real WhatsApp message examples (write them yourself or use sample messages). Evaluate: does the intent classification match what a human agent would do? Fix any prompts where classification is incorrect.
Exercise 3: Configure WATI's agent routing for at least 2 departments (e.g., Sales and Support). Create a test conversation for each. Verify that a "price inquiry" message correctly flows to the Sales bot and a "my order is late" message correctly routes to the Support bot. This end-to-end test confirms your routing system works before going live.
Pakistan Case Study: Karachi Multi-Brand Agency — 500 Chats Per Day
A Karachi digital agency managing WhatsApp for 3 clients (clothing brand, restaurant, skincare clinic) was running the 3 accounts on separate WATI instances with separate teams. When one client had a surge, its team was overwhelmed while the other teams sat under-utilized.
They rebuilt with a unified multi-agent router:
Architecture:
All 3 client numbers → WATI Shared Inbox
│
Router Bot classifies:
- Which client?
- What intent?
│
┌──────────────────┼──────────────────┐
▼ ▼ ▼
Clothing Bot Restaurant Bot Skincare Bot
(sales/catalog) (orders/menu) (bookings/FAQs)
│ │ │
└──────────────────┴──────────────────┘
│
Escalation Pool:
3 human agents shared
across all 3 clients
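The diagram's two decisions (which client, then bot or human) and the shared pool could be sketched like this. The case study does not say how the router identified the client or how agents were picked, so both choices here are assumptions: the client is looked up from the business number that received the message, and escalations go to the agent with the fewest open chats. The phone numbers, agent names, and function names are placeholders.

```python
from dataclasses import dataclass

# Placeholder business numbers; each client keeps its own WhatsApp number.
CLIENT_BY_NUMBER = {
    "+920000000001": "clothing",
    "+920000000002": "restaurant",
    "+920000000003": "skincare",
}


@dataclass
class HumanAgent:
    name: str
    open_chats: int = 0


class EscalationPool:
    """Shared pool: escalations from any client go to the least-busy agent."""

    def __init__(self, agents):
        self.agents = list(agents)

    def assign(self) -> HumanAgent:
        # min() returns the first agent on ties, so assignment is deterministic.
        agent = min(self.agents, key=lambda a: a.open_chats)
        agent.open_chats += 1
        return agent


def route_incoming(to_number: str, needs_human: bool, pool: EscalationPool) -> str:
    """Pick the client's bot, or a shared human agent when escalation is needed."""
    client = CLIENT_BY_NUMBER.get(to_number)
    if client is None:
        raise ValueError(f"unknown business number: {to_number}")
    if needs_human:
        return f"human:{pool.assign().name}"
    return f"{client}_bot"


if __name__ == "__main__":
    pool = EscalationPool([HumanAgent("A"), HumanAgent("B"), HumanAgent("C")])
    print(route_incoming("+920000000002", False, pool))
    print(route_incoming("+920000000003", True, pool))
```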
Results after 60 days:
| Metric | Before | After |
|---|---|---|
| Daily chats handled | 200 | 500 |
| Human agents needed | 9 (3 per client) | 3 (shared pool) |
| Average response time | 8 min | 45 sec (bot) |
| Escalation rate | 100% (all human) | 18% (only complex) |
| Monthly staff cost | PKR 225,000 | PKR 75,000 |
| Client satisfaction | Mixed | High |
The multi-agent router cut staffing costs by 67% (PKR 225,000 to PKR 75,000 per month) while raising daily capacity 2.5× (200 to 500 chats). The 3 human agents now handle only what bots genuinely cannot, and they do it far better because they're not burned out answering "What are your rates?" 200 times a day.
Key Takeaways
- Multi-agent routing is the feature that scales a WhatsApp business from 50 to 500+ daily conversations — without it, a single bot tries to handle everything and handles nothing well
- Intent classification using Gemini Flash is usually fast (often about a second, though latency varies with load and network) and accurate enough for production use; it doesn't need to be perfect, but it should be tested against real messages before going live
- Escalation triggers based on conversation patterns (3 failed attempts, frustration keywords) are more reliable than waiting for customers to explicitly ask for a human
- Language detection in a Pakistani context requires handling three languages (English, Urdu, Roman Urdu) — build templates for all three since customers will use all three in the same conversation
- The shared human agent pool across multiple bots/clients is an efficiency multiplier — agents handle only escalations, which means fewer agents handle more volume at higher quality
- Business hours routing is a mandatory feature in Pakistan: after-hours bots must set clear expectations ("We'll respond tomorrow at 10 AM") to avoid customer frustration with bot-only interactions late at night