Multi-Agent WhatsApp Teams — Routing & Escalation

A single WhatsApp chatbot works well for simple, linear interactions — ordering food, answering FAQs, capturing basic lead info. But real businesses have complex interactions: a DHA property inquiry needs an agent who knows Karachi Phase 6 pricing, not one who handles Lahore. A hospital's WhatsApp line gets both appointment bookings and medical emergency questions — a bot handling appointments should never attempt to answer a medical query. Multi-Agent WhatsApp systems solve this through intelligent routing: the right conversation gets to the right handler, whether that's a specialized bot, a department, or a specific human agent.

Section 1: The Multi-Agent Architecture

A Multi-Agent WhatsApp system has three layers:

Layer 1 — The Router Bot: The first point of contact. Every incoming message hits the Router first. It identifies: what department does this belong to, what is the urgency level, and does this need a bot or a human? The Router should NOT answer content questions — it only classifies and routes.

Layer 2 — Specialized Agents (Bots or Humans): Each department/function has its own agent:

Sales Bot: product info, pricing, lead capture
Support Bot: order status, delivery issues, first-line intake for complaints and refunds (which then escalate to a human)
Booking Bot: appointments, scheduling, availability
Human Agents: complex queries, complaints, high-value leads

Layer 3 — Escalation Handler: When any agent cannot resolve an issue (3 failed attempts, explicit customer request, keyword triggers like "complaint," "refund," "manager"), the Escalation Handler takes over, assigns a ticket, and routes to the appropriate human.

Section 2: Building the Routing Logic

Step 1 — Intent Classification: Use a lightweight AI call to classify incoming messages. This runs on every message before routing:

python

import json
import os

from google import genai
from google.genai import types


def classify_intent(message: str) -> dict:
    """Classify WhatsApp message intent using Gemini Flash."""
    client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))

    prompt = f"""Classify this WhatsApp message from a Pakistani customer.
Return JSON only:
{{
  "intent": "sales|support|booking|complaint|general|emergency",
  "urgency": "high|medium|low",
  "language": "english|urdu|roman_urdu|mixed",
  "escalate_to_human": true|false,
  "reason": "one sentence"
}}

Message: "{message}"
Pakistani business context: This is a WhatsApp line for a multi-specialty hospital in Karachi.
"""

    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=prompt,
        # Request raw JSON so json.loads() is not tripped up by markdown fences
        config=types.GenerateContentConfig(response_mime_type="application/json"),
    )
    return json.loads(response.text)
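
The `json.loads` call above raises an exception if the model returns anything other than clean JSON, and a bad reply on a hospital line should never leave a message unrouted. One way to guard against that is a small validation step that falls back to a human whenever the reply is malformed. This is a minimal sketch: `parse_classification`, `FALLBACK`, and the allowed-value sets are hypothetical names introduced here, not part of any SDK.

```python
import json

ALLOWED_INTENTS = {"sales", "support", "booking", "complaint", "general", "emergency"}
ALLOWED_URGENCY = {"high", "medium", "low"}
FENCE = "`" * 3  # markdown code fence that models sometimes wrap JSON in

# Assumed safe default: when in doubt, hand the chat to a person.
FALLBACK = {
    "intent": "general",
    "urgency": "high",
    "language": "mixed",
    "escalate_to_human": True,
    "reason": "classifier output could not be parsed",
}


def parse_classification(raw_text: str) -> dict:
    """Parse and validate the classifier reply; fall back to a human on any problem."""
    text = raw_text.strip()
    if text.startswith(FENCE):
        # Drop the surrounding backticks and an optional "json" language tag.
        text = text.strip("`")
        if text.lower().startswith("json"):
            text = text[4:]
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return dict(FALLBACK)
    if not isinstance(data, dict):
        return dict(FALLBACK)
    if data.get("intent") not in ALLOWED_INTENTS or data.get("urgency") not in ALLOWED_URGENCY:
        return dict(FALLBACK)
    # Coerce the flag to a real boolean and always escalate emergencies.
    data["escalate_to_human"] = bool(data.get("escalate_to_human")) or data["intent"] == "emergency"
    return data
```

With this in place, the classifier can return `parse_classification(response.text)` instead of calling `json.loads` directly, so a malformed reply becomes a human handoff rather than a crash.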

Step 2 — WATI Department Routing: WATI supports agent assignment and inbox routing. When your classification engine returns an intent, route accordingly via WATI's API:

python

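import os

import httpx

# Placeholder environment variable names; set them to your own WATI endpoint and token
WATI_BASE = os.getenv("WATI_BASE", "")
WATI_TOKEN = os.getenv("WATI_TOKEN", "")

# send_emergency_alert() and the activate_*() helpers are assumed to be defined elsewhere


async def route_to_department(wa_number: str, intent: dict):
    wati_headers = {"Authorization": f"Bearer {WATI_TOKEN}"}

    if intent["escalate_to_human"] or intent["intent"] == "emergency":
        # Assign to human agent immediately.
        # Verify the endpoint path and payload against your WATI API docs before going live.
        # httpx.post() is synchronous, so awaiting requires an AsyncClient.
        async with httpx.AsyncClient(timeout=10) as client:
            response = await client.post(
                f"{WATI_BASE}/api/v1/assignConversation",
                headers=wati_headers,
                json={"whatsappNumber": wa_number, "assignedTo": "on_call_agent"},
            )
            response.raise_for_status()
        # Send priority message to team lead
        await send_emergency_alert(wa_number, intent)

    elif intent["intent"] == "sales":
        await activate_sales_bot_flow(wa_number)

    elif intent["intent"] == "support":
        await activate_support_bot_flow(wa_number)

    elif intent["intent"] == "booking":
        await activate_booking_bot_flow(wa_number)

    else:
        await activate_general_faq_bot(wa_number)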

Step 3 — Escalation Triggers: Build escalation triggers at every bot level. If a bot detects:

Customer has asked the same question 3+ times with no resolution
Customer has explicitly said "manager," "complaint," "refund," "not working," "very upset," "koi kaam nahi"
The conversation has been going for 15+ minutes without completion
Customer sends an all-caps message (frustration signal)

Then immediately: send a handoff message ("Connecting you to our team, just a moment"), assign the conversation to an available human agent in WATI, and notify that agent via WhatsApp with a conversation summary.
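
The four triggers above can be sketched as a single check that each bot runs after every customer message. This is an illustrative sketch, not a WATI feature: the `Conversation` class, `escalation_reason`, and the reason labels are hypothetical, and "same question" is approximated by exact repeats after normalizing case and spacing, which is a crude proxy for real similarity matching.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

ESCALATION_KEYWORDS = ("manager", "complaint", "refund", "not working", "very upset", "koi kaam nahi")
MAX_REPEATS = 3
MAX_DURATION = timedelta(minutes=15)
MIN_LETTERS_FOR_CAPS = 5  # assumed threshold so short replies like "OK" do not count as shouting


@dataclass
class Conversation:
    started_at: datetime
    customer_messages: List[str] = field(default_factory=list)


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical messages compare equal."""
    return " ".join(text.lower().split())


def escalation_reason(convo: Conversation, now: datetime) -> Optional[str]:
    """Return the first trigger that fires for the latest message, or None."""
    if not convo.customer_messages:
        return None
    latest = convo.customer_messages[-1]
    normalized = [_normalize(m) for m in convo.customer_messages]

    # Trigger 1: the same message sent 3+ times
    if normalized.count(normalized[-1]) >= MAX_REPEATS:
        return "repeated_question"
    # Trigger 2: an escalation keyword in the latest message
    for keyword in ESCALATION_KEYWORDS:
        if keyword in normalized[-1]:
            return f"keyword:{keyword}"
    # Trigger 3: conversation open for 15+ minutes
    if now - convo.started_at >= MAX_DURATION:
        return "long_conversation"
    # Trigger 4: all-caps message with enough letters to be meaningful
    if sum(ch.isalpha() for ch in latest) >= MIN_LETTERS_FOR_CAPS and latest.isupper():
        return "all_caps"
    return None
```

A non-`None` result is the cue to send the handoff message and assign the conversation. The returned label can go into the agent's conversation summary so they know why the chat reached them.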

Section 3: Pakistan-Specific Routing Considerations

Language Detection: Pakistani customers mix English, Urdu, and Roman Urdu freely. Your routing must handle:

"I want to check my order status" (English — route to support bot)
"mera order kahan hai" (Roman Urdu — route to same support bot, but activate Urdu template)
"Complaint hai mujhe" (Roman Urdu complaint signal — escalate to human)
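
A cheap pre-check can pick the reply template before (or alongside) the AI classifier. The sketch below assumes a tiny hand-picked hint list, `ROMAN_URDU_HINTS`, and a hypothetical `detect_language` function. A production list would need to be far larger, and the labels it returns are illustrative.

```python
# Small illustrative set of common Roman Urdu words; extend it for real traffic.
ROMAN_URDU_HINTS = {
    "hai", "mera", "meri", "kahan", "mujhe", "chahiye",
    "nahi", "kya", "karo", "milega", "hogi", "kal",
}


def detect_language(message: str) -> str:
    """Return 'english', 'urdu', 'roman_urdu', or 'mixed' using simple heuristics."""
    # Urdu script falls in the Unicode Arabic block (U+0600 to U+06FF).
    has_urdu_script = any("\u0600" <= ch <= "\u06ff" for ch in message)

    words = [w.strip(".,!?\"'") for w in message.lower().split()]
    words = [w for w in words if w]
    latin_words = [w for w in words if w.isascii() and w.isalpha()]
    roman_hits = sum(w in ROMAN_URDU_HINTS for w in latin_words)

    if has_urdu_script:
        return "mixed" if latin_words else "urdu"
    if roman_hits == 0:
        return "english"
    # Treat the message as Roman Urdu when at least half of its words are hints.
    if roman_hits * 2 >= len(words):
        return "roman_urdu"
    return "mixed"


for sample in ("I want to check my order status", "mera order kahan hai", "Complaint hai mujhe"):
    print(sample, "->", detect_language(sample))
```

Routing stays the same for every label. Only the reply template changes, which keeps the language logic separate from the intent logic.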

Business Hours Routing: Pakistani businesses typically operate 10 AM–8 PM PKT (Pakistan Standard Time, UTC+5). Build time-based routing:

Business hours: Route to bots first, escalate to humans if needed
After hours: Route all to bots, flag high-priority issues for morning follow-up
Eid/Ramadan: Adjust greeting templates, notify customers of reduced hours
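
The time-based rules can be sketched as a small gate in front of the human handoff. The function names and route labels below are hypothetical. The sketch uses a fixed UTC+5 offset (Pakistan does not currently observe daylight saving time), and it assumes emergencies still go to an on-call person after hours, as in the Step 2 routing code.

```python
from datetime import datetime, time, timedelta, timezone

PKT = timezone(timedelta(hours=5), "PKT")  # fixed offset, no daylight saving
OPEN_AT = time(10, 0)
CLOSE_AT = time(20, 0)


def is_business_hours(now: datetime) -> bool:
    """True between 10:00 and 20:00 Pakistan time. Expects a timezone-aware datetime."""
    local = now.astimezone(PKT)
    return OPEN_AT <= local.time() < CLOSE_AT


def choose_route(intent: dict, now: datetime) -> str:
    """Pick a route label from the classifier output and the current time."""
    if intent.get("intent") == "emergency":
        return "on_call_human"  # assumption: emergencies bypass business hours
    needs_human = bool(intent.get("escalate_to_human")) or intent.get("urgency") == "high"
    if not needs_human:
        return "bot"
    if is_business_hours(now):
        return "human"
    # After hours: the bot replies now and the chat is flagged for the morning.
    return "bot_with_morning_flag"


now = datetime.now(timezone.utc)
print(choose_route({"intent": "complaint", "urgency": "high", "escalate_to_human": True}, now))
```

Holiday schedules such as Eid or Ramadan hours would replace the two constants with a per-date lookup. That lookup is left out here.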

Section 4: Agent Routing Decision Table

code

MULTI-AGENT ROUTING MATRIX (Pakistan Business Context)
┌──────────────────────────────────────────────────────┐
│  MESSAGE TYPE           │ ROUTE TO       │ URGENCY   │
│  ──────────────────────────────────────────────────  │
│  "Price for product X"  │ Sales Bot      │ Low       │
│  "My order is late"     │ Support Bot    │ Medium    │
│  "Book appointment"     │ Booking Bot    │ Low       │
│  "Refund chahiye"       │ Human Agent    │ High      │
│  "Manager se baat karo" │ Human Agent    │ High      │
│  "Chest pain" (hospital)│ Emergency+Human│ Critical  │
│  "Same-day delivery?"   │ Sales Bot      │ Low       │
│  "Wrong item received"  │ Support Bot    │ Medium    │
│  "Kal delivery hogi?"   │ Support Bot    │ Low       │
│  "Discount milega?"     │ Sales Bot      │ Low       │
└──────────────────────────────────────────────────────┘
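
The matrix can double as a regression suite: each row becomes a test case, and any router can be checked against it before going live. In this sketch, `ROUTING_MATRIX` copies the rows above, while `find_mismatches` and the `route_fn` interface (message in, route and urgency out) are hypothetical conventions chosen for illustration.

```python
from typing import Callable, List, Tuple

Expected = Tuple[str, str]  # (route, urgency)

ROUTING_MATRIX: List[Tuple[str, str, str]] = [
    ("Price for product X", "Sales Bot", "Low"),
    ("My order is late", "Support Bot", "Medium"),
    ("Book appointment", "Booking Bot", "Low"),
    ("Refund chahiye", "Human Agent", "High"),
    ("Manager se baat karo", "Human Agent", "High"),
    ("Chest pain", "Emergency+Human", "Critical"),
    ("Same-day delivery?", "Sales Bot", "Low"),
    ("Wrong item received", "Support Bot", "Medium"),
    ("Kal delivery hogi?", "Support Bot", "Low"),
    ("Discount milega?", "Sales Bot", "Low"),
]


def find_mismatches(route_fn: Callable[[str], Expected]) -> List[Tuple[str, Expected, Expected]]:
    """Run every matrix row through route_fn; return (message, expected, actual) for failures."""
    mismatches = []
    for message, route, urgency in ROUTING_MATRIX:
        actual = route_fn(message)
        if actual != (route, urgency):
            mismatches.append((message, (route, urgency), actual))
    return mismatches


def naive_router(message: str) -> Expected:
    """Deliberately bad baseline that sends everything to the Sales Bot."""
    return ("Sales Bot", "Low")


for message, expected, actual in find_mismatches(naive_router):
    print(f"{message!r}: expected {expected}, got {actual}")
```

Swapping `naive_router` for a wrapper around the real classifier turns the table into a quick pre-launch check. The same pattern fits the evaluation step in Exercise 2.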

Practice Lab

Exercise 1: Design the routing logic for a Pakistani e-commerce business (selling clothing on Daraz/Shopify). Create a classification table: list 10 types of incoming messages and how each should be routed (which bot, which human department, what urgency level). This is your routing specification — the blueprint before you build in WATI.

Exercise 2: Build the intent classification function using Gemini Flash. Test it with 10 real WhatsApp message examples (write them yourself or use sample messages). Evaluate: does the intent classification match what a human agent would do? Fix any prompts where classification is incorrect.

Exercise 3: Configure WATI's agent routing for at least 2 departments (e.g., Sales and Support). Create a test conversation for each. Verify that a "price inquiry" message correctly flows to the Sales bot and a "my order is late" message correctly routes to the Support bot. This end-to-end test confirms your routing system works before going live.

Pakistan Case Study: Karachi Multi-Brand Agency — 500 Chats Per Day

A Karachi digital agency managing WhatsApp for 3 clients (clothing brand, restaurant, skincare clinic) was running all 3 accounts on separate WATI instances with separate teams. When one client had a surge, that client's team was overwhelmed while the other two teams sat under-utilized.

They rebuilt with a unified multi-agent router:

Architecture:

code

All 3 client numbers → WATI Shared Inbox
                           │
                    Router Bot classifies:
                    - Which client?
                    - What intent?
                           │
        ┌──────────────────┼──────────────────┐
        ▼                  ▼                  ▼
  Clothing Bot        Restaurant Bot     Skincare Bot
  (sales/catalog)     (orders/menu)      (bookings/FAQs)
        │                  │                  │
        └──────────────────┴──────────────────┘
                           │
                    Escalation Pool:
                    3 human agents shared
                    across all 3 clients

Results after 60 days:

Metric	Before	After
Daily chats handled	200	500
Human agents needed	9 (3 per client)	3 (shared pool)
Average response time	8 min	45 sec (bot)
Escalation rate	100% (all human)	18% (only complex)
Monthly staff cost	PKR 225,000	PKR 75,000
Client satisfaction	Mixed	High

The multi-agent router cut staffing costs by 67% while raising daily capacity 2.5x, from 200 to 500 chats. The 3 human agents now handle only what bots genuinely cannot, and they do it far better because they're not burned out answering "What are your rates?" 200 times a day.
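
The headline figures can be checked directly from the table with a few lines of arithmetic. The per-agent load is a derived number, not one reported by the agency: it assumes the 18% escalation rate applies evenly across all 500 daily chats.

```python
# Figures taken from the results table
chats_before, chats_after = 200, 500
agents_before, agents_after = 9, 3
cost_before, cost_after = 225_000, 75_000  # PKR per month
escalation_pct_after = 18  # percent of chats that reach a human

cost_reduction_pct = (cost_before - cost_after) / cost_before * 100
capacity_multiple = chats_after / chats_before

# Before: every chat went to a human. After: only escalated chats do.
human_chats_before = chats_before
human_chats_after = chats_after * escalation_pct_after // 100
load_before = human_chats_before / agents_before
load_after = human_chats_after / agents_after

print(f"Cost reduction: {cost_reduction_pct:.1f}%")    # 66.7%
print(f"Capacity multiple: {capacity_multiple:.1f}x")  # 2.5x
print(f"Human-handled chats per agent per day: {load_before:.1f} before, {load_after:.1f} after")
```

Under that assumption, each agent handles roughly 30 escalated chats a day, up from about 22 chats of all kinds. The saving comes from bots absorbing the routine volume rather than from agents doing less.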

Key Takeaways

Multi-agent routing is the feature that scales a WhatsApp business from 50 to 500+ daily conversations — without it, a single bot tries to handle everything and handles nothing well
Intent classification using Gemini Flash is fast enough to run on every incoming message and accurate enough for production use; it doesn't need to be perfect, because escalation triggers catch the conversations it misroutes
Escalation triggers based on conversation patterns (3 failed attempts, frustration keywords) are more reliable than waiting for customers to explicitly ask for a human
Language detection in a Pakistani context requires handling English, Urdu, and Roman Urdu; build templates for all three, since customers may switch between them within a single conversation
The shared human agent pool across multiple bots/clients is an efficiency multiplier — agents handle only escalations, which means fewer agents handle more volume at higher quality
Business hours routing is a mandatory feature in Pakistan: after-hours bots must set clear expectations ("We'll respond tomorrow at 10 AM") to avoid customer frustration with bot-only interactions late at night
