4-Tier AI Architecture — Saste Se Mehnge Tak

Chalo bhai, welcome back.

COURSE: Pakistan Ka Pehla Professional Trading Bot Course MODULE: AI Signal Engine — 4-Tier Decision System Banana (Module 4) LESSON: 4.1 — 4-Tier AI Architecture: Saste Se Mehnge Tak

Introduction: Paison Ka Khel Hai, Aur Humein Smart Khelna Hai

As-salamu alaykum, developers. Aaj ka lesson shayad is pooray course ka sab se important lesson hai, especially for us in Pakistan. Kyun? Kyun ke yahan har cheez dollar rate pe chalti hai, aur AI models chalana sasta kaam nahi. Aap ek zabardast bot bana sakte ho jo har market ko dunya ke sab se powerful AI, Claude Opus, se analyze kare, lekin mahine ke end pe jab aap ko 40,000 PKR ka bill aayega, to saara josh thanda ho jayega.

This isn't about being cheap; it's about being efficient. Aik professional system bananay ka matlab hai performance bhi ho aur cost bhi control mein ho. Aaj hum seekhein ge ke aik multi-tier AI system kaise banate hain jo saste models se shuru hota hai aur sirf zaroorat parne par mehnge models ko kaam pe lagata hai. This is the secret to building a bot that can run 24/7 without making you bankrupt.

The 'Kyun': Why a 4-Tier System?

Socho isko aik company ki tarah.

Interns (Tier 1): Inko aap wo kaam dete ho jo time-consuming hai lekin simple hai. "Jao, market se 100 cheezon ki list bana ke lao."
Junior Analysts (Tier 2): Interns ki list mein se, yeh log thori research kar ke "top 15" candidates nikalte hain.
Senior Manager (Tier 3): Yeh experienced banda top 15 ko dekh kar final decision leta hai ke "in 3 cheezon pe invest karna hai."
CEO (Tier 4): CEO har choti cheez mein involve nahi hota. Lekin jab koi bohot ajeeb, high-stakes situation aati hai, to sab uske paas jaate hain.

Agar aap har kaam CEO se karwao ge, to company doob jayegi. Simple. Hamara AI bot bhi aisi hi company hai. We use the right tool (and cost) for the right job.

The goal is Progressive Filtering. Hum 100 markets se shuru karte hain aur har tier pe bekaar options ko filter karte jaate hain, taake hamara sab se mehanga aur smart model sirf un 2-3 markets ko dekhe jin mein waqai koi potential hai.

Tier 1: The Sasta Grunt Worker (Gemini Flash)

Yeh hamara intern hai. Iska kaam hai shor mein se signal dhoondna. Isko hum har market pe chalate hain jo hamara scanner.py uthata hai. Iska sawal simple hota hai: "Is there anything remotely interesting here? Yes or No?"

Model: Google Gemini Flash/Pro (via Google AI Studio Free Tier)
Role: Initial screening, news headline sentiment, basic pattern check.
Cost: FREE! (with rate limits, jo hamare liye aam taur pe kaafi hain)
Kab Chalta Hai: On all 100+ markets, every single scan cycle.

Let's see how we implement this. Hamare ai/gemini.py file mein aesa function ho sakta hai:

python

import os
import json
import google.generativeai as genai

# NOTE: Apni API key environment variable mein set karna!
# export GOOGLE_API_KEY="YOUR_API_KEY"
try:
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
except KeyError:
    print("Bhai, GOOGLE_API_KEY environment variable set karo pehle.")
    # Exit or handle gracefully in a real app
    exit()

MODEL = "gemini-1.5-flash-latest"

def get_initial_filter(market_title: str, recent_news: list[str]) -> dict:
    """
    Tier 1 AI: Use free Gemini model for a quick, cheap first pass.
    """
    # Simple prompt, no fancy stuff
    prompt = f"""
    Market Title: "{market_title}"
    Recent News Headlines: {', '.join(recent_news)}

    Based on the above, is this market worth a deeper look for a short-term trade?
    Consider volatility, news sentiment, and if the topic is currently active.
    Answer in JSON format with two keys: "proceed" (boolean) and "reason" (string, max 15 words).
    Example: {{"proceed": true, "reason": "High volatility and recent positive news."}}
    """
    
    try:
        model = genai.GenerativeModel(MODEL)
        response = model.generate_content(prompt)
        # Basic parsing, real code mein error handling zaroori hai
        result = json.loads(response.text)
        return result
    except Exception as e:
        print(f"Gemini call fail ho gaya: {e}")
        return {"proceed": False, "reason": "API error"}

# --- Example Usage ---
# Yeh data hamara scanner.py laaye ga
market_data = {
    "title": "Will Pakistan win the next T20 match against India?",
    "news": ["Babar Azam scores a century in practice match.", "Shaheen Afridi declared fit."]
}

# Chala ke dekhte hain
initial_assessment = get_initial_filter(market_data["title"], market_data["news"])
print(f"Tier 1 (Gemini) Assessment: {initial_assessment}")

# Output (example):
# Tier 1 (Gemini) Assessment: {'proceed': True, 'reason': 'Positive player news indicates potential for market movement.'}

Dekha? Simple. scanner.py is function ko har market ke liye call karega. Jin markets ke liye proceed True hoga, sirf wohi aage Tier 2 mein jayengi. 100 markets mein se shayad 15-20 aage jaati hain.

Tier 2: The Junior Analyst (Claude Haiku)

Ab scene mein aate hain hamare junior analyst. Yeh Gemini se thora mehanga hai, lekin bohot fast aur kafi smart hai. Iska kaam hai un 15-20 candidates ko aue gehraai mein dekhna.

Model: Claude 3 Haiku
Role: Deeper analysis. Price action, volume data, social media sentiment (agar hai), aur hamari basic strategy ke rules check karna.
Cost: ~$0.002 per call (bohot sasta)
Kab Chalta Hai: Sirf un markets pe jinhe Gemini ne "proceed: true" bola hai.

Hamari ai/haiku.py file mein aesa function hoga. Note karo ke isko hum zyada data de rahe hain.

python

import os
import json
import anthropic

# NOTE: Apni API key environment variable mein set karna!
# export ANTHROPIC_API_KEY="YOUR_API_KEY"
try:
    client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
except KeyError:
    print("Bhai, ANTHROPIC_API_KEY environment variable set karo pehle.")
    exit()

MODEL = "claude-3-haiku-20240307"

def get_candidate_analysis(market_title: str, price_history: list[float], gemini_reason: str) -> dict:
    """
    Tier 2 AI: Use fast and cheap Haiku for detailed analysis of promising candidates.
    """
    prompt = f"""
    You are a junior trading analyst. Your job is to filter out weak candidates.
    
    Market: "{market_title}"
    Tier 1 Reason to Proceed: "{gemini_reason}"
    Recent Price History (last 24 hours, in cents): {price_history}
    
    Analyze the provided data. Is there a clear, actionable pattern or trend based on our simple momentum strategy?
    - Look for consistent upward or downward movement.
    - Avoid markets that are flat or extremely erratic.
    
    Provide your analysis in a JSON object with three keys:
    1. "is_strong_candidate" (boolean): True only if a clear pattern exists.
    2. "confidence_score" (float): A score from 0.0 to 1.0.
    3. "analysis_summary" (string): A 2-sentence summary of your findings.
    """
    
    try:
        message = client.messages.create(
            model=MODEL,
            max_tokens=200

Multi-tier AI systems allow cost-efficient filtering without sacrificing decision quality.

---

## 📺 Recommended Videos & Resources
- **[Claude API Documentation](https://docs.anthropic.com/)** — Anthropic Claude models and API
  - Type: Official Documentation
  - Link description: Reference for Claude 3 Haiku and other models
- **[Gemini AI Studio](https://aistudio.google.com/)** — Google Gemini free tier access
  - Type: Official Tool
  - Link description: Test Gemini models with no API key required initially
- **[LLM Cost Comparison & Optimization](https://www.youtube.com/results?search_query=llm+api+cost+optimization+claude+gemini)** — Choosing cost-effective models
  - Type: YouTube
  - Link description: Search "LLM cost comparison and optimization"
- **[Prompt Engineering for Trading](https://en.wikipedia.org/wiki/Prompt_engineering)** — Crafting effective prompts
  - Type: Wikipedia
  - Link description: Learn structured prompting for consistent model outputs
- **[Structured JSON Outputs from LLMs](https://www.youtube.com/results?search_query=claude+json+output+parsing)** — Getting parseable model responses
  - Type: YouTube
  - Link description: Search "Claude JSON mode structured outputs"

---

## 🎯 Mini-Challenge
**5-Minute Practical Task:** Write a function that calls both Gemini Flash (free tier) and Claude Haiku to analyze the same market. Compare their responses and confidenceScores. Return which model's analysis to trust more, and calculate the cost difference (Gemini free vs Claude paid per call).

---

## 🖼️ Visual Reference

📊 4-Tier AI Filtering System Cost vs Quality ┌──────────────────────────────────────┐ │ Tier 1: Gemini Flash (FREE) │ │ Cost: $0.00 per call │ │ Speed: ~0.5 sec │ │ Accuracy: 70% (screening only) │ │ │ │ → Filters 100 markets → 20 candidates│ └──────────────────────────────────────┘ │ ▼ ┌──────────────────────────────────────┐ │ Tier 2: Claude Haiku (CHEAP) │ │ Cost: $0.002 per call │ │ Speed: ~1 sec │ │ Accuracy: 85% (deeper analysis) │ │ │ │ → Filters 20 markets → 5 candidates │ └──────────────────────────────────────┘ │ ▼ ┌──────────────────────────────────────┐ │ Tier 3: Claude Sonnet (MEDIUM) │ │ Cost: $0.01 per call │ │ Speed: ~2 sec │ │ Accuracy: 92% (advanced reasoning) │ │ │ │ → Filters 5 markets → 2 candidates │ └──────────────────────────────────────┘ │ ▼ ┌──────────────────────────────────────┐ │ Tier 4: Claude Opus (EXPENSIVE) │ │ Cost: $0.08 per call │ │ Speed: ~3 sec │ │ Accuracy: 99% (final decision) │ │ │ │ → Final decision on 2 markets → TRADE│ └──────────────────────────────────────┘

code

---

4.1 — 4-Tier AI Architecture — Saste Se Mehnge Tak

4-Tier AI Architecture — Saste Se Mehnge Tak

Introduction: Paison Ka Khel Hai, Aur Humein Smart Khelna Hai

The 'Kyun': Why a 4-Tier System?

Tier 1: The Sasta Grunt Worker (Gemini Flash)

Tier 2: The Junior Analyst (Claude Haiku)

Lesson Summary

Quiz: 4-Tier AI Architecture — Saste Se Mehnge Tak