1.2 — AI Video Tools Landscape — Veo vs Runway vs HeyGen
Creating professional faceless videos used to require a team of scriptwriters, voiceover artists, video editors, and animators, at combined rates of PKR 50,000–200,000 per video. Today, one person with a laptop and a modest internet connection can produce studio-quality videos in under 2 hours. The tooling shift behind Pakistan's creator economy boom was not gradual: it happened in a single 18-month window between 2024 and 2026. Tools that once cost USD 2,000/month are now free or under USD 100/month. This lesson gives you the complete production stack, tool-by-tool setup instructions, and a benchmarked workflow for taking your first video from blank page to published upload within your first session.
The 5-Layer Production Pipeline
Every faceless video moves through exactly five layers. Understand each layer, know your tool options at each layer, and you will never feel stuck in production.
AI VIDEO PRODUCTION PIPELINE (2026)
════════════════════════════════════════════════════════════════
INPUT: IDEA / NICHE TOPIC
│
▼
┌──────────────────────────────────────────────────────────┐
│ LAYER 1: SCRIPT GENERATION │
│ ├── Gemini 2.5 Flash (free, 15 req/day) │
│ ├── Gemini 2.5 Pro (free tier via AI Studio) │
│ ├── ChatGPT-4o (USD 20/month) │
│ └── Claude Sonnet (USD 20/month — best long-form) │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ LAYER 2: VOICEOVER SYNTHESIS │
│ ├── ElevenLabs (free: 10k char/month; paid: USD 11/mo) │
│ ├── Google Cloud TTS (free: 4M char/month) │
│ ├── Descript (USD 24/month — captions included) │
│ └── Murf.ai (USD 19/month — 60+ voices) │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ LAYER 3: VISUAL GENERATION │
│ ├── Pexels / Pixabay / Unsplash (free stock) │
│ ├── Imagen 4.0 Ultra (via Google AI Studio, free tier) │
│ ├── Runway Gen-3 (USD 12/month — text-to-video) │
│ └── CapCut template library (free — 50k+ backgrounds) │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ LAYER 4: VIDEO ASSEMBLY │
│ ├── CapCut (free, no watermark, best for beginners) │
│ ├── DaVinci Resolve (free, industry-standard grading) │
│ ├── HeyGen (USD 29/month — AI avatar auto-animation) │
│ └── Adobe Premiere Pro (USD 55/month — professional) │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ LAYER 5: DISTRIBUTION & OPTIMIZATION │
│ ├── YouTube Studio (free — primary platform) │
│ ├── TubeBuddy (USD 10/month — SEO + A/B thumbnails) │
│ ├── Buffer (USD 5/month — schedule TikTok + Instagram) │
│ └── vidIQ (free tier — keyword research + analytics) │
└──────────────────────────────────────────────────────────┘
│
▼
OUTPUT: UPLOADED VIDEO → ALGORITHM DISTRIBUTION
════════════════════════════════════════════════════════════════
Tool Comparison: Free vs. Paid (PKR Pricing)
All USD costs converted at PKR 280/USD (2026 rate):
| Layer | Free Option | Free Option Quality | Paid Option | Cost/month (USD) | Cost/month (PKR) | Quality Gain |
|---|---|---|---|---|---|---|
| Script | Gemini 2.5 Flash | 7/10 | ChatGPT-4o | USD 20 | PKR 5,600 | Brand memory, custom tone |
| Script | Google Gemini | 7/10 | Claude Sonnet | USD 20 | PKR 5,600 | Long-form coherence |
| Voiceover | Google TTS | 5/10 | ElevenLabs Starter | USD 11 | PKR 3,080 | Human-quality voice |
| Voiceover | ElevenLabs free | 8/10 | ElevenLabs Creator | USD 22 | PKR 6,160 | 100k more characters |
| Visuals | Pexels / Pixabay | 6/10 | Runway Gen-3 | USD 12 | PKR 3,360 | Custom AI video clips |
| Visuals | CapCut templates | 6/10 | Imagen 4.0 Ultra | Free tier | PKR 0 | Photorealistic images |
| Assembly | CapCut free | 8/10 | DaVinci Resolve | Free | PKR 0 | Color grading upgrade |
| Assembly | DaVinci Resolve | 8/10 | Adobe Premiere | USD 55 | PKR 15,400 | Advanced effects only |
| Distribution | YouTube Studio | 9/10 | TubeBuddy Pro | USD 10 | PKR 2,800 | Keyword + thumbnail A/B |
| Distribution | vidIQ free | 7/10 | Buffer | USD 5 | PKR 1,400 | Multi-platform scheduling |
Bottom line: For PKR 3,080/month (ElevenLabs Starter only), you upgrade from 6/10 amateur quality to 8.5/10 professional quality. That single upgrade pays for itself with 10,000 additional views on your first video.
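The PKR column in the table is each USD price multiplied by the lesson's rate of PKR 280 per USD. The following is a minimal sketch of that conversion, handy for re-pricing the stack when the exchange rate moves. The function name `usd_to_pkr` is illustrative, and the prices are the ones listed in the table.

```python
PKR_PER_USD = 280  # conversion rate assumed in this lesson; update as needed

# Monthly USD prices as listed in the comparison table
paid_tools_usd = {
    "ChatGPT-4o": 20,
    "Claude Sonnet": 20,
    "ElevenLabs Starter": 11,
    "ElevenLabs Creator": 22,
    "Runway Gen-3": 12,
    "Adobe Premiere": 55,
    "TubeBuddy Pro": 10,
    "Buffer": 5,
}


def usd_to_pkr(usd, rate=PKR_PER_USD):
    """Convert a USD amount to PKR at the given rate."""
    return usd * rate


paid_tools_pkr = {name: usd_to_pkr(price) for name, price in paid_tools_usd.items()}

for name, pkr in paid_tools_pkr.items():
    print(f"{name:<20} PKR {pkr:,}/month")
```

Passing a different `rate` re-prices every tool at once, for example `usd_to_pkr(11, rate=290)`.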
Per-Tool Setup Guide
Layer 1: Script Writing with Gemini 2.5 Flash
Setup (2 minutes):
- Go to aistudio.google.com — sign in with Google account
- Select model: Gemini 2.5 Flash
- Set temperature to 0.8 (creative but controlled)
- Bookmark the prompt you use most often
Master prompt template for faceless YouTube:
You are a YouTube scriptwriter for a faceless educational channel
targeting Pakistani audiences aged 18–35.
Video topic: [YOUR TOPIC]
Video length: [X] minutes
Tone: [motivational / educational / news-style]
Language: [English / Urdu / mixed]
Script requirements:
- Hook in the first 15 seconds (question or shocking stat)
- Clear structure: Hook → Problem → Solution → Takeaways → CTA
- Timestamps every 90 seconds
- Include 3 actionable takeaways viewers can implement today
- End with a specific call to action (subscribe + comment prompt)
- Word count: [LENGTH × 165 words] (for conversational 165 WPM pacing)
Add [PAUSE] markers where the voiceover should breathe between sections.
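If you reuse the template often, filling the placeholders by hand gets tedious. The sketch below fills them programmatically and computes the word count from the 165 WPM figure. It only builds the prompt text for pasting into AI Studio and does not call any API. The function name `build_prompt` and the example topic are hypothetical.

```python
WORDS_PER_MINUTE = 165  # conversational pacing figure from the template

PROMPT_TEMPLATE = """You are a YouTube scriptwriter for a faceless educational channel
targeting Pakistani audiences aged 18–35.
Video topic: {topic}
Video length: {minutes} minutes
Tone: {tone}
Language: {language}
Script requirements:
- Hook in the first 15 seconds (question or shocking stat)
- Clear structure: Hook → Problem → Solution → Takeaways → CTA
- Timestamps every 90 seconds
- Include 3 actionable takeaways viewers can implement today
- End with a specific call to action (subscribe + comment prompt)
- Word count: {word_count} words
Add [PAUSE] markers where the voiceover should breathe between sections."""


def build_prompt(topic, minutes, tone="educational", language="English"):
    """Fill the master template and derive the target word count."""
    word_count = round(minutes * WORDS_PER_MINUTE)
    return PROMPT_TEMPLATE.format(
        topic=topic,
        minutes=minutes,
        tone=tone,
        language=language,
        word_count=word_count,
    )


# Example topic is a placeholder; substitute your own niche
prompt = build_prompt("How freelancers in Pakistan get paid in USD", 6)
print(prompt)
```

For a 6-minute video the filled template asks for 990 words (6 × 165).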
Layer 2: ElevenLabs Voiceover
Setup (5 minutes):
- Go to elevenlabs.io — sign up free (no credit card needed)
- Navigate to "Text to Speech" in left sidebar
- Browse Voice Library — filter by language: Urdu or English
- Select a voice — click "Preview" to test with a sample sentence
- For Pakistani content: try "Aditi" (Urdu warm female) or "Brian" (English authoritative male)
- Paste your script — click Generate
- Download MP3
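Free tier limit: 10,000 characters/month. A 6-minute script at 165 WPM runs about 990 words, or roughly 6,000 characters, so the free tier covers one to two full 6-minute videos. Upgrade to Starter (USD 11/month) for 30,000 characters, which is about five such videos per month.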
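How many videos a character quota covers depends on script length. The following is a rough estimate under stated assumptions: the lesson's 165 WPM pacing and an average of 6 characters per English word including the trailing space. The 6-character figure is an assumption, and Urdu scripts will count differently, so treat the output as a planning number rather than a guarantee.

```python
WORDS_PER_MINUTE = 165  # pacing figure used in this lesson
CHARS_PER_WORD = 6      # assumption: average English word plus one space


def script_characters(minutes):
    """Estimate the character count of a script of the given length."""
    return minutes * WORDS_PER_MINUTE * CHARS_PER_WORD


def videos_per_month(quota_chars, minutes):
    """Estimate how many videos of this length fit in a monthly quota."""
    return quota_chars / script_characters(minutes)


quotas = {"Free": 10_000, "Starter": 30_000}  # character limits quoted above

print(f"6-minute script: ~{script_characters(6):,} characters")
for tier, quota in quotas.items():
    print(f"{tier:<8} ~{videos_per_month(quota, 6):.1f} videos/month")
```

Shorter videos stretch the quota further: calling `videos_per_month(10_000, 3)` doubles the free-tier estimate.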
Layer 3: Visuals with Pexels + Imagen 4.0
Pexels setup (2 minutes):
- Go to pexels.com — no account required for downloads
- Search by keyword matching your script sections (e.g., "cryptocurrency trading", "Pakistan Karachi city")
- Filter: Videos → HD → Free commercial license
- Download 10–15 clips per video — you will use 60–70% of them
Imagen 4.0 setup (5 minutes):
- Go to aistudio.google.com
- Select "Generate Images" → Model: Imagen 4.0
- Prompt format: "[Scene description], cinematic lighting, high resolution, photorealistic, [documentary/editorial/dramatic] style" (describe the style in plain words; the prompt is free text, not command flags)
- Example: "Karachi skyline at golden hour with blockchain nodes overlaid, cinematic, photorealistic"
- Download and use as custom thumbnail or B-roll overlay
Layer 4: CapCut Assembly
Setup (3 minutes):
- Download CapCut from capcut.com (Windows/Mac/Mobile — all free, no watermark)
- Create new project → select 16:9 ratio for YouTube, 9:16 for Shorts/Reels
- Import your voiceover MP3 and all stock footage
- Drag voiceover to audio track — it auto-sets your video length
- Add footage above audio: each clip should be 3–8 seconds to maintain energy
- Enable "Auto Captions" (CapCut reads your audio and adds animated subtitles in 1 click)
- Export: 1080p, 30fps, H.264 codec
CapCut's killer feature for faceless channels: The "Auto Reframe" tool auto-crops 16:9 footage into 9:16 for Shorts — one video becomes two posts with one click.
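The 3–8 second guideline determines how many visual segments a timeline needs. The sketch below estimates that range from the voiceover length. It counts segments on the timeline, not downloads: it assumes one downloaded clip or AI image can be cut or reused to supply several segments. The function name `segments_needed` is illustrative.

```python
import math


def segments_needed(voiceover_seconds, min_clip=3, max_clip=8):
    """Return (fewest, most) visual segments needed to cover the voiceover."""
    fewest = math.ceil(voiceover_seconds / max_clip)  # every segment at max length
    most = math.ceil(voiceover_seconds / min_clip)    # every segment at min length
    return fewest, most


fewest, most = segments_needed(6 * 60)  # 6-minute voiceover
print(f"A 6-minute video needs between {fewest} and {most} segments")
```

Knowing the range before you start footage hunting helps you decide how many distinct clips to download and how often each one will be re-cut.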
Timing Benchmark: Script to Upload
Professional workflow for a 6-minute faceless video:
PRODUCTION TIMING BREAKDOWN
═══════════════════════════════════════════════════════════════
TASK                                TIME      TOOL
───────────────────────────────────────────────────────────────
Generate script                     5 min     Gemini 2.5 Flash
Edit script (pacing/Urdu-isms)      10 min    Manual review
Generate voiceover                  2 min     ElevenLabs
Download 15 stock footage clips     25 min    Pexels / Pixabay
Generate 2 custom AI images         5 min     Imagen 4.0
Assemble in CapCut                  35 min    CapCut
Add auto-captions + review          8 min     CapCut
Create thumbnail in Canva           7 min     Canva free
Export video                        5 min     CapCut
Upload to YouTube + metadata        10 min    YouTube Studio
───────────────────────────────────────────────────────────────
TOTAL (Video 1)                     112 min   ~1h 52min
TOTAL (Video 10, with templates)    55 min    ~55 min
TOTAL (Video 30, full system)       40 min    ~40 min
═══════════════════════════════════════════════════════════════
The efficiency gains compound. By video 10, you have reusable thumbnail templates, a folder of unused stock footage, and muscle memory for CapCut shortcuts. By video 30, the entire process is reflex-level.
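To see where the Video 1 benchmark spends its time, you can total the task durations and rank them. This short sketch uses the minutes from the breakdown above; the shortened task labels are only for display.

```python
# Minutes per task, taken from the Video 1 benchmark
benchmark_minutes = {
    "Generate script": 5,
    "Edit script": 10,
    "Generate voiceover": 2,
    "Download stock footage": 25,
    "Generate AI images": 5,
    "Assemble in CapCut": 35,
    "Auto-captions + review": 8,
    "Create thumbnail": 7,
    "Export video": 5,
    "Upload + metadata": 10,
}

total = sum(benchmark_minutes.values())

# Sort tasks from slowest to fastest to expose the biggest time sinks
ranked = sorted(benchmark_minutes.items(), key=lambda item: item[1], reverse=True)

for task, minutes in ranked[:3]:
    print(f"{task:<26}{minutes:>3} min  {minutes / total:.0%} of total")
print(f"Total: {total} min ({total // 60}h {total % 60}min)")
```

Assembly and footage hunting together account for more than half of the 112 minutes, which is why reusable templates and a saved footage folder produce most of the later speed-up.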
Budget Tiers: What You Actually Need to Spend
Tier 0: Zero Budget (Start Here)
| Tool | Cost | Use |
|---|---|---|
| Gemini 2.5 Flash | Free | Script generation |
| Google TTS | Free | Voiceover (robotic quality) |
| Pexels / Pixabay | Free | Stock footage |
| CapCut | Free | Video editing + captions |
| YouTube Studio | Free | Upload + analytics |
| Total | PKR 0 | Quality: 6/10 |
Limitation: Google TTS sounds noticeably robotic. Viewers notice. Watch time drops 15–20% vs. ElevenLabs voice. Use Tier 0 only for your first 1–2 test videos.
Tier 1: Minimum Viable Professional (Recommended Start)
| Tool | USD/month | PKR/month | Use |
|---|---|---|---|
| ElevenLabs Starter | USD 11 | PKR 3,080 | Human-quality voiceover |
| All other tools | Free | PKR 0 | Script, visuals, editing |
| Total | USD 11 | PKR 3,080 | Quality: 8.5/10 |
This is the sweet spot. One tool upgrade transforms your videos from amateur to professional. ElevenLabs alone can increase your average view duration by 20–30% vs. robotic TTS.
Tier 2: Full Professional Stack
| Tool | USD/month | PKR/month | Use |
|---|---|---|---|
| ElevenLabs Creator | USD 22 | PKR 6,160 | High-volume voiceover |
| Runway Gen-3 | USD 12 | PKR 3,360 | Custom AI video clips |
| TubeBuddy Pro | USD 10 | PKR 2,800 | YouTube SEO optimization |
| Buffer | USD 5 | PKR 1,400 | Multi-platform scheduling |
| Total | USD 49 | PKR 13,720 | Quality: 9.5/10 |
ROI break-even: 25,000 views/month (PKR 15,000–25,000 ad revenue). At 30 videos/month with consistent uploads, most creators hit this by month 2.
Practice Lab
Task 1: Stack Setup
Create your personal AI toolkit today. Complete these steps in order:
- Google AI Studio at aistudio.google.com: sign in and test with one script generation
- ElevenLabs at elevenlabs.io: create a free account and generate one 100-word voiceover sample
- CapCut at capcut.com: download the app and create a blank 6-second test project
- Pexels at pexels.com (no account required): download 5 stock footage clips matching your niche
Final test: Combine everything into a 30-second clip. No polishing required; just confirm the pipeline works end to end.
Task 2: Speed Benchmark
Time yourself creating one complete video from script to upload. Start your timer when you open Gemini. Stop when you click "Publish" on YouTube. Record where you lose the most time: script editing, footage hunting, or CapCut assembly. This is your personal bottleneck. Every future efficiency gain comes from attacking that specific bottleneck with either better tools or outsourcing.
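A phone stopwatch is enough for this task, but a small script gives you a per-stage log. The sketch below is one possible stage timer. The class name `StageTimer` and the stage names are hypothetical, and the demo uses a fake clock so it runs instantly; in real use you would keep the default `time.perf_counter` clock.

```python
import time


class StageTimer:
    """Record how long each production stage takes."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._current = None
        self._started = None
        self.durations = {}  # stage name -> seconds

    def start(self, stage):
        """Close the running stage (if any) and begin timing a new one."""
        self.stop()
        self._current = stage
        self._started = self._clock()

    def stop(self):
        """Close the running stage and add its elapsed time to the log."""
        if self._current is not None:
            elapsed = self._clock() - self._started
            self.durations[self._current] = self.durations.get(self._current, 0.0) + elapsed
            self._current = None

    def bottleneck(self):
        """Return the stage with the largest recorded time."""
        return max(self.durations, key=self.durations.get)


# Demo with a fake clock (values in seconds) instead of real waiting
fake_times = iter([0, 600, 600, 2700, 2700, 3000])
demo = StageTimer(clock=lambda: next(fake_times))
demo.start("script")
demo.start("footage")
demo.start("assembly")
demo.stop()
print(demo.durations, "bottleneck:", demo.bottleneck())
```

In a real session, create `StageTimer()` with no arguments, call `start("script")` when you open Gemini, call `start(...)` again at each handoff, and call `stop()` when you click "Publish".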
Task 3: Cost-to-Revenue Calculation
Using your niche's estimated CPM from the table earlier, calculate: (a) how many monthly views you need at Tier 1 budget (PKR 3,080/month) to break even, (b) how many videos per month at your target length that requires, and (c) how long at your current production speed before you hit break-even views. This math tells you exactly whether to start free or invest in Tier 1 immediately.
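Task 3 reduces to three small calculations. The sketch below is one way to set them up, with every input labeled as an assumption: PKR 600 per 1,000 views is the low end implied by the Tier 2 figure (PKR 15,000–25,000 per 25,000 views), 500 views per video is a pure placeholder, and part (c) is interpreted here as total production hours at the 112-minute Video 1 benchmark. Replace all three with your own numbers.

```python
import math


def break_even_views(monthly_cost_pkr, revenue_per_1000_views_pkr):
    """Monthly views needed for ad revenue to cover tool costs."""
    return math.ceil(monthly_cost_pkr * 1000 / revenue_per_1000_views_pkr)


def videos_needed(target_views, avg_views_per_video):
    """Videos per month needed to reach the target views."""
    return math.ceil(target_views / avg_views_per_video)


def production_hours(videos, minutes_per_video):
    """Total production time for that many videos, in hours."""
    return videos * minutes_per_video / 60


views = break_even_views(3_080, 600)   # Tier 1 cost; PKR 600 per 1,000 views assumed
videos = videos_needed(views, 500)     # 500 views per video is a placeholder
hours = production_hours(videos, 112)  # 112 min is the Video 1 benchmark

print(f"(a) {views:,} views  (b) {videos} videos  (c) {hours:.1f} production hours")
```

If the hours in (c) exceed what you can realistically commit in a month, start at Tier 0 and revisit the calculation once your per-video time drops.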
Pakistan Case Study: "Tech Pakistan Daily"
Fatima Zaidi, a former news anchor from Islamabad, lost her television contract in late 2024 when her station cut 40% of its staff to manage rising production costs. She had 12 years of journalism experience, a sharp analytical mind, and zero video editing skills. "Main ne socha tha yeh sab mere liye nahin hai" ("I thought all of this wasn't for me"), she said in a later interview — meaning the technical side of YouTube felt impenetrable.
She spent one weekend learning the 5-layer stack. Her insight was borrowed directly from her journalism career: treat YouTube like a news wire. Post fast, post consistently, prioritize information density over production polish.
Her stack at launch:
- Script: ChatGPT-4o (USD 20/month, PKR 5,600) — she valued brand voice memory
- Voiceover: ElevenLabs Starter (USD 11/month, PKR 3,080) — Aditi voice, Urdu
- Visuals: Pexels + CapCut templates (free)
- Editing: CapCut mobile (free — she edited on her phone during commutes)
- Optimization: TubeBuddy Pro (USD 10/month, PKR 2,800)
- Total spend: USD 41/month = PKR 11,480/month
Channel concept: "Tech Pakistan Daily" — 90-second Urdu summaries of the day's biggest tech news. One story per video. Posted at 8:00 AM every weekday so subscribers could watch during morning tea.
Results after 3 months:
- Subscribers: 120,000
- Total views: 8 million
- YouTube AdSense: PKR 120,000/month
- Brand sponsorships (3 Pakistani tech brands): PKR 80,000/month
- Total: PKR 200,000/month on PKR 11,480/month investment
Her competitive edge was counterintuitive: she deliberately kept production quality at 7/10 to maintain a 2-video-per-day pace. Bigger channels uploaded weekly and over-produced. She uploaded daily and under-produced — but outranked them on recency and volume.
Month 6 move: Patreon subscription (USD 5/month, PKR 1,400/month) offering exclusive 5-minute deep-dive versions of the day's story. Target: 300 subscribers = PKR 420,000/month recurring. Total projected revenue: PKR 600,000+ monthly.
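The projection above is simple arithmetic, reproduced here as a check. The figures come from the case study; the sketch ignores platform fees and payment processing costs, which would lower the Patreon amount.

```python
PKR_PER_USD = 280  # conversion rate assumed in this lesson

current_monthly_pkr = 120_000 + 80_000        # AdSense + brand sponsorships
patreon_price_pkr = 5 * PKR_PER_USD           # USD 5/month tier
patreon_target_pkr = 300 * patreon_price_pkr  # 300 paying subscribers, before fees
projected_total_pkr = current_monthly_pkr + patreon_target_pkr

print(f"Patreon target:  PKR {patreon_target_pkr:,}/month")
print(f"Projected total: PKR {projected_total_pkr:,}/month")
```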
Key Takeaways
- The 5-layer production pipeline is universal — every faceless video requires script, voiceover, visuals, assembly, and distribution, in that order
- Free tools are sufficient for testing but ElevenLabs (PKR 3,080/month) is the single highest-leverage upgrade you can make
- At Tier 1 budget (PKR 3,080/month), you produce 8.5/10 quality videos — professional enough to monetize and grow
- Production time drops from 112 minutes on video 1 to 40 minutes by video 30 as templates and muscle memory compound
- CapCut's Auto Captions feature saves 30 minutes per video and increases audience retention (40% of viewers watch without sound)
- Imagen 4.0 and Runway Gen-3 enable custom visuals when stock footage cannot match your specific narrative
- Pakistan-specific nuance: Urdu voiceover extends local viewer watch duration by approximately 2x compared to English-only content
- TubeBuddy's A/B thumbnail testing is worth USD 10/month after your first 10,000 subscribers — before that, focus on production volume
- The fastest learners time their production sessions and systematically eliminate their single biggest bottleneck each week
- Outsourcing the bottleneck step at PKR 300–500 per video is cost-effective once your channel earns PKR 20,000+/month