
1.2 AI Video Tools Landscape: Veo vs Runway vs HeyGen



Creating professional faceless videos used to require a team — scriptwriters, voiceover artists, video editors, and animators charging combined rates of PKR 50,000–200,000 per video. Today, one person with a laptop and a modest internet connection can produce studio-quality videos in under 2 hours. The toolkit revolution driving Pakistan's creator economy explosion did not happen gradually — it happened in a single 18-month window between 2024 and 2026. The tools that once cost USD 2,000/month are now free or under USD 100/month. This lesson gives you the complete production stack, tool-by-tool setup instructions, and a benchmarked workflow to get your first video from blank page to uploaded within your first session.

The 5-Layer Production Pipeline

Every faceless video moves through exactly five layers. Understand each layer, know your tool options at each layer, and you will never feel stuck in production.

code
AI VIDEO PRODUCTION PIPELINE (2026)
════════════════════════════════════════════════════════════════

  INPUT: IDEA / NICHE TOPIC
         │
         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  LAYER 1: SCRIPT GENERATION                              │
  │  ├── Gemini 2.5 Flash (free, 15 req/day)                 │
  │  ├── Gemini 2.5 Pro (free tier via AI Studio)            │
  │  ├── ChatGPT-4o (USD 20/month)                           │
  │  └── Claude Sonnet (USD 20/month — best long-form)       │
  └──────────────────────────────────────────────────────────┘
         │
         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  LAYER 2: VOICEOVER SYNTHESIS                            │
  │  ├── ElevenLabs (free: 10k char/month; paid: USD 11/mo)  │
  │  ├── Google Cloud TTS (free: 4M char/month)              │
  │  ├── Descript (USD 24/month — captions included)         │
  │  └── Murf.ai (USD 19/month — 60+ voices)                 │
  └──────────────────────────────────────────────────────────┘
         │
         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  LAYER 3: VISUAL GENERATION                              │
  │  ├── Pexels / Pixabay / Unsplash (free stock)            │
  │  ├── Imagen 4.0 Ultra (via Google AI Studio, free tier)  │
  │  ├── Runway Gen-3 (USD 12/month — text-to-video)         │
  │  └── CapCut template library (free — 50k+ backgrounds)   │
  └──────────────────────────────────────────────────────────┘
         │
         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  LAYER 4: VIDEO ASSEMBLY                                 │
  │  ├── CapCut (free, no watermark, best for beginners)     │
  │  ├── DaVinci Resolve (free, industry-standard grading)   │
  │  ├── HeyGen (USD 29/month — AI avatar auto-animation)    │
  │  └── Adobe Premiere Pro (USD 55/month — professional)    │
  └──────────────────────────────────────────────────────────┘
         │
         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  LAYER 5: DISTRIBUTION & OPTIMIZATION                    │
  │  ├── YouTube Studio (free — primary platform)            │
  │  ├── TubeBuddy (USD 10/month — SEO + A/B thumbnails)     │
  │  ├── Buffer (USD 5/month — schedule TikTok + Instagram)  │
  │  └── vidIQ (free tier — keyword research + analytics)    │
  └──────────────────────────────────────────────────────────┘
         │
         ▼
  OUTPUT: UPLOADED VIDEO → ALGORITHM DISTRIBUTION

════════════════════════════════════════════════════════════════

Tool Comparison: Free vs. Paid (PKR Pricing)

All USD costs converted at PKR 280/USD (2026 rate):

| Layer | Free Option | Quality | Paid Option | Cost/month | Cost (PKR) | Quality Gain |
|---|---|---|---|---|---|---|
| Script | Gemini 2.5 Flash | 7/10 | ChatGPT-4o | USD 20 | PKR 5,600 | Brand memory, custom tone |
| Script | Google Gemini | 7/10 | Claude Sonnet | USD 20 | PKR 5,600 | Long-form coherence |
| Voiceover | Google TTS | 5/10 | ElevenLabs Starter | USD 11 | PKR 3,080 | Human-quality voice |
| Voiceover | ElevenLabs free | 8/10 | ElevenLabs Creator | USD 22 | PKR 6,160 | 100k more characters |
| Visuals | Pexels / Pixabay | 6/10 | Runway Gen-3 | USD 12 | PKR 3,360 | Custom AI video clips |
| Visuals | CapCut templates | 6/10 | Imagen 4.0 Ultra | Free tier | PKR 0 | Photorealistic images |
| Assembly | CapCut free | 8/10 | DaVinci Resolve | Free | PKR 0 | Color grading upgrade |
| Assembly | DaVinci Resolve | 8/10 | Adobe Premiere | USD 55 | PKR 15,400 | Advanced effects only |
| Distribution | YouTube Studio | 9/10 | TubeBuddy Pro | USD 10 | PKR 2,800 | Keyword + thumbnail A/B |
| Distribution | vidIQ free | 7/10 | Buffer | USD 5 | PKR 1,400 | Multi-platform scheduling |

Bottom line: For PKR 3,080/month (ElevenLabs Starter only), you upgrade from 6/10 amateur quality to 8.5/10 professional quality. That single upgrade pays for itself with 10,000 additional views on your first video.
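
The PKR column is a single multiplication at the stated PKR 280/USD rate, so it is easy to recompute if the exchange rate moves. The sketch below is illustrative only: the function names are made up for this example, and the prices are copied from the comparison table.

```python
# Exchange rate assumed by the lesson; change it to re-price the whole stack.
PKR_PER_USD = 280

# Monthly USD prices copied from the comparison table.
TOOL_PRICES_USD = {
    "ElevenLabs Starter": 11,
    "ElevenLabs Creator": 22,
    "Runway Gen-3": 12,
    "TubeBuddy Pro": 10,
    "Buffer": 5,
    "Adobe Premiere": 55,
}


def to_pkr(usd, rate=PKR_PER_USD):
    """Convert a monthly USD price to PKR at the given rate."""
    return usd * rate


def stack_cost_pkr(tool_names, rate=PKR_PER_USD):
    """Total monthly PKR cost for a list of tools from TOOL_PRICES_USD."""
    return sum(to_pkr(TOOL_PRICES_USD[name], rate) for name in tool_names)


for name, usd in TOOL_PRICES_USD.items():
    print(f"{name}: USD {usd} = PKR {to_pkr(usd):,}")
```

Passing a different `rate` (for example `to_pkr(11, rate=300)`) shows how sensitive a budget is to currency swings.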

Per-Tool Setup Guide

Layer 1: Script Writing with Gemini 2.5 Flash

Setup (2 minutes):

  1. Go to aistudio.google.com — sign in with Google account
  2. Select model: Gemini 2.5 Flash
  3. Set temperature to 0.8 (creative but controlled)
  4. Bookmark the prompt you use most often

Master prompt template for faceless YouTube:

code
You are a YouTube scriptwriter for a faceless educational channel
targeting Pakistani audiences aged 18–35.

Video topic: [YOUR TOPIC]
Video length: [X] minutes
Tone: [motivational / educational / news-style]
Language: [English / Urdu / mixed]

Script requirements:
- Hook in the first 15 seconds (question or shocking stat)
- Clear structure: Hook → Problem → Solution → Takeaways → CTA
- Timestamps every 90 seconds
- Include 3 actionable takeaways viewers can implement today
- End with a specific call to action (subscribe + comment prompt)
- Word count: [LENGTH × 165 words] (for conversational 165 WPM pacing)

Add [PAUSE] markers where the voiceover should breathe between sections.
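
If you reuse the template often, filling the bracketed fields by hand gets tedious. The following is a minimal sketch of doing it in Python; it only builds the prompt text and does not call any AI service. The function names and the sample topic are hypothetical, and the template is a condensed copy of the one above.

```python
WORDS_PER_MINUTE = 165  # conversational pacing figure from the template

# Condensed copy of the lesson's template; add the remaining requirement
# lines from the full template if you use this for real prompts.
PROMPT_TEMPLATE = """You are a YouTube scriptwriter for a faceless educational channel
targeting Pakistani audiences aged 18-35.

Video topic: {topic}
Video length: {minutes} minutes
Tone: {tone}
Language: {language}

Script requirements:
- Hook in the first 15 seconds (question or shocking stat)
- Clear structure: Hook -> Problem -> Solution -> Takeaways -> CTA
- Word count: {word_count} (for conversational {wpm} WPM pacing)

Add [PAUSE] markers where the voiceover should breathe between sections."""


def target_word_count(minutes, wpm=WORDS_PER_MINUTE):
    """Word budget for a script of the given length."""
    return minutes * wpm


def build_prompt(topic, minutes, tone, language):
    """Fill the template placeholders with concrete values."""
    return PROMPT_TEMPLATE.format(
        topic=topic,
        minutes=minutes,
        tone=tone,
        language=language,
        word_count=target_word_count(minutes),
        wpm=WORDS_PER_MINUTE,
    )


# Hypothetical example topic; paste the printed text into AI Studio.
print(build_prompt("How freelancers receive payments in Pakistan", 6, "educational", "Urdu"))
```

For a 6-minute video the word budget works out to 6 × 165 = 990 words.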

Layer 2: ElevenLabs Voiceover

Setup (5 minutes):

  1. Go to elevenlabs.io — sign up free (no credit card needed)
  2. Navigate to "Text to Speech" in left sidebar
  3. Browse Voice Library — filter by language: Urdu or English
  4. Select a voice — click "Preview" to test with a sample sentence
  5. For Pakistani content: try "Aditi" (Urdu warm female) or "Brian" (English authoritative male)
  6. Paste your script — click Generate
  7. Download MP3

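Free tier limit: 10,000 characters/month. A 6-minute script at 165 WPM runs about 990 words, or roughly 6,000 characters, so the free tier covers between one and two full videos. Upgrade to Starter (USD 11/month) for 30,000 characters, about 5 full videos monthly.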
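
How many videos a character quota covers can be estimated with a short calculation. This sketch assumes about 6 characters per English word (five letters plus a space), which is a rough rule of thumb and not an ElevenLabs figure; Urdu script will count differently. The function names are illustrative.

```python
WORDS_PER_MINUTE = 165  # pacing from the script template
CHARS_PER_WORD = 6      # assumption: ~5 letters plus one space per English word


def script_characters(minutes, wpm=WORDS_PER_MINUTE, chars_per_word=CHARS_PER_WORD):
    """Estimated character count of a script for a video of this length."""
    return minutes * wpm * chars_per_word


def videos_covered(quota_chars, minutes):
    """How many videos of this length a monthly character quota covers."""
    return quota_chars / script_characters(minutes)


for plan, quota in [("Free", 10_000), ("Starter", 30_000)]:
    print(f"{plan}: {videos_covered(quota, 6):.1f} six-minute videos per month")
```

Under these assumptions a 6-minute script is about 5,940 characters, which gives roughly 1.7 videos on the free tier and 5.1 on Starter.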

Layer 3: Visuals with Pexels + Imagen 4.0

Pexels setup (2 minutes):

  1. Go to pexels.com — no account required for downloads
  2. Search by keyword matching your script sections (e.g., "cryptocurrency trading", "Pakistan Karachi city")
  3. Filter: Videos → HD → Free commercial license
  4. Download 10–15 clips per video — you will use 60–70% of them

Imagen 4.0 setup (5 minutes):

  1. Go to aistudio.google.com
  2. Select "Generate Images" → Model: Imagen 4.0
  3. Prompt format: "[Scene description], cinematic lighting, high resolution, photorealistic, [documentary/editorial/dramatic] style"
  4. Example: "Karachi skyline at golden hour with blockchain nodes overlaid, cinematic, photorealistic"
  5. Download and use as custom thumbnail or B-roll overlay

Layer 4: CapCut Assembly

Setup (3 minutes):

  1. Download CapCut from capcut.com (Windows/Mac/Mobile — all free, no watermark)
  2. Create new project → select 16:9 ratio for YouTube, 9:16 for Shorts/Reels
  3. Import your voiceover MP3 and all stock footage
  4. Drag voiceover to audio track — it auto-sets your video length
  5. Add footage above audio: each clip should be 3–8 seconds to maintain energy
  6. Enable "Auto Captions" (CapCut reads your audio and adds animated subtitles in 1 click)
  7. Export: 1080p, 30fps, H.264 codec

CapCut's killer feature for faceless channels: The "Auto Reframe" tool auto-crops 16:9 footage into 9:16 for Shorts — one video becomes two posts with one click.
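
The 3 to 8 second guideline from step 5 determines how many cuts a timeline needs. The sketch below works that out for a given voiceover length. It assumes, as one way to reconcile the numbers, that each downloaded clip is trimmed into several separate segments; the function names are hypothetical.

```python
import math


def cuts_needed(voiceover_seconds, min_clip=3, max_clip=8):
    """Return (fewest, most) cuts needed to cover the voiceover."""
    fewest = math.ceil(voiceover_seconds / max_clip)
    most = math.ceil(voiceover_seconds / min_clip)
    return fewest, most


def segments_per_download(voiceover_seconds, downloaded_clips, max_clip=8):
    """Minimum segments to trim from each downloaded clip to fill the timeline."""
    fewest, _ = cuts_needed(voiceover_seconds, max_clip=max_clip)
    return math.ceil(fewest / downloaded_clips)


fewest, most = cuts_needed(6 * 60)
print(f"6-minute video: {fewest} to {most} cuts")
print(f"Segments per clip with 15 downloads: {segments_per_download(6 * 60, 15)}")
```

A 6-minute voiceover needs 45 to 120 cuts, so 15 downloaded clips would each supply at least 3 segments, or static AI images and templates would fill the remainder.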

Timing Benchmark: Script to Upload

Professional workflow for a 6-minute faceless video:

code
PRODUCTION TIMING BREAKDOWN
═══════════════════════════════════════════════════

  TASK                              TIME      TOOL
  ────────────────────────────────────────────────────────────
  Generate script                   5 min     Gemini 2.5 Flash
  Edit script (pacing/Urdu-isms)    10 min    Manual review
  Generate voiceover                2 min     ElevenLabs
  Download 15 stock footage clips   25 min    Pexels / Pixabay
  Generate 2 custom AI images       5 min     Imagen 4.0
  Assemble in CapCut                35 min    CapCut
  Add auto-captions + review        8 min     CapCut
  Create thumbnail in Canva         7 min     Canva free
  Export video                      5 min     CapCut
  Upload to YouTube + metadata      10 min    YouTube Studio
  ────────────────────────────────────────────────────────────
  TOTAL (Video 1)                   112 min   ~1h 52min

  TOTAL (Video 10, with templates)  55 min    ~55 min
  TOTAL (Video 30, full system)     40 min    ~40 min

═══════════════════════════════════════════════════

The efficiency gains compound. By video 10, you have reusable thumbnail templates, a folder of unused stock footage, and muscle memory for CapCut shortcuts. By video 30, the entire process is reflex-level.
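
The benchmark is easier to act on when you can total it and see which step dominates. This minimal sketch uses the minutes from the breakdown for video 1; swap in your own measured times. The helper names are illustrative.

```python
# Minutes per task for video 1, copied from the timing breakdown.
TASK_MINUTES = {
    "Generate script": 5,
    "Edit script": 10,
    "Generate voiceover": 2,
    "Download stock footage": 25,
    "Generate AI images": 5,
    "Assemble in CapCut": 35,
    "Auto-captions + review": 8,
    "Create thumbnail": 7,
    "Export video": 5,
    "Upload + metadata": 10,
}


def total_minutes(tasks):
    """Sum of all task durations."""
    return sum(tasks.values())


def bottleneck(tasks):
    """Name of the single longest task."""
    return max(tasks, key=tasks.get)


def format_duration(minutes):
    """Render minutes as 'Xh YYmin'."""
    hours, rest = divmod(minutes, 60)
    return f"{hours}h {rest:02d}min"


total = total_minutes(TASK_MINUTES)
slowest = bottleneck(TASK_MINUTES)
share = TASK_MINUTES[slowest] / total
print(f"Total: {total} min ({format_duration(total)})")
print(f"Bottleneck: {slowest} ({share:.0%} of the session)")
```

With the lesson's figures, assembly and footage hunting together account for 60 of the 112 minutes, which is where templates and a reusable footage folder save the most time.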

Budget Tiers: What You Actually Need to Spend

Tier 0: Zero Budget (Start Here)

| Tool | Cost | Use |
|---|---|---|
| Gemini 2.5 Flash | Free | Script generation |
| Google TTS | Free | Voiceover (robotic quality) |
| Pexels / Pixabay | Free | Stock footage |
| CapCut | Free | Video editing + captions |
| YouTube Studio | Free | Upload + analytics |
| Total | PKR 0 | Quality: 6/10 |

Limitation: Google TTS sounds noticeably robotic. Viewers notice. Watch time drops 15–20% vs. ElevenLabs voice. Use Tier 0 only for your first 1–2 test videos.

Tier 1: Minimum Viable Professional (Recommended Start)

| Tool | USD/month | PKR/month | Use |
|---|---|---|---|
| ElevenLabs Starter | USD 11 | PKR 3,080 | Human-quality voiceover |
| All other tools | Free | PKR 0 | Script, visuals, editing |
| Total | USD 11 | PKR 3,080 | Quality: 8.5/10 |

This is the sweet spot. One tool upgrade transforms your videos from amateur to professional. ElevenLabs alone can increase your average view duration by 20–30% vs. robotic TTS.

Tier 2: Full Professional Stack

| Tool | USD/month | PKR/month | Use |
|---|---|---|---|
| ElevenLabs Creator | USD 22 | PKR 6,160 | High-volume voiceover |
| Runway Gen-3 | USD 12 | PKR 3,360 | Custom AI video clips |
| TubeBuddy Pro | USD 10 | PKR 2,800 | YouTube SEO optimization |
| Buffer | USD 5 | PKR 1,400 | Multi-platform scheduling |
| Total | USD 49 | PKR 13,720 | Quality: 9.5/10 |

ROI break-even: 25,000 views/month (PKR 15,000–25,000 ad revenue). At 30 videos/month with consistent uploads, most creators hit this by month 2.
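
The break-even figure can be checked against the revenue range quoted with it. The sketch below takes the lesson's reference point (25,000 views earning PKR 15,000 to 25,000) as the only revenue assumption and scales it to each tier's monthly cost. Actual earnings per view vary by niche and audience, and the function name is illustrative.

```python
import math

# Reference point from the lesson: 25,000 views earn PKR 15,000 to 25,000.
REF_VIEWS = 25_000
REF_REVENUE_LOW_PKR = 15_000
REF_REVENUE_HIGH_PKR = 25_000

TIER_COST_PKR = {"Tier 1": 3_080, "Tier 2": 13_720}


def breakeven_views(monthly_cost_pkr, ref_revenue_pkr, ref_views=REF_VIEWS):
    """Views needed for ad revenue to cover the monthly tool cost."""
    return math.ceil(monthly_cost_pkr * ref_views / ref_revenue_pkr)


for tier, cost in TIER_COST_PKR.items():
    best = breakeven_views(cost, REF_REVENUE_HIGH_PKR)
    worst = breakeven_views(cost, REF_REVENUE_LOW_PKR)
    print(f"{tier}: {best:,} to {worst:,} views/month to break even")
```

Under this assumption Tier 2 breaks even between 13,720 and 22,867 views per month, so the 25,000-view target leaves a small margin even at the low end of the revenue range.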

Practice Lab


Task 1: Stack Setup. Create your personal AI toolkit today. Complete these steps in order: (1) Google AI Studio at aistudio.google.com: test with one script generation; (2) ElevenLabs free account at elevenlabs.io: generate one 100-word voiceover sample; (3) CapCut download at capcut.com: create a blank 6-second test project; (4) Pexels at pexels.com (no account required): download 5 stock footage clips matching your niche. Final test: combine everything into a 30-second clip. No polishing required; just confirm the pipeline works end to end.

Task 2: Speed Benchmark. Time yourself creating one complete video from script to upload. Start your timer when you open Gemini. Stop when you click "Publish" on YouTube. Record where you lose the most time: script editing, footage hunting, or CapCut assembly. This is your personal bottleneck. Every future efficiency gain comes from attacking that specific bottleneck with either better tools or outsourcing.

Task 3: Cost-to-Revenue Calculation. Using your niche's estimated CPM from the table earlier, calculate: (a) how many monthly views you need at Tier 1 budget (PKR 3,080/month) to break even, (b) how many videos per month at your target length that requires, and (c) how long at your current production speed before you hit break-even views. This math tells you exactly whether to start free or invest in Tier 1 immediately.
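
Task 3 can be worked through with a small calculator. This is a sketch under stated assumptions: it treats CPM as your earnings per 1,000 views (a simplification), and it reads part (c) as the monthly production hours needed to publish the break-even number of videos. The sample CPM of PKR 700 and the 500 views per video are placeholders, not figures from the lesson.

```python
import math


def breakeven_plan(monthly_cost_pkr, cpm_pkr, views_per_video, minutes_per_video):
    """Return (views, videos, production_hours) needed to cover the monthly cost.

    cpm_pkr is treated here as earnings per 1,000 views, a simplification.
    """
    views = math.ceil(monthly_cost_pkr * 1000 / cpm_pkr)   # part (a)
    videos = math.ceil(views / views_per_video)            # part (b)
    hours = videos * minutes_per_video / 60                # part (c)
    return views, videos, hours


# Placeholder inputs: replace with your niche CPM and your own channel numbers.
views, videos, hours = breakeven_plan(
    monthly_cost_pkr=3_080,   # Tier 1 budget
    cpm_pkr=700,              # hypothetical earnings per 1,000 views
    views_per_video=500,      # hypothetical average for a new channel
    minutes_per_video=112,    # video 1 benchmark from this lesson
)
print(f"(a) {views:,} views  (b) {videos} videos  (c) {hours:.1f} production hours")
```

With these placeholder inputs the plan is 4,400 views, 9 videos, and 16.8 hours of production per month.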

Pakistan Case Study: "Tech Pakistan Daily"

Fatima Zaidi, a former news anchor from Islamabad, lost her television contract in late 2024 when her station cut 40% of its staff to manage rising production costs. She had 12 years of journalism experience, a sharp analytical mind, and zero video editing skills. "Main ne socha tha yeh sab mere liye nahin hai" ("I thought all of this wasn't for me"), she said in a later interview — meaning the technical side of YouTube felt impenetrable.

She spent one weekend learning the 5-layer stack. Her insight was borrowed directly from her journalism career: treat YouTube like a news wire. Post fast, post consistently, prioritize information density over production polish.

Her stack at launch:

  • Script: ChatGPT-4o (USD 20/month, PKR 5,600) — she valued brand voice memory
  • Voiceover: ElevenLabs Starter (USD 11/month, PKR 3,080) — Aditi voice, Urdu
  • Visuals: Pexels + CapCut templates (free)
  • Editing: CapCut mobile (free — she edited on her phone during commutes)
  • Optimization: TubeBuddy Pro (USD 10/month, PKR 2,800)
  • Total spend: USD 41/month = PKR 11,480/month

Channel concept: "Tech Pakistan Daily" — 90-second Urdu summaries of the day's biggest tech news. One story per video. Posted at 8:00 AM every weekday so subscribers could watch during morning tea.

Results after 3 months:

  • Subscribers: 120,000
  • Total views: 8 million
  • YouTube AdSense: PKR 120,000/month
  • Brand sponsorships (3 Pakistani tech brands): PKR 80,000/month
  • Total: PKR 200,000/month on PKR 11,480/month investment

Her competitive edge was counterintuitive: she deliberately kept production quality at 7/10 to maintain a 2-video-per-day pace. Bigger channels uploaded weekly and over-produced. She uploaded daily and under-produced — but outranked them on recency and volume.

Month 6 move: Patreon subscription (USD 5/month, PKR 1,400/month) offering exclusive 5-minute deep-dive versions of the day's story. Target: 300 subscribers = PKR 420,000/month recurring. Total projected revenue: PKR 600,000+ monthly.

Key Takeaways

  • The 5-layer production pipeline is universal — every faceless video requires script, voiceover, visuals, assembly, and distribution, in that order
  • Free tools are sufficient for testing but ElevenLabs (PKR 3,080/month) is the single highest-leverage upgrade you can make
  • At Tier 1 budget (PKR 3,080/month), you produce 8.5/10 quality videos — professional enough to monetize and grow
  • Production time drops from 112 minutes on video 1 to 40 minutes by video 30 as templates and muscle memory compound
  • CapCut's Auto Captions feature saves 30 minutes per video and increases audience retention (40% of viewers watch without sound)
  • Imagen 4.0 and Runway Gen-3 enable custom visuals when stock footage cannot match your specific narrative
  • Pakistan-specific nuance: Urdu voiceover extends local viewer watch duration by approximately 2x compared to English-only content
  • TubeBuddy's A/B thumbnail testing is worth USD 10/month after your first 10,000 subscribers — before that, focus on production volume
  • The fastest learners time their production sessions and systematically eliminate their single biggest bottleneck each week
  • Outsourcing the bottleneck step at PKR 300–500 per video is cost-effective once your channel earns PKR 20,000+/month


AI Video Tools Landscape Quiz

4 questions to test your understanding. Score 60% or higher to pass.