Module 8: AI Infrastructure & Local LLMs

8.1 Cloud Cost Analysis: AWS vs. GCP vs. Local for Pakistan

Running AI workloads in the cloud is easy. Running them without going bankrupt is hard. A single A100 GPU in the cloud costs roughly $3.50/hour on demand. Left running 24/7 (720 hours a month), that is $2,520/month, or more than PKR 700,000 at about PKR 280 to the dollar. For a Pakistani startup, that's an entire team's salary. This lesson breaks down the real costs of running AI infrastructure on AWS, GCP, dedicated servers, and local hardware, with PKR calculations specific to Pakistan.

The Three Options

code
┌──────────────────────────────────────────────────────────┐
│  OPTION 1: Cloud (AWS/GCP/Azure)                         │
│  ├── No upfront cost, pay per hour                       │
│  ├── Scale instantly                                     │
│  ├── Expensive at sustained usage                        │
│  └── Best for: variable traffic, enterprise clients      │
├──────────────────────────────────────────────────────────┤
│  OPTION 2: Dedicated GPU Server (Hetzner/OVH/Vultr)      │
│  ├── Fixed monthly cost                                  │
│  ├── 60-80% cheaper than cloud at sustained usage        │
│  ├── Limited scaling                                     │
│  └── Best for: steady workloads, bootstrapped startups   │
├──────────────────────────────────────────────────────────┤
│  OPTION 3: Local Hardware (Home/Office Server)           │
│  ├── High upfront cost, low ongoing                      │
│  ├── No recurring GPU fees                               │
│  ├── Pakistan power/cooling considerations               │
│  └── Best for: development, low-traffic production       │
└──────────────────────────────────────────────────────────┘

Cloud Cost Comparison (GPU Instances)

AWS EC2 GPU Instances

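| Instance | GPU | VRAM | On-Demand/hr | Monthly (24/7) | Monthly (PKR) |
|---|---|---|---|---|---|
| g4dn.xlarge | T4 | 16 GB | $0.526 | $379 | PKR 106,000 |
| g5.xlarge | A10G | 24 GB | $1.006 | $724 | PKR 203,000 |
| p4d.24xlarge | 8x A100 | 320 GB | $32.77 | $23,594 | PKR 6.6M |
| g6.xlarge | L4 | 24 GB | $0.805 | $580 | PKR 162,000 |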

GCP Compute Engine

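| Instance | GPU | VRAM | On-Demand/hr | Monthly (24/7) | Monthly (PKR) |
|---|---|---|---|---|---|
| n1 + T4 | T4 | 16 GB | $0.35 | $252 | PKR 70,500 |
| g2-standard-4 | L4 | 24 GB | $0.74 | $533 | PKR 149,000 |
| a2-highgpu-1g | A100 40GB | 40 GB | $3.67 | $2,642 | PKR 740,000 |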
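
The monthly columns in both tables follow directly from the hourly rate. The sketch below reproduces that arithmetic under two assumptions that are not stated in the lesson: a 720-hour month (24 × 30) and an exchange rate of PKR 280 per USD, which is the rate the PKR columns appear to imply. Update both constants before relying on the output.

```python
HOURS_PER_MONTH = 24 * 30   # assumption: 30-day month, instance never stopped
PKR_PER_USD = 280           # assumption: rate implied by the tables above


def monthly_cost(hourly_usd: float) -> tuple[float, float]:
    """Return (USD/month, PKR/month) for an hourly-billed instance running 24/7."""
    usd = hourly_usd * HOURS_PER_MONTH
    return usd, usd * PKR_PER_USD


# Hourly on-demand rates copied from the tables above.
instances = {
    "AWS g4dn.xlarge (T4)": 0.526,
    "AWS g5.xlarge (A10G)": 1.006,
    "GCP n1 + T4": 0.35,
    "GCP g2-standard-4 (L4)": 0.74,
}

for name, rate in instances.items():
    usd, pkr = monthly_cost(rate)
    print(f"{name:<24} ${usd:>8,.0f}/mo   PKR {pkr:>10,.0f}/mo")
```

Stopping an instance outside working hours shrinks the hours term, which is often the quickest saving available on hourly billing.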

Dedicated GPU Servers (Fixed Monthly)

| Provider | GPU | VRAM | Monthly | Monthly (PKR) |
|---|---|---|---|---|
| Hetzner (GEX44) | RTX 4090 | 24 GB | €130 | PKR 40,000 |
| OVH (GPU1) | A16 | 16 GB | €120 | PKR 37,000 |
| Vultr | A100 | 80 GB | $3,274 | PKR 916,000 |
| Lambda Labs | A100 | 80 GB | $1.10/hr (~$792/mo) | PKR 222,000 |
| Vast.ai | RTX 4090 | 24 GB | $0.25/hr (~$180/mo) | PKR 50,000 |
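
This table mixes euro and dollar prices, and monthly and hourly billing, so a like-for-like comparison needs one normalisation step. A minimal sketch, assuming 720 hours per month and exchange rates of PKR 280/USD and PKR 308/EUR (the rates the PKR column appears to use, not live rates). The `Offer` class is a hypothetical helper introduced only for this example.

```python
from dataclasses import dataclass

# Assumptions: rates implied by the table above, not live exchange rates.
PKR_PER = {"USD": 280.0, "EUR": 308.0}
HOURS_PER_MONTH = 720  # 24 h x 30 days


@dataclass
class Offer:
    provider: str
    price: float
    currency: str   # "USD" or "EUR"
    billing: str    # "monthly" or "hourly"

    def pkr_per_month(self) -> float:
        """Convert the listed price to PKR per month of 24/7 use."""
        monthly = self.price * HOURS_PER_MONTH if self.billing == "hourly" else self.price
        return monthly * PKR_PER[self.currency]


offers = [
    Offer("Hetzner", 130, "EUR", "monthly"),
    Offer("OVH", 120, "EUR", "monthly"),
    Offer("Lambda Labs", 1.10, "USD", "hourly"),
    Offer("Vast.ai", 0.25, "USD", "hourly"),
]

# Cheapest first.
for offer in sorted(offers, key=Offer.pkr_per_month):
    print(f"{offer.provider:<12} PKR {offer.pkr_per_month():>9,.0f}/month")
```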

Cost Summary Table

For running a 7B model 24/7 (needs ~16GB VRAM):

| Option | Monthly Cost | PKR/month | Notes |
|---|---|---|---|
| AWS T4 On-Demand | $379 | 106,000 | Most expensive |
| GCP T4 On-Demand | $252 | 70,500 | 33% cheaper than AWS |
| GCP T4 Committed | $151 | 42,300 | 1-year commitment |
| Hetzner RTX 4090 | €130 | 40,000 | Best value for sustained |
| Vast.ai RTX 4090 | ~$180 | 50,000 | Community GPUs |
| Local RTX 4060 Ti | $0 (already owned) | ~5,000 (electricity) | Cheapest long-term |
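
To see how far each option sits below the AWS baseline, the percentage can be computed rather than estimated. A small sketch using only the dollar-priced rows of the summary table. The euro-priced and local rows are left out to avoid assuming an exchange rate.

```python
# Monthly USD costs copied from the summary table above.
monthly_usd = {
    "AWS T4 On-Demand": 379,
    "GCP T4 On-Demand": 252,
    "GCP T4 Committed": 151,
    "Vast.ai RTX 4090": 180,
}


def percent_cheaper(cost: float, baseline: float) -> float:
    """Percentage saved by paying `cost` instead of `baseline`."""
    return (baseline - cost) / baseline * 100


baseline = monthly_usd["AWS T4 On-Demand"]
for option, cost in sorted(monthly_usd.items(), key=lambda kv: kv[1]):
    saving = percent_cheaper(cost, baseline)
    print(f"{option:<18} ${cost:>4}/mo  {saving:5.1f}% below AWS on-demand")
```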

Pakistan-Specific Considerations

Electricity Costs

| Setup | Power Draw | Monthly kWh | Cost (PKR 35/kWh) |
|---|---|---|---|
| RTX 4060 Ti server | 200W avg | 144 kWh | PKR 5,040 |
| RTX 4090 server | 350W avg | 252 kWh | PKR 8,820 |
| Dual GPU workstation | 600W avg | 432 kWh | PKR 15,120 |

Note: Pakistan electricity rates vary by slab. Industrial rates can be PKR 25-45/kWh depending on your tariff category. UPS/generator backup adds PKR 5,000-15,000/month.
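
Because the tariff varies, it helps to recompute the table for your own rate. A minimal sketch, assuming continuous 24/7 operation over a 30-day month. The 25 and 45 values are the two ends of the range quoted in the note, and 35 is the table's rate.

```python
def monthly_electricity_pkr(avg_watts: float, pkr_per_kwh: float = 35.0,
                            hours: float = 24 * 30) -> tuple[float, float]:
    """Return (kWh per month, PKR per month) for a machine drawing avg_watts."""
    kwh = avg_watts / 1000 * hours
    return kwh, kwh * pkr_per_kwh


for label, watts in [("RTX 4060 Ti server", 200), ("RTX 4090 server", 350)]:
    for tariff in (25, 35, 45):   # low end, table rate, high end
        kwh, cost = monthly_electricity_pkr(watts, tariff)
        print(f"{label:<20} {kwh:5.0f} kWh @ PKR {tariff}/kWh = PKR {cost:,.0f}")
```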

Internet Bandwidth

Cloud APIs need reliable internet. Pakistan considerations:

  • PTCL Fiber: 100 Mbps for PKR 5,000-8,000/month — adequate for most API traffic
  • StormFiber/Nayatel: More reliable, PKR 4,000-7,000/month
  • Latency to cloud: Pakistan → AWS Mumbai (ap-south-1): 50-80ms. Pakistan → AWS Frankfurt: 120-180ms
  • Recommendation: Use Mumbai/Singapore regions for lowest latency from Pakistan
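
The latency figures above are worth checking from your own connection, since they depend on the ISP and its routing. One rough way is to time a TCP handshake to an endpoint in each region. The sketch below is an approximation, not a benchmark: handshake time is only a proxy for API latency, the first attempt includes DNS lookup, and the hostnames are example regional EC2 API endpoints that you should replace with the services you actually plan to call.

```python
import socket
import time

# Example endpoints (assumption: substitute the hosts you will really use).
ENDPOINTS = {
    "ap-south-1 (Mumbai)": "ec2.ap-south-1.amazonaws.com",
    "ap-southeast-1 (Singapore)": "ec2.ap-southeast-1.amazonaws.com",
    "eu-central-1 (Frankfurt)": "ec2.eu-central-1.amazonaws.com",
}


def tcp_connect_ms(host: str, port: int = 443, timeout: float = 3.0) -> float:
    """Time one TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000


def best_region(latencies_ms: dict[str, float]) -> str:
    """Pick the region with the lowest measured latency."""
    return min(latencies_ms, key=latencies_ms.get)


def measure_regions(endpoints: dict[str, str], attempts: int = 3) -> dict[str, float]:
    """Best-of-N handshake time per region; unreachable regions are skipped."""
    results = {}
    for region, host in endpoints.items():
        try:
            results[region] = min(tcp_connect_ms(host) for _ in range(attempts))
        except OSError as exc:   # covers DNS failures and timeouts
            print(f"{region}: unreachable ({exc})")
    return results
```

Running `best_region(measure_regions(ENDPOINTS))` at different times of day gives a more honest picture than a single measurement.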

UPS and Power Backup

For local servers in Pakistan, power outages are real:

  • Online UPS (1.5 kVA): PKR 40,000-60,000 one-time; protects against brief outages
  • Generator backup: PKR 15,000-30,000/month in fuel where outages are frequent
  • Solar + battery: PKR 300,000-500,000 upfront, with near-zero running cost afterwards
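
Whether solar is worth the upfront cost depends on how much generator fuel it replaces. A rough payback sketch using the ranges listed above. It assumes solar running costs are zero and ignores the purchase price of the generator itself, so treat the result as an optimistic bound.

```python
def months_to_recoup(upfront_pkr: float, monthly_saving_pkr: float) -> float:
    """Months until an upfront purchase is paid back by a monthly saving."""
    if monthly_saving_pkr <= 0:
        raise ValueError("monthly saving must be positive")
    return upfront_pkr / monthly_saving_pkr


# Ranges from the list above (PKR).
solar_upfront = (300_000, 500_000)
generator_fuel = (15_000, 30_000)   # per month

best = months_to_recoup(solar_upfront[0], generator_fuel[1])    # cheap solar, heavy fuel use
worst = months_to_recoup(solar_upfront[1], generator_fuel[0])   # pricey solar, light fuel use
print(f"Solar pays for itself in {best:.0f} to {worst:.0f} months")
```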

The Break-Even Analysis

When Does Local Beat Cloud?

code
Monthly cloud cost: PKR 70,000 (GCP T4)

Local server cost:
├── RTX 4060 Ti: PKR 120,000 (one-time)
├── Server hardware: PKR 100,000 (one-time)
├── UPS: PKR 50,000 (one-time)
├── Total upfront: PKR 270,000
├── Monthly running: PKR 8,000 (electricity + internet)
└── Break-even: 270,000 / (70,000 - 8,000) = 4.4 months

After 4.4 months, local saves PKR 62,000/month
After 1 year: PKR 62,000 × 7.6 months past break-even = PKR 471,200 saved
After 2 years: PKR 62,000 × 19.6 months past break-even = PKR 1,215,200 saved

But local has risks:

  • Hardware failure (no auto-replacement)
  • No auto-scaling (can't handle traffic spikes)
  • Pakistan power reliability
  • You're the sysadmin
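
The break-even arithmetic above can be rerun for your own quotes with a few lines of code. This sketch uses the same figures as the worked example. It does not price in the risks just listed (hardware replacement, downtime, your own time), so the real break-even point will be later. Its one-year and two-year figures differ slightly from the worked example because it does not round the break-even point to 4.4 months.

```python
def break_even_months(upfront: float, cloud_monthly: float, local_monthly: float) -> float:
    """Months until the monthly saving has repaid the upfront purchase."""
    saving = cloud_monthly - local_monthly
    if saving <= 0:
        return float("inf")   # local never catches up
    return upfront / saving


def net_savings(months: int, upfront: float, cloud_monthly: float, local_monthly: float) -> float:
    """Cumulative saving after `months`, net of the upfront purchase."""
    return months * (cloud_monthly - local_monthly) - upfront


# Figures from the worked example above (PKR).
upfront = 120_000 + 100_000 + 50_000   # GPU + server hardware + UPS
cloud, local = 70_000, 8_000

print(f"Break-even: {break_even_months(upfront, cloud, local):.1f} months")
for months in (6, 12, 24):
    print(f"After {months:>2} months: PKR {net_savings(months, upfront, cloud, local):,.0f}")
```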

The Hybrid Approach (Recommended)

code
Development: Local GPU (PKR 5,000/month)
    ↓
Staging/Testing: Vast.ai or spot instances (PKR 10,000/month)
    ↓
Production (steady): Hetzner dedicated (PKR 40,000/month)
    ↓
Traffic spikes: Cloud auto-scale (pay-per-use only during peaks)
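
The total for this stack is the fixed tiers plus whatever burst capacity is actually used. A minimal sketch: the tier prices come from the diagram, while the burst scenario (three days a month on one on-demand g5.xlarge at $1.006/hour, converted at PKR 280/USD) is a hypothetical chosen only to illustrate the calculation.

```python
def hybrid_monthly_pkr(baseline_pkr: dict[str, float], burst_hours: float,
                       burst_usd_per_hour: float, pkr_per_usd: float = 280.0) -> float:
    """Fixed tiers plus pay-per-use burst capacity, in PKR per month."""
    return sum(baseline_pkr.values()) + burst_hours * burst_usd_per_hour * pkr_per_usd


tiers = {   # PKR/month, from the diagram above
    "local dev": 5_000,
    "staging (Vast.ai/spot)": 10_000,
    "production (Hetzner)": 40_000,
}

# Hypothetical burst: 3 days a month on one on-demand g5.xlarge.
total = hybrid_monthly_pkr(tiers, burst_hours=3 * 24, burst_usd_per_hour=1.006)
print(f"Hybrid total: PKR {total:,.0f}/month")
```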

Decision Framework

| Factor | Choose Cloud | Choose Dedicated | Choose Local |
|---|---|---|---|
| Traffic pattern | Variable/spiky | Steady | Low/dev |
| Budget | $$$ available | $$ moderate | $ tight |
| Scale needs | Must auto-scale | Predictable | Fixed |
| Uptime requirement | 99.99% SLA | 99.9% | 99% okay |
| Team skills | DevOps team | Some Linux knowledge | Strong sysadmin |
| Data sensitivity | Standard | High (your servers) | Maximum (air-gapped) |
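
Three rows of this table (traffic pattern, scale needs, uptime requirement) can be turned into a first-pass rule of thumb. The function below is a deliberately crude sketch: the first-match ordering and the input names are assumptions made for this example, and budget, team skills, and data sensitivity still need human judgement.

```python
def recommend(traffic: str, needs_autoscale: bool, uptime_target: float) -> str:
    """Map three rows of the decision table to an option.

    traffic: "spiky", "steady", or "low"
    uptime_target: required availability in percent, e.g. 99.9
    """
    if needs_autoscale or traffic == "spiky" or uptime_target >= 99.99:
        return "cloud"
    if traffic == "steady" or uptime_target >= 99.9:
        return "dedicated"
    return "local"


print(recommend("steady", needs_autoscale=False, uptime_target=99.9))
print(recommend("low", needs_autoscale=False, uptime_target=99.0))
```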

Practice Lab

Task 1: Cost Calculator
Build a spreadsheet comparing your specific AI workload across AWS, GCP, Hetzner, and local hardware. Include: GPU cost, storage, bandwidth, and Pakistan-specific costs (electricity, UPS, internet).

Task 2: Break-Even Analysis
Calculate the break-even point for buying an RTX 4060 Ti server vs. renting a cloud GPU for your workload. Factor in electricity at your local rate and UPS costs.

Task 3: Hybrid Architecture Design
Design a hybrid infrastructure for a Pakistani AI startup: local for dev, dedicated for production, cloud for spikes. Draw the architecture and calculate total monthly cost.

Pakistan Case Study

Meet Bilal, founder of a Lahore NLP startup offering an Urdu-to-English translation API.

His cost evolution:

| Stage | Infrastructure | Monthly Cost |
|---|---|---|
| Month 1-3 | AWS p3.2xlarge (V100) | PKR 210,000 |
| Month 4-6 | Switched to GCP with committed use | PKR 126,000 |
| Month 7-9 | Moved to Hetzner dedicated (RTX 4090) | PKR 40,000 |
| Month 10+ | Hetzner + cloud burst for peak | PKR 55,000 avg |

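Total saved in Year 1: about PKR 1,227,000 vs. staying on AWS. Twelve months on AWS would have cost 12 × 210,000 = PKR 2,520,000; the actual spend was 3 × 210,000 + 3 × 126,000 + 3 × 40,000 + 3 × 55,000 = PKR 1,293,000.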

His decision process: "We ran on AWS for 3 months because that's what every tutorial said. Then I did the math — we were spending PKR 210,000/month for a workload that used 60% GPU on average. Hetzner gives us a better GPU for PKR 40,000. For the 3 days a month when traffic spikes, we burst to GCP spot instances."

Key Takeaways

  • AWS is the most expensive option for sustained AI workloads — use it for spikes, not baselines
  • GCP on-demand is about 33% cheaper than AWS on-demand for the same T4; a 1-year committed-use discount takes a further 40% off, roughly 60% below AWS on-demand
  • Hetzner/OVH dedicated servers are 60-80% cheaper than cloud for steady workloads
  • Local hardware breaks even in 4-5 months vs. cloud — but adds operational risk
  • Pakistan factors: electricity (PKR 35/kWh), UPS necessity, Mumbai region for lowest latency
  • The hybrid approach wins: local dev → dedicated production → cloud burst for spikes
  • Always do the math before committing — most Pakistani startups overspend on cloud

Next lesson: Spot instances, preemptible VMs, and budget strategies for GPU workloads.
