7.2 — Workflow Monitoring with Slack/Email Notifications
Building workflows is the easy part. Keeping them running reliably across 5, 10, 50 automations — that's the real challenge. You need visibility: which workflows ran, which failed, which are slow, which haven't run when they should have. This lesson teaches you to build a monitoring layer that gives you full visibility into your automation empire.
The Monitoring Stack
┌──────────────────────────────────────────────┐
│              MONITORING LAYERS               │
│                                              │
│  Layer 1: EXECUTION ALERTS                   │
│    → Know when individual workflows fail     │
│                                              │
│  Layer 2: HEARTBEAT MONITORING               │
│    → Know when scheduled workflows DIDN'T run│
│                                              │
│  Layer 3: PERFORMANCE TRACKING               │
│    → Know when workflows are slow or degraded│
│                                              │
│  Layer 4: DAILY DIGEST                       │
│    → Summary of all automation activity      │
│                                              │
│  Alert Channels: Slack → Email → WhatsApp    │
└──────────────────────────────────────────────┘
Layer 1: Execution Alerts
n8n Error Workflow (Built-In)
n8n has a built-in feature: the Error Workflow. Once assigned to a workflow, it runs automatically whenever that workflow fails, so a single handler can cover every production workflow you have.
SETUP:
1. Create a new workflow: "Error Handler — Master Alert"
2. Trigger: Error Trigger node
3. This node receives the following data (an example payload appears after these setup steps):
- $json.execution.id — the failed execution ID
- $json.execution.url — direct link to the execution
- $json.workflow.id — which workflow failed
- $json.workflow.name — workflow name
- $json.execution.error.message — what went wrong
- $json.execution.error.node.name — which node failed
4. Connect to your alert channel (Slack, Email, WhatsApp)
5. Assign this workflow as the Error Workflow:
Settings → Error Workflow → Select "Error Handler — Master Alert"
Do this for EVERY production workflow
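For reference, the data arriving at the Error Trigger looks roughly like this. This is a simplified sketch built only from the fields listed above; the values are illustrative and the real payload contains more detail.

{
  "execution": {
    "id": "1042",
    "url": "https://your-n8n-instance/execution/1042",
    "error": {
      "message": "401 - Unauthorized",
      "node": { "name": "Shopify: Get Orders" }
    }
  },
  "workflow": {
    "id": "17",
    "name": "Order Sync"
  }
}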
Error Alert Workflow Design
[Error Trigger]
│
▼
[Set Node: Format alert data]
workflow_name = {{$json.workflow.name}}
error_msg = {{$json.execution.error.message}}
failed_node = {{$json.execution.error.node.name}}
exec_url = {{$json.execution.url}}
timestamp = {{$now.format('YYYY-MM-DD HH:mm:ss')}}
│
▼
[Switch Node: Severity classification]
│
├── Contains "rate limit" or "429" → SEVERITY: WARNING
├── Contains "timeout" → SEVERITY: WARNING
├── Contains "unauthorized" or "401" → SEVERITY: CRITICAL
├── Contains "payment" or "order" → SEVERITY: CRITICAL
└── Default → SEVERITY: ERROR
│
▼
[Slack: Post to #automation-alerts]
│
▼
[IF: CRITICAL severity]
│
├── YES → [Gmail: Email alert]
│ [WhatsApp: Urgent alert to phone]
│
└── NO → [Done — Slack is enough]
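If you prefer a single Code node over the Switch node above, a minimal sketch of the same severity classification could look like this. It assumes the fields set in the Set node earlier, and treats "payment"/"order" matches in either the error message or the workflow name as critical.

// Classify alert severity from the formatted alert data
const msg = ($json.error_msg || '').toLowerCase();
const name = ($json.workflow_name || '').toLowerCase();

let severity = 'ERROR'; // default
if (msg.includes('rate limit') || msg.includes('429') || msg.includes('timeout')) {
  severity = 'WARNING';
}
if (msg.includes('unauthorized') || msg.includes('401') ||
    msg.includes('payment') || msg.includes('order') ||
    name.includes('payment') || name.includes('order')) {
  severity = 'CRITICAL';
}

return [{ json: { ...$json, severity } }];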
Slack Message Format
Slack Block Kit Message:
🔴 *Workflow Failure Alert*
*Workflow:* {{$json.workflow_name}}
*Failed Node:* {{$json.failed_node}}
*Error:* {{$json.error_msg}}
*Severity:* {{$json.severity}}
*Time:* {{$json.timestamp}}
<{{$json.exec_url}}|View Execution in n8n>
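The mrkdwn text above can be pasted straight into the Slack node's message field. If you want a true Block Kit payload instead, a minimal sketch might look like this; the field names follow the Set node earlier in this workflow.

{
  "channel": "#automation-alerts",
  "text": "Workflow Failure Alert",
  "blocks": [
    {
      "type": "section",
      "text": {
        "type": "mrkdwn",
        "text": "🔴 *Workflow Failure Alert*\n*Workflow:* {{$json.workflow_name}}\n*Failed Node:* {{$json.failed_node}}\n*Error:* {{$json.error_msg}}\n*Severity:* {{$json.severity}}\n*Time:* {{$json.timestamp}}\n<{{$json.exec_url}}|View Execution in n8n>"
      }
    }
  ]
}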
Layer 2: Heartbeat Monitoring
Execution alerts tell you when a workflow FAILS. But what about when a workflow DOESN'T RUN AT ALL? A Schedule Trigger that should run every hour but silently stops? This is the silent killer.
The Heartbeat Pattern
CONCEPT:
Every important workflow sends a "heartbeat" after successful completion.
A separate monitor checks if heartbeats arrived on time.
IMPLEMENTATION:
Step 1: Add heartbeat to every critical workflow
[End of workflow] → [Google Sheets: Log heartbeat]
Row: workflow_name | status: "OK" | timestamp
Step 2: Heartbeat Monitor (runs every hour)
[Schedule Trigger: Every 1 hour]
│
▼
[Google Sheets: Read heartbeat log]
│
▼
[Function Node: Check for missing heartbeats]
// Expected run interval (in minutes) for each monitored workflow
const schedules = {
  "Order Sync": 15,       // should run every 15 min
  "Daily Report": 1440,   // should run every 24 hours
  "Inventory Check": 360  // should run every 6 hours
};

const now = new Date();

// The log sheet appends a row on every run, so keep only the
// most recent heartbeat per workflow before checking
const lastSeen = {};
for (const item of $input.all()) {
  const name = item.json.workflow_name;
  const ts = new Date(item.json.timestamp);
  if (!lastSeen[name] || ts > lastSeen[name]) {
    lastSeen[name] = ts;
  }
}

const missing = [];
for (const [name, expectedInterval] of Object.entries(schedules)) {
  const lastRun = lastSeen[name];
  if (!lastRun) {
    // No heartbeat logged at all, so treat it as missing
    missing.push({ workflow: name, last_seen: "never", minutes_overdue: null });
    continue;
  }
  const minutesSince = (now - lastRun) / 60000;
  // Allow a 50% grace period before flagging a workflow as overdue
  if (minutesSince > expectedInterval * 1.5) {
    missing.push({
      workflow: name,
      last_seen: lastRun.toISOString(),
      minutes_overdue: Math.round(minutesSince - expectedInterval)
    });
  }
}

return missing.length > 0
  ? missing.map(m => ({ json: m }))
  : [{ json: { status: "all_healthy" } }];
│
▼
[IF: missing workflows found]
│
├── YES → [Slack: Alert — workflows are not running]
└── NO → [Done — all healthy]
Layer 3: Performance Tracking
Execution Time Monitoring
Track how long each workflow takes:
[Start of workflow]
│
▼
[Set: start_time = {{$now.toMillis()}}]
│
▼
[... your workflow nodes ...]
│
▼
[Set: end_time = {{$now.toMillis()}}]
[Set: duration_ms = {{$json.end_time - $json.start_time}}]
[Set: duration_sec = {{Math.round($json.duration_ms / 1000)}}]
│
▼
[Google Sheets: Log performance]
Columns: workflow_name | duration_sec | items_processed | timestamp
│
▼
[IF: duration_sec > threshold (e.g., 120 seconds)]
│
├── YES → [Slack: Performance warning — workflow took {{duration_sec}}s]
└── NO → [Done]
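As an alternative to the chain of Set nodes, the duration can be computed in a single Code node at the end of the workflow. A minimal sketch, assuming the node that recorded the start is named "Set start time" and stored start_time in milliseconds:

// Compute execution duration from the recorded start time
const startTime = $('Set start time').first().json.start_time;  // assumed node name
const durationMs = Date.now() - startTime;
const durationSec = Math.round(durationMs / 1000);

return [{
  json: {
    workflow_name: $workflow.name,
    duration_sec: durationSec,
    timestamp: new Date().toISOString()
  }
}];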
Items Processed Tracking
For batch workflows, track throughput:
[After batch processing]
│
▼
[Set Node]
items_total = {{$json.items.length}}
items_success = {{$json.items.filter(i => !i.error).length}}
items_failed = {{$json.items.filter(i => i.error).length}}
success_rate = {{Math.round(items_success / items_total * 100)}}%
│
▼
[IF: success_rate < 90%]
│
├── YES → [Alert: High failure rate in batch — {{success_rate}}%]
└── NO → [Log to performance sheet]
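Note that the success_rate expression cannot see items_success and items_total while they are being set in the same Set node, so for anything beyond a couple of fields a Code node is often cleaner. A minimal sketch, assuming the previous node outputs a single item whose items array entries carry an error field when they failed:

// Compute batch throughput from the processed items array
const items = $json.items || [];
const itemsTotal = items.length;
const itemsFailed = items.filter(i => i.error).length;
const itemsSuccess = itemsTotal - itemsFailed;
const successRate = itemsTotal ? Math.round(itemsSuccess / itemsTotal * 100) : 100;

return [{
  json: {
    items_total: itemsTotal,
    items_success: itemsSuccess,
    items_failed: itemsFailed,
    success_rate: successRate
  }
}];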
Layer 4: Daily Digest
The Morning Report
A single daily summary of all automation activity:
[Schedule Trigger: Daily at 9 AM PKT]
│
▼
[Google Sheets: Read last 24h of heartbeat logs]
│
▼
[Google Sheets: Read last 24h of performance logs]
│
▼
[Function Node: Compile digest]
// Pull rows from the two earlier Google Sheets nodes by name
// (adjust these node names to match your own workflow)
const heartbeats = $('Read heartbeat logs').all().map(i => i.json);
const performance = $('Read performance logs').all().map(i => i.json);

const totalRuns = heartbeats.length;
const failures = heartbeats.filter(h => h.status === "FAILED").length;

// Guard against an empty performance log to avoid dividing by zero
const avgDuration = performance.length
  ? performance.reduce((sum, p) => sum + Number(p.duration_sec), 0) / performance.length
  : 0;

// Copy before sorting so the original array is not mutated
const slowest = [...performance].sort((a, b) => b.duration_sec - a.duration_sec)[0] || null;

return [{
  json: {
    total_executions: totalRuns,
    failures: failures,
    success_rate: totalRuns ? Math.round((totalRuns - failures) / totalRuns * 100) : 100,
    avg_duration_sec: Math.round(avgDuration),
    slowest: slowest
  }
}];
│
▼
[Slack/Email: Send daily digest]
Daily Digest Format
📊 *Automation Daily Digest — {{$now.format('YYYY-MM-DD')}}*
*Overall Health:* ✅ Healthy / ⚠️ Degraded / 🔴 Issues
*Executions:* {{total}} runs ({{success_rate}}% success)
*Failures:* {{failures}} ({{failure_details}})
*Avg Duration:* {{avg_duration}}s
*Slowest:* "{{slowest_name}}" took {{slowest_duration}}s
*Workflows Summary:*
✅ Order Sync — 96 runs, 0 failures
✅ Daily Report — 1 run, OK
⚠️ Inventory Check — 4 runs, 1 timeout (retried OK)
✅ Lead Nurture — 24 runs, 0 failures
*Action Items:*
- None / [List any manual interventions needed]
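A minimal sketch of turning the compiled digest into that message text inside a Code node follows; the health thresholds here are arbitrary examples you should tune.

// Build the digest text from the compiled digest item
const d = $json;

// Arbitrary example thresholds for the health badge
const health = d.failures === 0 ? '✅ Healthy'
  : d.failures < 3 ? '⚠️ Degraded'
  : '🔴 Issues';

const text = [
  `📊 *Automation Daily Digest — ${new Date().toISOString().slice(0, 10)}*`,
  `*Overall Health:* ${health}`,
  `*Executions:* ${d.total_executions} runs (${d.success_rate}% success)`,
  `*Failures:* ${d.failures}`,
  `*Avg Duration:* ${d.avg_duration_sec}s`,
  d.slowest ? `*Slowest:* "${d.slowest.workflow_name}" took ${d.slowest.duration_sec}s` : ''
].filter(Boolean).join('\n');

return [{ json: { text } }];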
Setting Up Slack for n8n
Quick Slack Setup
1. Go to api.slack.com/apps → Create New App
2. From Scratch → Name: "n8n Alerts" → Select workspace
3. OAuth & Permissions → Add scopes:
- chat:write
- chat:write.public
4. Install to Workspace → Copy Bot Token (xoxb-...)
5. In n8n: Credentials → Add → Slack API
→ Paste the Bot Token
6. Create a #automation-alerts channel in Slack
7. Invite the bot: /invite @n8n Alerts
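If you would rather skip the Slack node and call the API directly, the same alert can be posted with an HTTP Request node via chat.postMessage. A minimal sketch, assuming the bot token is available as an environment variable named SLACK_BOT_TOKEN:

[HTTP Request: Slack chat.postMessage]
Method: POST
URL: https://slack.com/api/chat.postMessage
Headers:
  Authorization: Bearer {{$env.SLACK_BOT_TOKEN}}
  Content-Type: application/json
Body:
{
  "channel": "#automation-alerts",
  "text": "🔴 {{$json.workflow_name}} failed: {{$json.error_msg}}"
}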
Alternative: WhatsApp Alerts (Pakistan-Friendly)
If your clients don't use Slack, use WhatsApp via WATI:
[HTTP Request: WATI API]
Method: POST
URL: https://live-server.wati.io/api/v1/sendTemplateMessage
Headers:
Authorization: Bearer {{$env.WATI_API_KEY}}
Body:
{
"template_name": "automation_alert",
"broadcast_name": "n8n_alert",
"receivers": [
{
"whatsappNumber": "923001234567",
"customParams": [
{ "name": "workflow", "value": "{{$json.workflow_name}}" },
{ "name": "error", "value": "{{$json.error_msg}}" }
]
}
]
}
Practice Lab
Task 1: Set Up Error Workflow Create the Master Error Handler workflow. Configure it to send alerts to your email (or Slack if you have it). Assign it as the Error Workflow for at least 2 of your existing workflows. Test by intentionally breaking a node.
Task 2: Build a Heartbeat Monitor Add heartbeat logging to 3 workflows (or create 3 simple test workflows with Schedule Triggers). Build the heartbeat monitor that checks if all heartbeats arrived on time. Test by disabling one workflow and verifying the alert fires.
Task 3: Create a Daily Digest Build a daily digest workflow that summarizes the last 24 hours of workflow activity. Even if you only have 2-3 workflows, the pattern is the same. Send the digest to your email every morning.
Pakistan Case Study
Meet Saad — manages automation for a digital agency in Lahore serving 8 client accounts.
His monitoring problem: Running 23 workflows across 8 clients. No monitoring. Found out about failures from angry client calls. Spent Monday mornings firefighting weekend failures.
His monitoring implementation:
Phase 1 (Week 1): Error Workflow
- Built Master Error Handler with Slack + WhatsApp alerts
- Assigned to all 23 workflows
- Immediate result: caught 3 failures on Day 1 he wouldn't have noticed
Phase 2 (Week 2): Heartbeat Monitor
- Added heartbeat to 12 critical workflows (order sync, lead nurture, reporting)
- Monitor checks every 30 minutes
- Caught a "silent death" on Day 3 — a client's Shopify webhook workflow had stopped receiving triggers (Shopify webhook had unregistered itself)
Phase 3 (Week 3): Daily Digest
- Morning report at 9 AM to Slack + email
- Shares weekly summary with clients every Friday
- Clients love seeing "Your automation ran 312 times this week with 99.7% success rate"
Phase 4 (Week 4): Performance Tracking
- Added execution time logging
- Discovered one workflow was gradually slowing down (30s → 90s → 180s over 2 weeks)
- Root cause: Google Sheets was getting too large — added archival workflow
Results after 1 month of monitoring:
- Undetected failures: 4-6/week → 0
- Client complaints about automation: 2-3/week → 0
- Monday firefighting: 3-4 hours → 15 minutes
- Proactive fixes (caught before client noticed): 8 incidents in first month
- Added "Monitoring & SLA" as a line item on his proposals: +PKR 15,000/month per client
- Total monitoring revenue: PKR 120,000/month (8 clients × PKR 15,000)
His pitch to clients: "Your automations run 24/7. Without monitoring, you only find out about problems when customers complain. With monitoring, I catch problems in 2 minutes — before anyone notices."
Key Takeaways
- Four monitoring layers: execution alerts, heartbeat, performance, daily digest
- n8n's Error Workflow feature catches failures — assign it to EVERY production workflow
- Heartbeat monitoring catches silent deaths (workflows that stop running without errors)
- Performance tracking catches degradation before it becomes failure
- Daily digests give you a single-glance view of your automation health
- Slack is ideal for technical alerts; WhatsApp (WATI) for client-facing alerts
- Monitoring is a sellable service — charge PKR 10,000-15,000/month per client
- Classify alerts by severity: Critical (immediate), Warning (1 hour), Info (no action)
- The morning digest alone saves hours of manual checking across multiple clients
Next lesson: Debugging complex workflows — a systematic approach to finding and fixing problems fast.