n8n Masterclass I — Module 7

7.2 Workflow Monitoring with Slack/Email Notifications

20 min · 11 code blocks · Practice Lab · Quiz (4Q)

Workflow Monitoring with Slack/Email Notifications

Building workflows is the easy part. Keeping them running reliably across 5, 10, 50 automations — that's the real challenge. You need visibility: which workflows ran, which failed, which are slow, which haven't run when they should have. This lesson teaches you to build a monitoring layer that gives you full visibility into your automation empire.

The Monitoring Stack

code
┌──────────────────────────────────────────────┐
│              MONITORING LAYERS                │
│                                               │
│  Layer 1: EXECUTION ALERTS                    │
│  → Know when individual workflows fail        │
│                                               │
│  Layer 2: HEARTBEAT MONITORING                │
│  → Know when scheduled workflows DIDN'T run   │
│                                               │
│  Layer 3: PERFORMANCE TRACKING                │
│  → Know when workflows are slow or degraded   │
│                                               │
│  Layer 4: DAILY DIGEST                        │
│  → Summary of all automation activity         │
│                                               │
│  Alert Channels: Slack → Email → WhatsApp     │
└──────────────────────────────────────────────┘

Layer 1: Execution Alerts

n8n Error Workflow (Built-In)

n8n has a built-in feature for this: the Error Workflow. A workflow you designate as the Error Workflow runs automatically whenever any workflow that has it assigned fails.

code
SETUP:

1. Create a new workflow: "Error Handler — Master Alert"
2. Trigger: Error Trigger node
3. This node receives:
   - $json.execution.id — the failed execution ID
   - $json.execution.url — direct link to the execution
   - $json.workflow.id — which workflow failed
   - $json.workflow.name — workflow name
   - $json.execution.error.message — what went wrong
   - $json.execution.error.node.name — which node failed

4. Connect to your alert channel (Slack, Email, WhatsApp)

5. Assign this workflow as the Error Workflow:
   Settings → Error Workflow → Select "Error Handler — Master Alert"
   Do this for EVERY production workflow

Error Alert Workflow Design

code
[Error Trigger]
    │
    ▼
[Set Node: Format alert data]
    workflow_name = {{$json.workflow.name}}
    error_msg = {{$json.execution.error.message}}
    failed_node = {{$json.execution.error.node.name}}
    exec_url = {{$json.execution.url}}
    timestamp = {{$now.format('YYYY-MM-DD HH:mm:ss')}}
    │
    ▼
[Switch Node: Severity classification]
    │
    ├── Contains "rate limit" or "429" → SEVERITY: WARNING
    ├── Contains "timeout" → SEVERITY: WARNING
    ├── Contains "unauthorized" or "401" → SEVERITY: CRITICAL
    ├── Contains "payment" or "order" → SEVERITY: CRITICAL
    └── Default → SEVERITY: ERROR
    │
    ▼
[Slack: Post to #automation-alerts]
    │
    ▼
[IF: CRITICAL severity]
    │
    ├── YES → [Gmail: Email alert]
    │         [WhatsApp: Urgent alert to phone]
    │
    └── NO → [Done — Slack is enough]
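If you prefer a single Code node over a Switch node, the same classification can be written as a plain function. A minimal sketch; the keyword lists mirror the branches above, checked in the same order, so extend them for your own error types:

```javascript
// Classify an error message into a severity bucket.
// Mirrors the Switch branches above, checked in the same order.
function classifySeverity(errorMsg) {
  const msg = (errorMsg || "").toLowerCase();
  if (msg.includes("rate limit") || msg.includes("429")) return "WARNING";
  if (msg.includes("timeout")) return "WARNING";
  if (msg.includes("unauthorized") || msg.includes("401")) return "CRITICAL";
  if (msg.includes("payment") || msg.includes("order")) return "CRITICAL";
  return "ERROR"; // default bucket
}
```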

Slack Message Format

code
Slack Block Kit Message:

🔴 *Workflow Failure Alert*

*Workflow:* {{$json.workflow_name}}
*Failed Node:* {{$json.failed_node}}
*Error:* {{$json.error_msg}}
*Severity:* {{$json.severity}}
*Time:* {{$json.timestamp}}

<{{$json.exec_url}}|View Execution in n8n>
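If you want richer formatting than a plain mrkdwn string, the same alert can be sent as a Slack Block Kit payload. A sketch of the JSON body a Slack node (or HTTP Request node) would send; the field names follow the Set node above, and the channel name is an example:

```javascript
// Build a Slack Block Kit payload from the formatted alert fields.
// Field names match the Set node above; the channel is an example.
function buildAlertBlocks(alert) {
  return {
    channel: "#automation-alerts",
    text: `🔴 Workflow Failure: ${alert.workflow_name}`, // notification fallback
    blocks: [
      {
        type: "section",
        text: {
          type: "mrkdwn",
          text: `🔴 *Workflow Failure Alert*\n\n` +
                `*Workflow:* ${alert.workflow_name}\n` +
                `*Failed Node:* ${alert.failed_node}\n` +
                `*Error:* ${alert.error_msg}\n` +
                `*Severity:* ${alert.severity}\n` +
                `*Time:* ${alert.timestamp}`
        }
      },
      {
        type: "section",
        text: { type: "mrkdwn", text: `<${alert.exec_url}|View Execution in n8n>` }
      }
    ]
  };
}
```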

Layer 2: Heartbeat Monitoring

Execution alerts tell you when a workflow FAILS. But what about when a workflow DOESN'T RUN AT ALL? A Schedule Trigger that should fire every hour can silently stop, and nothing errors. This is the silent killer.

The Heartbeat Pattern

code
CONCEPT:
Every important workflow sends a "heartbeat" after successful completion.
A separate monitor checks if heartbeats arrived on time.

IMPLEMENTATION:

Step 1: Add heartbeat to every critical workflow
[End of workflow] → [Google Sheets: Log heartbeat]
    Row: workflow_name | status: "OK" | timestamp

Step 2: Heartbeat Monitor (runs every hour)
[Schedule Trigger: Every 1 hour]
    │
    ▼
[Google Sheets: Read heartbeat log]
    │
    ▼
[Function Node: Check for missing heartbeats]

    // Define expected schedules
    const schedules = {
      "Order Sync": 15,      // should run every 15 min
      "Daily Report": 1440,  // should run every 24 hours
      "Inventory Check": 360  // should run every 6 hours
    };

    const now = new Date();
    const missing = [];

    for (const item of $input.all()) {
      const name = item.json.workflow_name;
      const lastRun = new Date(item.json.timestamp);
      const expectedInterval = schedules[name];

      if (!expectedInterval) continue;

      const minutesSince = (now - lastRun) / 60000;
      if (minutesSince > expectedInterval * 1.5) {
        missing.push({
          workflow: name,
          last_seen: item.json.timestamp,
          minutes_overdue: Math.round(minutesSince - expectedInterval)
        });
      }
    }

    return missing.length > 0
      ? missing.map(m => ({ json: m }))
      : [{ json: { status: "all_healthy" } }];
    │
    ▼
[IF: missing workflows found]
    │
    ├── YES → [Slack: Alert — workflows are not running]
    └── NO  → [Done — all healthy]

Layer 3: Performance Tracking

Execution Time Monitoring

code
Track how long each workflow takes:

[Start of workflow]
    │
    ▼
[Set: start_time = {{$now.toMillis()}}]
    │
    ▼
[... your workflow nodes ...]
    │
    ▼
[Set: end_time = {{$now.toMillis()}}]
[Set: duration_ms = {{$json.end_time - $json.start_time}}]
[Set: duration_sec = {{Math.round($json.duration_ms / 1000)}}]
    │
    ▼
[Google Sheets: Log performance]
    Columns: workflow_name | duration_sec | items_processed | timestamp
    │
    ▼
[IF: duration_sec > threshold (e.g., 120 seconds)]
    │
    ├── YES → [Slack: Performance warning — workflow took {{duration_sec}}s]
    └── NO  → [Done]
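The duration math above can also live in one Code node at the end of the workflow. A sketch as a plain function; the start timestamp is assumed to have been set at the first node, and the 120-second threshold is an example value to tune per workflow:

```javascript
// Compute workflow duration from a start timestamp captured at the first node.
// thresholdSec is an example default; tune it per workflow.
function checkDuration(startTimeMs, nowMs, thresholdSec = 120) {
  const durationSec = Math.round((nowMs - startTimeMs) / 1000);
  return {
    duration_sec: durationSec,
    slow: durationSec > thresholdSec // true → send a performance warning
  };
}
```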

Items Processed Tracking

code
For batch workflows, track throughput:

[After batch processing]
    │
    ▼
[Set Node]
    items_total = {{$json.items.length}}
    items_success = {{$json.items.filter(i => !i.error).length}}
    items_failed = {{$json.items.filter(i => i.error).length}}
    success_rate = {{Math.round(items_success / items_total * 100)}}%
    │
    ▼
[IF: success_rate < 90%]
    │
    ├── YES → [Alert: High failure rate in batch — {{success_rate}}%]
    └── NO  → [Log to performance sheet]
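The four Set expressions above can be collapsed into one Code node. A sketch as a plain function; items are assumed to carry a truthy `error` property when they failed, matching the expressions above:

```javascript
// Compute batch throughput metrics from a list of processed items.
// An item counts as failed when it has a truthy `error` property.
function batchMetrics(items) {
  const total = items.length;
  const failed = items.filter(i => i.error).length;
  const success = total - failed;
  return {
    items_total: total,
    items_success: success,
    items_failed: failed,
    // Guard against empty batches to avoid division by zero
    success_rate: total > 0 ? Math.round((success / total) * 100) : 100
  };
}
```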

Layer 4: Daily Digest

The Morning Report

A single daily summary of all automation activity:

code
[Schedule Trigger: Daily at 9 AM PKT]
    │
    ▼
[Google Sheets: Read last 24h of heartbeat logs]
    │
    ▼
[Google Sheets: Read last 24h of performance logs]
    │
    ▼
[Function Node: Compile digest]

    // Read rows from the two upstream Sheets nodes by node name
    // (adjust the names to match your actual node titles)
    const heartbeats = $('Read heartbeat logs').all().map(i => i.json);
    const performance = $('Read performance logs').all().map(i => i.json);

    const totalRuns = heartbeats.length;
    const failures = heartbeats.filter(h => h.status === "FAILED").length;
    // Guard against an empty performance log to avoid NaN
    const avgDuration = performance.reduce((a, b) =>
      a + b.duration_sec, 0) / Math.max(performance.length, 1);

    return [{
      json: {
        total_executions: totalRuns,
        failures: failures,
        success_rate: Math.round((totalRuns - failures) / totalRuns * 100),
        avg_duration_sec: Math.round(avgDuration),
        slowest: [...performance].sort((a, b) =>
          b.duration_sec - a.duration_sec)[0]
      }
    }];
    │
    ▼
[Slack/Email: Send daily digest]

Daily Digest Format

code
📊 *Automation Daily Digest — {{$now.format('YYYY-MM-DD')}}*

*Overall Health:* ✅ Healthy / ⚠️ Degraded / 🔴 Issues

*Executions:* {{total}} runs ({{success_rate}}% success)
*Failures:* {{failures}} ({{failure_details}})
*Avg Duration:* {{avg_duration}}s
*Slowest:* "{{slowest_name}}" took {{slowest_duration}}s

*Workflows Summary:*
✅ Order Sync — 96 runs, 0 failures
✅ Daily Report — 1 run, OK
⚠️ Inventory Check — 4 runs, 1 timeout (retried OK)
✅ Lead Nurture — 24 runs, 0 failures

*Action Items:*
- None / [List any manual interventions needed]
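The digest text can be rendered in the same Function node that compiles the summary. A sketch; the field names follow the compile step in Layer 4, the health thresholds (99% / 95%) are an example choice, and the slowest entry is assumed to carry workflow_name and duration_sec:

```javascript
// Render the daily digest message from the compiled summary object.
// Health thresholds (99% / 95%) are an example choice.
function renderDigest(summary, date) {
  const health = summary.success_rate >= 99 ? "✅ Healthy"
               : summary.success_rate >= 95 ? "⚠️ Degraded"
               : "🔴 Issues";
  return [
    `📊 *Automation Daily Digest — ${date}*`,
    `*Overall Health:* ${health}`,
    `*Executions:* ${summary.total_executions} runs (${summary.success_rate}% success)`,
    `*Failures:* ${summary.failures}`,
    `*Avg Duration:* ${summary.avg_duration_sec}s`,
    `*Slowest:* "${summary.slowest.workflow_name}" took ${summary.slowest.duration_sec}s`
  ].join("\n");
}
```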

Setting Up Slack for n8n

Quick Slack Setup

code
1. Go to api.slack.com/apps → Create New App
2. From Scratch → Name: "n8n Alerts" → Select workspace
3. OAuth & Permissions → Add scopes:
   - chat:write
   - chat:write.public
4. Install to Workspace → Copy Bot Token (xoxb-...)
5. In n8n: Credentials → Add → Slack API
   → Paste the Bot Token
6. Create a #automation-alerts channel in Slack
7. Invite the bot to the channel: type /invite and select the n8n Alerts bot
   (its handle may differ from the app name)
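If you call Slack's Web API from an HTTP Request node instead of the Slack node, the request is a single chat.postMessage call. A sketch that builds that request; the token and channel values are placeholders, and the bot token needs the chat:write scope from step 3:

```javascript
// Build the chat.postMessage request for Slack's Web API.
// Requires the chat:write scope on the bot token (see setup step 3).
function buildPostMessage(botToken, channel, text) {
  return {
    url: "https://slack.com/api/chat.postMessage",
    method: "POST",
    headers: {
      "Authorization": `Bearer ${botToken}`,
      "Content-Type": "application/json; charset=utf-8"
    },
    body: JSON.stringify({ channel, text })
  };
}
```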

Alternative: WhatsApp Alerts (Pakistan-Friendly)

If your clients don't use Slack, use WhatsApp via WATI:

code
[HTTP Request: WATI API]
    Method: POST
    URL: https://live-server.wati.io/api/v1/sendTemplateMessage
    Headers:
      Authorization: Bearer {{$env.WATI_API_KEY}}
    Body:
      {
        "template_name": "automation_alert",
        "broadcast_name": "n8n_alert",
        "receivers": [
          {
            "whatsappNumber": "923001234567",
            "customParams": [
              { "name": "workflow", "value": "{{$json.workflow_name}}" },
              { "name": "error", "value": "{{$json.error_msg}}" }
            ]
          }
        ]
      }

Practice Lab

Task 1: Set Up Error Workflow
Create the Master Error Handler workflow. Configure it to send alerts to your email (or Slack if you have it). Assign it as the Error Workflow for at least 2 of your existing workflows. Test it by intentionally breaking a node.

Task 2: Build a Heartbeat Monitor
Add heartbeat logging to 3 workflows (or create 3 simple test workflows with Schedule Triggers). Build the heartbeat monitor that checks if all heartbeats arrived on time. Test by disabling one workflow and verifying the alert fires.

Task 3: Create a Daily Digest
Build a daily digest workflow that summarizes the last 24 hours of workflow activity. Even if you only have 2-3 workflows, the pattern is the same. Send the digest to your email every morning.

Pakistan Case Study

Meet Saad — manages automation for a digital agency in Lahore serving 8 client accounts.

His monitoring problem: Running 23 workflows across 8 clients. No monitoring. Found out about failures from angry client calls. Spent Monday mornings firefighting weekend failures.

His monitoring implementation:

Phase 1 (Week 1): Error Workflow

  • Built Master Error Handler with Slack + WhatsApp alerts
  • Assigned to all 23 workflows
  • Immediate result: caught 3 failures on Day 1 that he would otherwise have missed

Phase 2 (Week 2): Heartbeat Monitor

  • Added heartbeat to 12 critical workflows (order sync, lead nurture, reporting)
  • Monitor checks every 30 minutes
  • Caught a "silent death" on Day 3 — a client's Shopify webhook workflow had stopped receiving triggers (Shopify webhook had unregistered itself)

Phase 3 (Week 3): Daily Digest

  • Morning report at 9 AM to Slack + email
  • Shares weekly summary with clients every Friday
  • Clients love seeing "Your automation ran 312 times this week with 99.7% success rate"

Phase 4 (Week 4): Performance Tracking

  • Added execution time logging
  • Discovered one workflow was gradually slowing down (30s → 90s → 180s over 2 weeks)
  • Root cause: Google Sheets was getting too large — added archival workflow

Results after 1 month of monitoring:

  • Undetected failures: 4-6/week → 0
  • Client complaints about automation: 2-3/week → 0
  • Monday firefighting: 3-4 hours → 15 minutes
  • Proactive fixes (caught before client noticed): 8 incidents in first month
  • Added "Monitoring & SLA" as a line item on his proposals: +PKR 15,000/month per client
  • Total monitoring revenue: PKR 120,000/month (8 clients × PKR 15,000)

His pitch to clients: "Your automations run 24/7. Without monitoring, you only find out about problems when customers complain. With monitoring, I catch problems in 2 minutes — before anyone notices."

Key Takeaways

  • Four monitoring layers: execution alerts, heartbeat, performance, daily digest
  • n8n's Error Workflow feature catches failures — assign it to EVERY production workflow
  • Heartbeat monitoring catches silent deaths (workflows that stop running without errors)
  • Performance tracking catches degradation before it becomes failure
  • Daily digests give you a single-glance view of your automation health
  • Slack is ideal for technical alerts; WhatsApp (WATI) for client-facing alerts
  • Monitoring is a sellable service — charge PKR 10,000-15,000/month per client
  • Classify alerts by severity: Critical (immediate), Warning (1 hour), Info (no action)
  • The morning digest alone saves hours of manual checking across multiple clients

Next lesson: Debugging complex workflows — a systematic approach to finding and fixing problems fast.

Lesson Summary

Includes a hands-on practice lab, 11 runnable code examples, and a 4-question knowledge check below.

Quiz: Workflow Monitoring with Slack/Email Notifications

4 questions to test your understanding. Score 60% or higher to pass.