7.2 — Workflow Monitoring with Slack/Email Notifications
Building workflows is the easy part. Keeping them running reliably across 5, 10, 50 automations — that's the real challenge. You need visibility: which workflows ran, which failed, which are slow, which haven't run when they should have. This lesson teaches you to build a monitoring layer that gives you full visibility into your automation empire.
The Monitoring Stack
┌──────────────────────────────────────────────┐
│              MONITORING LAYERS               │
│                                              │
│  Layer 1: EXECUTION ALERTS                   │
│    → Know when individual workflows fail     │
│                                              │
│  Layer 2: HEARTBEAT MONITORING               │
│    → Know when scheduled workflows DIDN'T run│
│                                              │
│  Layer 3: PERFORMANCE TRACKING               │
│    → Know when workflows are slow or degraded│
│                                              │
│  Layer 4: DAILY DIGEST                       │
│    → Summary of all automation activity      │
│                                              │
│  Alert Channels: Slack → Email → WhatsApp    │
└──────────────────────────────────────────────┘
Layer 1: Execution Alerts
n8n Error Workflow (Built-In)
n8n has a built-in feature: the Error Workflow. Once assigned to a workflow, it runs automatically whenever that workflow fails, so a single handler can cover every production workflow you have.
SETUP:
1. Create a new workflow: "Error Handler — Master Alert"
2. Trigger: Error Trigger node
3. This node receives the following data (an example payload appears after these setup steps):
- $json.execution.id — the failed execution ID
- $json.execution.url — direct link to the execution
- $json.workflow.id — which workflow failed
- $json.workflow.name — workflow name
- $json.execution.error.message — what went wrong
- $json.execution.error.node.name — which node failed
4. Connect to your alert channel (Slack, Email, WhatsApp)
5. Assign this workflow as the Error Workflow:
Settings → Error Workflow → Select "Error Handler — Master Alert"
Do this for EVERY production workflow
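For reference, the data arriving at the Error Trigger looks roughly like this. This is a simplified sketch built only from the fields listed above; the values are illustrative and the real payload contains more detail.

{
  "execution": {
    "id": "1042",
    "url": "https://your-n8n-instance/execution/1042",
    "error": {
      "message": "401 - Unauthorized",
      "node": { "name": "Shopify: Get Orders" }
    }
  },
  "workflow": {
    "id": "17",
    "name": "Order Sync"
  }
}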
Error Alert Workflow Design
[Error Trigger]
│
▼
[Set Node: Format alert data]
workflow_name = {{$json.workflow.name}}
error_msg = {{$json.execution.error.message}}
failed_node = {{$json.execution.error.node.name}}
exec_url = {{$json.execution.url}}
timestamp = {{$now.format('YYYY-MM-DD HH:mm:ss')}}
│
▼
[Switch Node: Severity classification]
│
├── Contains "rate limit" or "429" → SEVERITY: WARNING
├── Contains "timeout" → SEVERITY: WARNING
├── Contains "unauthorized" or "401" → SEVERITY: CRITICAL
├── Contains "payment" or "order" → SEVERITY: CRITICAL
└── Default → SEVERITY: ERROR
│
▼
[Slack: Post to #automation-alerts]
│
▼
[IF: CRITICAL severity]
│
├── YES → [Gmail: Email alert]
│ [WhatsApp: Urgent alert to phone]
│
└── NO → [Done — Slack is enough]
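If you prefer a single Code node over the Switch node above, a minimal sketch of the same severity classification could look like this. It assumes the fields set in the Set node earlier, and treats "payment"/"order" matches in either the error message or the workflow name as critical.

// Classify alert severity from the formatted alert data
const msg = ($json.error_msg || '').toLowerCase();
const name = ($json.workflow_name || '').toLowerCase();

let severity = 'ERROR'; // default
if (msg.includes('rate limit') || msg.includes('429') || msg.includes('timeout')) {
  severity = 'WARNING';
}
if (msg.includes('unauthorized') || msg.includes('401') ||
    msg.includes('payment') || msg.includes('order') ||
    name.includes('payment') || name.includes('order')) {
  severity = 'CRITICAL';
}

return [{ json: { ...$json, severity } }];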
Slack Message Format
Slack Block Kit Message:
🔴 *Workflow Failure Alert*
*Workflow:* {{$json.workflow_name}}
*Failed Node:* {{$json.failed_node}}
*Error:* {{$json.error_msg}}
*Severity:* {{$json.severity}}
*Time:* {{$json.timestamp}}
<{{$json.exec_url}}|View Execution in n8n>
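The mrkdwn text above can be pasted straight into the Slack node's message field. If you want a true Block Kit payload instead, a minimal sketch might look like this; the field names follow the Set node earlier in this workflow.

{
  "channel": "#automation-alerts",
  "text": "Workflow Failure Alert",
  "blocks": [
    {
      "type": "section",
      "text": {
        "type": "mrkdwn",
        "text": "🔴 *Workflow Failure Alert*\n*Workflow:* {{$json.workflow_name}}\n*Failed Node:* {{$json.failed_node}}\n*Error:* {{$json.error_msg}}\n*Severity:* {{$json.severity}}\n*Time:* {{$json.timestamp}}\n<{{$json.exec_url}}|View Execution in n8n>"
      }
    }
  ]
}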
Layer 2: Heartbeat Monitoring
Execution alerts tell you when a workflow FAILS. But what about when a workflow DOESN'T RUN AT ALL? A Schedule Trigger that should run every hour but silently stops? This is the silent killer.
The Heartbeat Pattern
CONCEPT:
Every important workflow sends a "heartbeat" after successful completion.
A separate monitor checks if heartbeats arrived on time.
IMPLEMENTATION:
Step 1: Add heartbeat to every critical workflow
[End of workflow] → [Google Sheets: Log heartbeat]
Row: workflow_name | status: "OK" | timestamp
Step 2: Heartbeat Monitor (runs every hour)
[Schedule Trigger: Every 1 hour]
│
▼
[Google Sheets: Read heartbeat log]
│
▼
[Function Node: Check for missing heartbeats]
// Expected run interval (in minutes) for each monitored workflow
const schedules = {
  "Order Sync": 15,       // should run every 15 min
  "Daily Report": 1440,   // should run every 24 hours
  "Inventory Check": 360  // should run every 6 hours
};

const now = new Date();

// The log sheet appends a row on every run, so keep only the
// most recent heartbeat per workflow before checking
const lastSeen = {};
for (const item of $input.all()) {
  const name = item.json.workflow_name;
  const ts = new Date(item.json.timestamp);
  if (!lastSeen[name] || ts > lastSeen[name]) {
    lastSeen[name] = ts;
  }
}

const missing = [];
for (const [name, expectedInterval] of Object.entries(schedules)) {
  const lastRun = lastSeen[name];
  if (!lastRun) {
    // No heartbeat logged at all, so treat it as missing
    missing.push({ workflow: name, last_seen: "never", minutes_overdue: null });
    continue;
  }
  const minutesSince = (now - lastRun) / 60000;
  // Allow a 50% grace period before flagging a workflow as overdue
  if (minutesSince > expectedInterval * 1.5) {
    missing.push({
      workflow: name,
      last_seen: lastRun.toISOString(),
      minutes_overdue: Math.round(minutesSince - expectedInterval)
    });
  }
}

return missing.length > 0
  ? missing.map(m => ({ json: m }))
  : [{ json: { status: "all_healthy" } }];
│
▼
[IF: missing workflows found]
│
├── YES → [Slack: Alert — workflows are not running]
└── NO → [Done — all healthy]
Layer 3: Performance Tracking
Execution Time Monitoring
Track how long each workflow takes:
[Start of workflow]
│
▼
[Set: start_time = {{$now.toMillis()}}]
│
▼
[... your workflow nodes ...]
│
▼
[Set: end_time = {{$now.toMillis()}}]
[Set: duration_ms = {{$json.end_time - $json.start_time}}]
[Set: duration_sec = {{Math.round($json.duration_ms / 1000)}}]
│
▼
[Google Sheets: Log performance]
Columns: workflow_name | duration_sec | items_processed | timestamp
│
▼
[IF: duration_sec > threshold (e.g., 120 seconds)]
│
├── YES → [Slack: Performance warning — workflow took {{duration_sec}}s]
└── NO → [Done]
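As an alternative to the chain of Set nodes, the duration can be computed in a single Code node at the end of the workflow. A minimal sketch, assuming the node that recorded the start is named "Set start time" and stored start_time in milliseconds:

// Compute execution duration from the recorded start time
const startTime = $('Set start time').first().json.start_time;  // assumed node name
const durationMs = Date.now() - startTime;
const durationSec = Math.round(durationMs / 1000);

return [{
  json: {
    workflow_name: $workflow.name,
    duration_sec: durationSec,
    timestamp: new Date().toISOString()
  }
}];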
Items Processed Tracking
For batch workflows, track throughput:
[After batch processing]
│
▼
[Set Node]
items_total = {{$json.items.length}}
items_success = {{$json.items.filter(i => !i.error).length}}
items_failed = {{$json.items.filter(i => i.error).length}}
success_rate = {{Math.round(items_success / items_total * 100)}}%
│
▼
[IF: success_rate < 90%]
│
├── YES → [Alert: High failure rate in batch — {{success_rate}}%]
└── NO → [Log to performance sheet]
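Note that the success_rate expression cannot see items_success and items_total while they are being set in the same Set node, so for anything beyond a couple of fields a Code node is often cleaner. A minimal sketch, assuming the previous node outputs a single item whose items array entries carry an error field when they failed:

// Compute batch throughput from the processed items array
const items = $json.items || [];
const itemsTotal = items.length;
const itemsFailed = items.filter(i => i.error).length;
const itemsSuccess = itemsTotal - itemsFailed;
const successRate = itemsTotal ? Math.round(itemsSuccess / itemsTotal * 100) : 100;

return [{
  json: {
    items_total: itemsTotal,
    items_success: itemsSuccess,
    items_failed: itemsFailed,
    success_rate: successRate
  }
}];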
Layer 4: Daily Digest
The Morning Report
A single daily summary of all automation activity:
[Schedule Trigger: Daily at 9 AM PKT]
│
▼
[Google Sheets: Read last 24h of heartbeat logs]
│
▼
[Google Sheets: Read last 24h of performance logs]
│
▼
[Function Node: Compile digest]
// Pull rows from the two earlier Google Sheets nodes by name
// (adjust these node names to match your own workflow)
const heartbeats = $('Read heartbeat logs').all().map(i => i.json);
const performance = $('Read performance logs').all().map(i => i.json);

const totalRuns = heartbeats.length;
const failures = heartbeats.filter(h => h.status === "FAILED").length;

// Guard against an empty performance log to avoid dividing by zero
const avgDuration = performance.length
  ? performance.reduce((sum, p) => sum + Number(p.duration_sec), 0) / performance.length
  : 0;

// Copy before sorting so the original array is not mutated
const slowest = [...performance].sort((a, b) => b.duration_sec - a.duration_sec)[0] || null;

return [{
  json: {
    total_executions: totalRuns,
    failures: failures,
    success_rate: totalRuns ? Math.round((totalRuns - failures) / totalRuns * 100) : 100,
    avg_duration_sec: Math.round(avgDuration),
    slowest: slowest
  }
}];
│
▼
[Slack/Email: Send daily digest]
Daily Digest Format
📊 *Automation Daily Digest — {{$now.format('YYYY-MM-DD')}}*
*Overall Health:* ✅ Healthy / ⚠️ Degraded / 🔴 Issues
*Executions:* {{total}} runs ({{success_rate}}% success)
*Failures:* {{failures}} ({{failure_details}})
*Avg Duration:* {{avg_duration}}s
*Slowest:* "{{slowest_name}}" took {{slowest_duration}}s
*Workflows Summary:*
✅ Order Sync — 96 runs, 0 failures
✅ Daily Report — 1 run, OK
⚠️ Inventory Check — 4 runs, 1 timeout (retried OK)
✅ Lead Nurture — 24 runs, 0 failures
*Action Items:*
- None / [List any manual interventions needed]
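A minimal sketch of turning the compiled digest into that message text inside a Code node follows; the health thresholds here are arbitrary examples you should tune.

// Build the digest text from the compiled digest item
const d = $json;

// Arbitrary example thresholds for the health badge
const health = d.failures === 0 ? '✅ Healthy'
  : d.failures < 3 ? '⚠️ Degraded'
  : '🔴 Issues';

const text = [
  `📊 *Automation Daily Digest — ${new Date().toISOString().slice(0, 10)}*`,
  `*Overall Health:* ${health}`,
  `*Executions:* ${d.total_executions} runs (${d.success_rate}% success)`,
  `*Failures:* ${d.failures}`,
  `*Avg Duration:* ${d.avg_duration_sec}s`,
  d.slowest ? `*Slowest:* "${d.slowest.workflow_name}" took ${d.slowest.duration_sec}s` : ''
].filter(Boolean).join('\n');

return [{ json: { text } }];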
Setting Up Slack for n8n
Quick Slack Setup
1. Go to api.slack.com/apps → Create New App
2. From Scratch → Name: "n8n Alerts" → Select workspace
3. OAuth & Permissions → Add scopes:
- chat:write
- chat:write.public
4. Install to Workspace → Copy Bot Token (xoxb-...)
5. In n8n: Credentials → Add → Slack API
→ Paste the Bot Token
6. Create a #automation-alerts channel in Slack
7. Invite the bot: /invite @n8n Alerts
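If you would rather skip the Slack node and call the API directly, the same alert can be posted with an HTTP Request node via chat.postMessage. A minimal sketch, assuming the bot token is available as an environment variable named SLACK_BOT_TOKEN:

[HTTP Request: Slack chat.postMessage]
Method: POST
URL: https://slack.com/api/chat.postMessage
Headers:
  Authorization: Bearer {{$env.SLACK_BOT_TOKEN}}
  Content-Type: application/json
Body:
{
  "channel": "#automation-alerts",
  "text": "🔴 {{$json.workflow_name}} failed: {{$json.error_msg}}"
}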
Alternative: WhatsApp Alerts (Pakistan-Friendly)
If your clients don't use Slack, use WhatsApp via WATI:
[HTTP Request: WATI API]
Method: POST
URL: https://live-server.wati.io/api/v1/sendTemplateMessage
Headers:
Authorization: Bearer {{$env.WATI_API_KEY}}
Body:
{
"template_name": "automation_alert",
"broadcast_name": "n8n_alert",
"receivers": [
{
"whatsappNumber": "923001234567",
"customParams": [
{ "name": "workflow", "value": "{{$json.workflow_name}}" },
{ "name": "error", "value": "{{$json.error_msg}}" }
]
}
]
}
Practice Lab
Task 1: Set Up Error Workflow Create the Master Error Handler workflow. Configure it to send alerts to your email (or Slack if you have it). Assign it as the Error Workflow for at least 2 of your existing workflows. Test by intentionally breaking a node.
Task 2: Build a Heartbeat Monitor Add heartbeat logging to 3 workflows (or create 3 simple test workflows with Schedule Triggers). Build the heartbeat monitor that checks if all heartbeats arrived on time. Test by disabling one workflow and verifying the alert fires.
Task 3: Create a Daily Digest Build a daily digest workflow that summarizes the last 24 hours of workflow activity. Even if you only have 2-3 workflows, the pattern is the same. Send the digest to your email every morning.
Pakistan Case Study
Meet Saad — manages automation for a digital agency in Lahore serving 8 client accounts.
His monitoring problem: Running 23 workflows across 8 clients. No monitoring. Found out about failures from angry client calls. Spent Monday mornings firefighting weekend failures.
His monitoring implementation:
Phase 1 (Week 1): Error Workflow
- Built Master Error Handler with Slack + WhatsApp alerts
- Assigned to all 23 workflows
- Immediate result: caught 3 failures on Day 1 he wouldn't have noticed
Phase 2 (Week 2): Heartbeat Monitor
- Added heartbeat to 12 critical workflows (order sync, lead nurture, reporting)
- Monitor checks every 30 minutes
- Caught a "silent death" on Day 3 — a client's Shopify webhook workflow had stopped receiving triggers (Shopify webhook had unregistered itself)
Phase 3 (Week 3): Daily Digest
- Morning report at 9 AM to Slack + email
- Shares weekly summary with clients every Friday
- Clients love seeing "Your automation ran 312 times this week with 99.7% success rate"
Phase 4 (Week 4): Performance Tracking
- Added execution time logging
- Discovered one workflow was gradually slowing down (30s → 90s → 180s over 2 weeks)
- Root cause: Google Sheets was getting too large — added archival workflow
Results after 1 month of monitoring:
- Undetected failures: 4-6/week → 0
- Client complaints about automation: 2-3/week → 0
- Monday firefighting: 3-4 hours → 15 minutes
- Proactive fixes (caught before client noticed): 8 incidents in first month
- Added "Monitoring & SLA" as a line item on his proposals: +PKR 15,000/month per client
- Total monitoring revenue: PKR 120,000/month (8 clients × PKR 15,000)
His pitch to clients: "Your automations run 24/7. Without monitoring, you only find out about problems when customers complain. With monitoring, I catch problems in 2 minutes — before anyone notices."
Key Takeaways
- Four monitoring layers: execution alerts, heartbeat, performance, daily digest
- n8n's Error Workflow feature catches failures — assign it to EVERY production workflow
- Heartbeat monitoring catches silent deaths (workflows that stop running without errors)
- Performance tracking catches degradation before it becomes failure
- Daily digests give you a single-glance view of your automation health
- Slack is ideal for technical alerts; WhatsApp (WATI) for client-facing alerts
- Monitoring is a sellable service — charge PKR 10,000-15,000/month per client
- Classify alerts by severity: Critical (immediate), Warning (1 hour), Info (no action)
- The morning digest alone saves hours of manual checking across multiple clients
Next lesson: Debugging complex workflows — a systematic approach to finding and fixing problems fast.