
Alert Fatigue: How to Set Up Notifications That Actually Get Acted On

Webalert Team
December 11, 2025
11 min read


Your phone buzzes. Another monitoring alert.

You glance at it — probably another false positive. You swipe it away without reading. Ten minutes later, it buzzes again. Same thing. Swipe. Dismiss. Ignore.

Then your customer tweets: "Is your site down? I can't checkout."

You check. It's been down for 47 minutes. The alerts were real this time. You just didn't notice because you've learned to ignore them.

This is alert fatigue — and it's silently sabotaging your monitoring.

You set up monitoring to catch problems before customers do. But when every minor hiccup triggers a notification, your brain learns to tune them out. The result? The alerts that matter get buried in noise.

In this guide, we'll cover how to set up notifications that your team will actually respond to — without missing the issues that matter.


What Is Alert Fatigue?

Alert fatigue happens when the volume of alerts exceeds a team's capacity to respond meaningfully to each one.

It's a well-documented phenomenon in healthcare, where alarm fatigue contributes to patient deaths when nurses become desensitized to monitor beeps. The same psychology applies to DevOps and engineering teams.

Here's what happens:

  1. Too many alerts flood your channels
  2. Most are noise — false positives, minor issues, or things that resolve themselves
  3. Your brain adapts by deprioritizing all alerts
  4. Real incidents get missed because they look like everything else

Studies show that when alert volume is high:

  • 70-99% of alerts go ignored
  • Response times increase as teams become desensitized
  • Critical alerts take longer to acknowledge — sometimes hours instead of minutes

The irony? Teams with the most monitors often have the worst incident response times.


Signs Your Team Has Alert Fatigue

How do you know if alert fatigue is affecting your team? Here are the warning signs:

The symptoms checklist

  • Alerts regularly go unacknowledged for 30+ minutes
  • Team members have muted or filtered notification channels
  • "I assumed it was a false positive" is a common post-incident phrase
  • The same alerts fire repeatedly without anyone investigating
  • On-call engineers feel burned out from constant notifications
  • You find out about outages from customers, not monitoring
  • Alert channels have hundreds of unread messages
  • Nobody remembers the last time they acted on a warning alert

If you checked more than two of these, alert fatigue is likely affecting your incident response.


The Root Causes of Alert Fatigue

Alert fatigue doesn't happen randomly. It's usually the result of specific configuration mistakes:

1. One-size-fits-all thresholds

Every monitor gets the same alert threshold — "notify me if response time exceeds 2 seconds."

But a 2-second response time on your marketing blog is fine. On your checkout page? That's a crisis. When everything alerts the same way, nothing feels urgent.

2. Alerting on warnings instead of problems

Warnings are useful for dashboards and trend analysis. They're terrible for notifications.

If you alert on every warning-level event, your team drowns in "something might be slightly off" messages. Save notifications for things that actually need human attention.

3. No severity tiering

When every alert has the same priority, none of them have priority.

Critical payment processing failures shouldn't look the same as a slow-loading blog image. But without severity levels, they do.

4. Flapping monitors

A service that goes down for 30 seconds, recovers, then goes down again generates a storm of alerts:

  • DOWN at 14:00
  • UP at 14:01
  • DOWN at 14:01
  • UP at 14:02
  • DOWN at 14:02
  • ...

Each state change triggers a notification. Your phone buzzes 10 times in 5 minutes. You stop paying attention.
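
Most monitoring tools handle this with a recovery delay or "hold-down" setting. If yours doesn't, you can debounce state changes yourself: only notify once a new state has persisted for a minimum period. Here's a minimal sketch of that idea in Python (the class name, the two-minute hold-down, and the "UP"/"DOWN" states are illustrative, not taken from any particular tool):

    import time

    class FlapSuppressor:
        """Report a state change only after it has held for `hold_down` seconds."""

        def __init__(self, hold_down=120):
            self.hold_down = hold_down       # seconds a new state must persist before alerting
            self.reported_state = "UP"       # assume the monitor starts healthy
            self.pending_state = None        # candidate state waiting out the hold-down
            self.pending_since = None

        def observe(self, state, now=None):
            """Feed in each check result ("UP"/"DOWN"); returns a state to alert on, or None."""
            now = now if now is not None else time.time()
            if state == self.reported_state:
                self.pending_state = None    # back to the known state: cancel any pending change
                return None
            if state != self.pending_state:
                self.pending_state = state   # new candidate state: start the clock
                self.pending_since = now
                return None
            if now - self.pending_since >= self.hold_down:
                self.reported_state = state  # change has persisted: alert exactly once
                self.pending_state = None
                return state
            return None

    # A DOWN/UP/DOWN flap within a minute produces no alert at all;
    # a DOWN that holds for two minutes produces a single alert.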

5. Alerting the wrong people

When alerts go to a shared channel that "everyone" monitors, nobody feels responsible.

Diffusion of responsibility means each person assumes someone else will handle it. The alert sits there, unacknowledged.

6. No maintenance windows

Deploying updates? Running database migrations? If alerts fire during expected maintenance, they train your team to ignore alerts during unexpected outages too.


How to Choose the Right Notification Channel

Different channels are appropriate for different situations. Here's how to match them:

Channel           Best For                                            Response Time   Intrusiveness
SMS               Critical issues requiring immediate action          Seconds         High
Email             Non-urgent alerts, daily summaries, documentation   Hours           Low
Slack/Discord     Team awareness, collaborative troubleshooting       Minutes         Medium
Webhooks          Automated responses, ticketing integration          Instant         None (automated)
Microsoft Teams   Enterprise team notifications                       Minutes         Medium

When to use SMS

Reserve SMS for genuine emergencies:

  • Production site completely down
  • Payment processing failing
  • Security incidents
  • SLA-threatening events

SMS should mean "drop what you're doing." If you're sending SMS for warnings, you're training your team to ignore text messages.

When to use email

Email is for things that need attention but not immediately:

  • SSL certificates expiring in 14+ days
  • Weekly uptime reports
  • Performance trend summaries
  • Non-critical service degradation

Email creates a paper trail without demanding immediate attention.

When to use Slack/Discord

Team chat is ideal for:

  • Real-time incident coordination
  • Alerts that benefit from team visibility
  • Issues where multiple people might need to collaborate
  • Non-critical production alerts during business hours

Keep a dedicated alerts channel. Don't mix alerts with general conversation — they'll get lost.

When to use webhooks

Webhooks shine for automation:

  • Creating tickets in your issue tracker
  • Triggering automated remediation scripts
  • Updating external dashboards
  • Feeding data to incident management tools

Webhooks don't cause alert fatigue because they don't interrupt humans directly.
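
As a rough sketch of that pattern, here's a tiny webhook receiver that files a ticket when a monitor reports down. The payload field names and the create_ticket helper are assumptions for illustration; check your provider's actual webhook schema before relying on them:

    from http.server import BaseHTTPRequestHandler, HTTPServer
    import json

    def create_ticket(title, body):
        # Placeholder: call your issue tracker's API here.
        print(f"Ticket created: {title}\n{body}")

    class AlertWebhook(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length) or b"{}")
            # Field names below are hypothetical; adapt them to your provider's payload.
            monitor = payload.get("monitor_name", "unknown monitor")
            status = payload.get("status", "unknown")
            if status == "down":
                create_ticket(f"[ALERT] {monitor} is down", json.dumps(payload, indent=2))
            self.send_response(204)  # acknowledge fast; do heavier work asynchronously
            self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), AlertWebhook).serve_forever()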


Setting Up Effective Alert Thresholds

The key to avoiding alert fatigue is alerting less but alerting smarter.

Distinguish warning from critical

Severity   Definition                                                         Notification
Critical   Customer-impacting, revenue-affecting, needs immediate action      SMS + Slack
Warning    Degraded performance, potential issue, needs investigation soon    Email or Slack only
Info       Notable event, no action needed, useful for context                Dashboard only (no notification)

Most monitoring tools send the same alert regardless of severity. Configure yours to differentiate.
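
Once severities exist, the routing itself is a small lookup. A minimal sketch, with placeholder sender functions standing in for your real SMS, Slack, and email integrations:

    # Map each severity to the channels that should actually be interrupted.
    ROUTES = {
        "critical": ["sms", "slack"],
        "warning":  ["email"],
        "info":     [],  # dashboard only, no notification
    }

    def route_alert(severity, message, senders):
        """Dispatch `message` to every channel configured for this severity."""
        for channel in ROUTES.get(severity, []):
            senders[channel](message)

    # Placeholder senders; swap in your real integrations.
    senders = {
        "sms":   lambda msg: print(f"SMS:   {msg}"),
        "slack": lambda msg: print(f"Slack: {msg}"),
        "email": lambda msg: print(f"Email: {msg}"),
    }
    route_alert("critical", "Checkout is returning 500s", senders)
    route_alert("info", "Nightly backup completed", senders)  # nothing sent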

Set thresholds by page importance

Not all pages deserve the same thresholds:

Page Type           Response Time Warning    Response Time Critical
Checkout/Payment    > 1.5s                   > 3s
Core App Features   > 2s                     > 4s
Marketing Pages     > 3s                     > 6s
Blog/Content        > 4s                     > 8s

Your checkout being slow is an emergency. Your blog being slow is a to-do item.
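
In code, that's just a per-page lookup instead of one global number. A sketch using the illustrative values from the table above:

    # Warning / critical response-time thresholds in seconds, per page class.
    THRESHOLDS = {
        "checkout":  {"warning": 1.5, "critical": 3.0},
        "core_app":  {"warning": 2.0, "critical": 4.0},
        "marketing": {"warning": 3.0, "critical": 6.0},
        "blog":      {"warning": 4.0, "critical": 8.0},
    }

    def classify(page_type, response_time):
        """Return 'critical', 'warning', or 'ok' for a single measurement."""
        limits = THRESHOLDS[page_type]
        if response_time > limits["critical"]:
            return "critical"
        if response_time > limits["warning"]:
            return "warning"
        return "ok"

    print(classify("checkout", 2.1))  # 'warning'
    print(classify("blog", 2.1))      # 'ok' -- the same number means different things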

Use confirmation checks

Don't alert on the first failure. Network blips happen.

Most good monitoring tools can be configured to:

  1. Detect a failure
  2. Wait and check again (confirmation check)
  3. Only alert if the second check also fails

This eliminates most false positives while adding only 1-2 minutes to detection time.
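
If you're scripting your own checks rather than relying on a tool, the same logic is only a few lines. A sketch, assuming a simple HTTP health endpoint and a one-minute confirmation delay:

    import time
    import urllib.request

    def is_down(url, timeout=10):
        """Single probe: True if the request errors out or returns a 4xx/5xx status."""
        try:
            urllib.request.urlopen(url, timeout=timeout)
            return False
        except Exception:
            return True  # connection errors and HTTP error statuses both raise here

    def confirmed_down(url, confirm_delay=60):
        """Only report DOWN if a second check, a minute later, also fails."""
        if not is_down(url):
            return False
        time.sleep(confirm_delay)  # ride out momentary network blips before re-checking
        return is_down(url)

    if confirmed_down("https://example.com/health"):
        print("Confirmed outage, alert now")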

Set percentage-based thresholds

Instead of alerting when response time exceeds X once, alert when it exceeds X for Y% of checks over Z minutes.

Example: "Alert when response time exceeds 3 seconds for more than 50% of checks over a 5-minute window."

This catches sustained problems while ignoring momentary spikes.
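
Implemented as a rolling window, that rule looks roughly like this (the limit, ratio, and window size mirror the example above):

    from collections import deque

    class SustainedSlownessDetector:
        """Fire when more than `ratio` of the last `window` checks exceed `limit` seconds."""

        def __init__(self, limit=3.0, ratio=0.5, window=5):
            self.limit = limit
            self.ratio = ratio
            self.samples = deque(maxlen=window)  # e.g. one sample per one-minute check

        def record(self, response_time):
            """Add a measurement; return True if the sustained-slowness rule fires."""
            self.samples.append(response_time)
            if len(self.samples) < self.samples.maxlen:
                return False  # not enough data for a full window yet
            slow = sum(1 for t in self.samples if t > self.limit)
            return slow / len(self.samples) > self.ratio

    detector = SustainedSlownessDetector()
    for t in [4.2, 1.1, 3.9, 5.0, 4.4]:  # one fast check in an otherwise slow 5 minutes
        fired = detector.record(t)
    print(fired)  # True: 4 of the 5 checks exceeded 3 seconds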


Building an Escalation Strategy

Not every alert needs to wake up your CTO. Build escalation tiers:

Tier 1: First responder (immediate)

  • Primary on-call engineer
  • Gets SMS + Slack for critical alerts
  • Expected response: acknowledge within 5 minutes
  • Responsibility: initial triage and either fix or escalate

Tier 2: Backup responder (5-10 minutes)

  • Secondary on-call or team lead
  • Notified if Tier 1 doesn't acknowledge within 10 minutes
  • Gets SMS
  • Responsibility: take over if primary is unavailable

Tier 3: Leadership (15-30 minutes)

  • Engineering manager or CTO
  • Notified if issue isn't resolved within 30 minutes
  • Gets email + SMS for extended outages
  • Responsibility: resource allocation, customer communication decisions

For small teams

If you're a team of 2-3, escalation still matters:

  1. Primary contact for the week
  2. Backup contact who gets notified after 15 minutes
  3. Everybody gets notified after 30 minutes of unresolved critical issues

Rotate primary responsibility weekly to prevent burnout.
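
Whatever the team size, the mechanics are the same: start a timer per tier when a critical alert fires, and silence the later tiers once someone acknowledges. A rough sketch with placeholder contacts and a stubbed notify function; dedicated on-call tools do this for you, but the logic is no more than this:

    import threading

    def notify(contact, message):
        # Placeholder: send SMS / Slack / email to this contact.
        print(f"Notify {contact}: {message}")

    def escalate(message, tiers, acknowledged):
        """Notify each tier in turn unless the alert has been acknowledged first.

        `tiers` is a list of (delay_in_seconds, contact) pairs, ordered by delay.
        `acknowledged` is a threading.Event set by whoever takes the incident.
        """
        timers = []
        for delay, contact in tiers:
            t = threading.Timer(
                delay,
                lambda c=contact: None if acknowledged.is_set() else notify(c, message),
            )
            t.start()
            timers.append(t)
        return timers  # cancel these (or set the event) once someone acknowledges

    ack = threading.Event()
    escalate(
        "Checkout is down",
        tiers=[(0, "primary on-call"), (600, "backup on-call"), (1800, "engineering manager")],
        acknowledged=ack,
    )
    # ack.set()  # called when the primary acknowledges; later tiers then stay quiet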


Alert Hygiene Best Practices

Maintaining healthy alerts requires ongoing attention:

Review thresholds monthly

What made sense six months ago might not make sense now. Traffic patterns change. Infrastructure scales. Review your thresholds regularly:

  • Are any monitors triggering too often?
  • Are any never triggering? (Maybe thresholds are too loose)
  • Has anything changed about what's critical?

Group related alerts

If your database goes down, you'll get alerts from:

  • The database monitor
  • Every application that depends on the database
  • Every API endpoint that queries the database

That's potentially dozens of alerts for one root cause. Use alert grouping or suppression to surface one "Database down" alert, not fifty "endpoint failed" alerts.
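
One simple way to approximate grouping is to tag each monitor with the upstream service it depends on and collapse simultaneous failures by that tag. A sketch with made-up monitor names and dependencies:

    from collections import defaultdict

    # Each monitor is tagged with the upstream service it ultimately depends on.
    # Names and dependencies here are made up for illustration.
    DEPENDS_ON = {
        "orders-api":     "postgres-main",
        "reports-api":    "postgres-main",
        "user-profile":   "postgres-main",
        "postgres-main":  "postgres-main",
        "marketing-site": "marketing-site",
    }

    def group_alerts(failing_monitors):
        """Collapse a burst of failures into one alert per root dependency."""
        groups = defaultdict(list)
        for monitor in failing_monitors:
            groups[DEPENDS_ON.get(monitor, monitor)].append(monitor)
        return [
            f"{root} degraded ({len(members)} dependent monitor(s) failing)"
            for root, members in groups.items()
        ]

    print(group_alerts(["orders-api", "reports-api", "user-profile", "postgres-main"]))
    # ['postgres-main degraded (4 dependent monitor(s) failing)']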

Schedule maintenance windows

Before planned maintenance:

  1. Pause affected monitors or suppress alerts
  2. Communicate the maintenance window to the team
  3. Re-enable monitoring after maintenance completes
  4. Verify everything is working before walking away

This prevents "cry wolf" situations during expected downtime.
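
If your tool doesn't support scheduled pauses, a suppression check in your own notification path does the job. A sketch with an illustrative window definition (times in UTC):

    from datetime import datetime, timezone

    # Planned maintenance windows: (monitor name, start, end). Illustrative values.
    MAINTENANCE_WINDOWS = [
        ("orders-api",
         datetime(2025, 12, 14, 2, 0, tzinfo=timezone.utc),
         datetime(2025, 12, 14, 4, 0, tzinfo=timezone.utc)),
    ]

    def suppressed(monitor, now=None):
        """True if this monitor's alerts should be muted right now."""
        now = now or datetime.now(timezone.utc)
        return any(name == monitor and start <= now <= end
                   for name, start, end in MAINTENANCE_WINDOWS)

    def maybe_alert(monitor, message):
        if suppressed(monitor):
            print(f"(suppressed during maintenance) {monitor}: {message}")
            return
        print(f"ALERT {monitor}: {message}")

    maybe_alert("orders-api", "HTTP 503")  # muted inside the window, alerts outside it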

Clean up stale monitors

Decommissioned a service? Removed a feature? Delete the monitor.

Old monitors for things that no longer exist (or no longer matter) add noise without value. Audit your monitor list quarterly.

Create runbooks for common alerts

When an alert fires, the responder should know:

  • What does this alert mean?
  • What's the likely cause?
  • What are the first three troubleshooting steps?
  • When should this be escalated?

Document this for each critical alert. Faster triage means faster resolution.
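
Runbooks don't need special tooling; even structured entries checked into the repo work, as long as they're linked from the alert itself. A sketch of one hypothetical entry:

    # One runbook entry per critical alert, kept next to the alert configuration.
    RUNBOOKS = {
        "checkout-down": {
            "meaning": "The synthetic checkout transaction failed on two consecutive checks.",
            "likely_causes": ["payment gateway outage", "bad deploy", "exhausted DB connection pool"],
            "first_steps": [
                "Check the payment provider's status page.",
                "Check whether a deploy went out in the last 30 minutes; roll back if so.",
                "Check database connection counts and application error logs.",
            ],
            "escalate_when": "Not resolved within 15 minutes, or customer payments are failing.",
        },
    }

    def show_runbook(alert_name):
        """Print the runbook entry for the alert that just fired."""
        entry = RUNBOOKS[alert_name]
        print(entry["meaning"])
        for step in entry["first_steps"]:
            print(f"  - {step}")
        print(f"Escalate when: {entry['escalate_when']}")

    show_runbook("checkout-down")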


The Alert Volume Formula

Here's a simple rule of thumb:

If your team receives more than 5-10 actionable alerts per day, you have too many alerts.

Actionable means someone should do something about it. Not "noted for later." Not "interesting." Actually do something.

More than that, and alerts become background noise. Fewer than that, and each alert gets the attention it deserves.

Work backward from this target:

  1. Count your current daily alert volume
  2. Identify which alerts are actually actionable
  3. Either eliminate non-actionable alerts or downgrade them so they no longer send notifications
  4. Repeat until you're under the threshold
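
If your alerts land in a log or CSV export, steps 1 and 2 can be scripted. A sketch, assuming a hypothetical export with an action_taken column; adapt the file layout to whatever your tooling actually produces:

    import csv
    from collections import Counter

    def audit(path="alerts_last_7_days.csv"):
        """Count total vs. actionable alerts from an export with an 'action_taken' column."""
        totals = Counter()
        with open(path) as f:
            for row in csv.DictReader(f):
                totals["all"] += 1
                if row.get("action_taken", "").strip().lower() == "yes":
                    totals["actionable"] += 1
        print(f"{totals['all']} alerts total ({totals['all'] / 7:.1f}/day), "
              f"{totals['actionable']} actionable ({totals['actionable'] / 7:.1f}/day)")
        if totals["actionable"] / 7 > 10:
            print("Above the 5-10 actionable alerts/day guideline -- time to prune.")

    # audit()  # point this at your own export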

How Webalert Helps You Avoid Alert Fatigue

Webalert is designed with alert hygiene in mind:

Multiple notification channels

Route critical alerts to SMS while sending warnings to email. Match the channel to the severity without complex configuration.

Per-monitor configuration

Set different thresholds, check intervals, and notification channels for each monitor. Your checkout page can alert differently than your blog.

Built-in confirmation checks

Webalert automatically confirms failures before alerting, eliminating most false positives from network blips.

Team notifications

Add multiple recipients to alerts. Route different monitors to different team members based on ownership.

Status pages

Public status pages reduce "Is it down?" questions from customers and teammates. Fewer manual checks mean less fatigue.

Clean, simple alerting

No complex alert rules to configure. No enterprise bloat. Just straightforward notifications when things actually break.


Quick Alert Health Check

Answer these questions:

  1. How many alerts did your team receive in the last 7 days?
  2. How many of those required action?
  3. What's your average time to acknowledge a critical alert?
  4. When did you last review and adjust your thresholds?
  5. Do you have different severity levels configured?

If your answers concern you, it's time to tune your alerting.


Final Thoughts

The goal of monitoring isn't to generate alerts. It's to catch problems before they hurt your customers.

When alerts become noise, they stop doing their job. Your team learns to ignore them, and you're back to finding out about outages from angry tweets.

The fix isn't monitoring less. It's monitoring smarter:

  • Alert on what matters, not everything
  • Use the right channel for each severity
  • Set thresholds that reflect actual business impact
  • Build escalation paths that ensure coverage
  • Maintain your alerts like you maintain your code

Done right, each alert your team receives is worth their immediate attention. And when that critical alert fires at 3 AM, they'll actually wake up and respond — because they trust that it matters.


Ready to set up alerts that actually get acted on?

Start monitoring for free with Webalert →

Multi-channel notifications. Smart alerting. No noise.

Written by

Webalert Team

The Webalert team is dedicated to helping businesses keep their websites online and their users happy with reliable monitoring solutions.
