
"99.9% uptime" sounds impressive. So does "99.99%." But what do those numbers actually mean for your business — and is the extra nine worth the extra cost?
Most teams pick an SLA without understanding the real difference in allowed downtime. A fraction of a percentage point can mean hours of outage per year. In this guide, we'll break down uptime SLAs so you can choose the right target and hold providers (and yourself) accountable.
What Is an Uptime SLA?
A Service Level Agreement (SLA) is a contract that defines the expected level of service. For uptime, it specifies:
- Availability target — e.g., 99.9%
- How downtime is measured — usually from the customer's perspective
- Exclusions — planned maintenance, force majeure, customer-side issues
- Remedies — credits or penalties if the target is missed
The availability percentage is uptime over a period, usually calculated monthly or annually:
Availability % = (Total time − Downtime) / Total time × 100
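As a quick illustration of the formula, here is a minimal Python sketch. The function name `availability_percent` and the 30-day month are assumptions chosen for the example, not part of any particular SLA.

```python
def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Availability % = (total time - downtime) / total time * 100."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime_minutes must be between 0 and total_minutes")
    return (total_minutes - downtime_minutes) / total_minutes * 100


# Assumption: a 30-day month, which has 30 * 24 * 60 = 43,200 minutes.
MINUTES_IN_30_DAY_MONTH = 30 * 24 * 60

# 43.2 minutes of downtime in that month works out to roughly 99.9%.
print(round(availability_percent(MINUTES_IN_30_DAY_MONTH, 43.2), 3))
```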
The Downtime Math: What Each Tier Allows
A 365-day year has 525,600 minutes. Here's how much downtime each SLA tier permits (monthly figures use an average month of one-twelfth of a year):
| SLA | Allowed downtime per year | Allowed downtime per month | Allowed downtime per week |
|---|---|---|---|
| 99% | ~3 days, 15.6 hours (87.6 hours) | ~7.3 hours | ~1.7 hours |
| 99.9% ("three nines") | 8.76 hours | ~43.8 minutes | ~10 minutes |
| 99.95% | 4.38 hours | ~21.9 minutes | ~5 minutes |
| 99.99% ("four nines") | 52.56 minutes | ~4.4 minutes | ~1 minute |
| 99.999% ("five nines") | 5.26 minutes | ~26 seconds | ~6 seconds |
99.9% is the most common target for business-critical apps. 99.99% is typical for payment, auth, or core infrastructure. 99.999% is reserved for life-critical or highly regulated systems.
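The figures in the table can be reproduced with a few lines of Python. This is a sketch under the same assumptions as the table (a 365-day year, a month as one-twelfth of a year); the helper name `allowed_downtime_minutes` is made up for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600
MINUTES_PER_WEEK = 7 * 24 * 60    # 10,080


def allowed_downtime_minutes(sla_percent: float, period_minutes: float) -> float:
    """Minutes of downtime a given SLA target permits over a period."""
    return period_minutes * (1 - sla_percent / 100)


for sla in (99.0, 99.9, 99.95, 99.99, 99.999):
    per_year = allowed_downtime_minutes(sla, MINUTES_PER_YEAR)
    per_month = per_year / 12  # average month
    per_week = allowed_downtime_minutes(sla, MINUTES_PER_WEEK)
    print(
        f"{sla}%: {per_year:.2f} min/year, "
        f"{per_month:.2f} min/month, {per_week:.2f} min/week"
    )
```

Running the loop for a tier that is not in the table (say 99.5%) is an easy way to sanity-check a provider's quoted numbers.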
99.9% vs 99.99%: The Practical Difference
Going from 99.9% to 99.99% doesn't sound like much. It's one more nine. But the impact is large:
| Metric | 99.9% | 99.99% |
|---|---|---|
| Downtime per year | 8.76 hours | 52.56 minutes |
| Downtime per month | ~44 minutes | ~4.4 minutes |
| Longest single outage that still meets an annual SLA | ~8.76 hours (the entire yearly budget) | ~53 minutes (the entire yearly budget) |
In practice:
- 99.9%: You can have a few multi-hour incidents per year and still meet the SLA.
- 99.99%: A single outage longer than ~53 minutes, or several shorter ones that add up to it, puts you in breach for the year. You need fast detection and resolution.
So 99.99% doesn't just mean "a bit more uptime." It means you must detect and fix issues in minutes, not hours. That usually requires better monitoring, on-call processes, and redundancy.
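To make the "minutes, not hours" point concrete, the sketch below counts how many incidents of a given length fit into a yearly downtime budget. The function name and the incident durations are hypothetical, and it assumes an annual measurement window.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60


def incidents_within_budget(sla_percent: float, minutes_per_incident: float) -> int:
    """How many incidents of a given length fit in the yearly downtime budget."""
    budget = MINUTES_PER_YEAR * (1 - sla_percent / 100)
    return math.floor(budget / minutes_per_incident)


# Two-hour incidents: a 99.9% target absorbs a few, a 99.99% target absorbs none.
print(incidents_within_budget(99.9, 120))   # 4
print(incidents_within_budget(99.99, 120))  # 0

# At 99.99%, recovery has to take minutes for the budget to cover several incidents.
print(incidents_within_budget(99.99, 10))   # 5
```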
How SLAs Are Measured (And Gamed)
Monthly vs annual
Many SLAs are calculated per month. That means:
- One bad month can trigger a credit even if the rest of the year was perfect.
- One perfect month doesn't protect you if the next month has a long outage.
Check whether your SLA is monthly or rolling annual.
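The difference between the two windows shows up quickly with numbers. In this sketch, the downtime figures are invented and the month length is an average (one-twelfth of a 365-day year); a single 90-minute outage breaches a monthly 99.9% SLA while the annual figure still passes.

```python
TARGET = 99.9
MINUTES_PER_MONTH = 365 * 24 * 60 / 12  # average month: 43,800 minutes


def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    return (total_minutes - downtime_minutes) / total_minutes * 100


# Hypothetical year: eleven perfect months and one 90-minute outage in March.
monthly_downtime = [0, 0, 90, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# Months (1-12) whose availability falls below the target on a monthly SLA.
monthly_breaches = [
    month
    for month, down in enumerate(monthly_downtime, start=1)
    if availability_percent(MINUTES_PER_MONTH, down) < TARGET
]

# The same downtime measured over the whole year.
annual = availability_percent(MINUTES_PER_MONTH * 12, sum(monthly_downtime))

print("Months in breach:", monthly_breaches)
print(f"Annual availability: {annual:.3f}% (target met: {annual >= TARGET})")
```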
Excluded downtime
Typical exclusions:
- Planned maintenance (if announced in advance)
- Force majeure (natural disasters, war, etc.)
- Customer-caused (wrong config, DDoS you didn't mitigate)
- Third-party (e.g. "we're not responsible if AWS is down")
Read the exclusions. They often swallow a big share of real-world downtime.
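A rough way to see the effect of exclusions is to compute availability twice: once over all downtime, and once over only the downtime the contract counts. The incident data and cause labels below are hypothetical, and real contracts define exclusions in their own terms.

```python
from dataclasses import dataclass

MINUTES_IN_MONTH = 30 * 24 * 60  # assumption: a 30-day month

# Hypothetical labels for causes the contract excludes.
EXCLUDED_CAUSES = {"planned_maintenance", "force_majeure", "customer", "third_party"}


@dataclass
class Incident:
    minutes: float
    cause: str


def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    return (total_minutes - downtime_minutes) / total_minutes * 100


incidents = [
    Incident(30, "provider_fault"),
    Incident(120, "planned_maintenance"),
    Incident(45, "third_party"),
]

raw_down = sum(i.minutes for i in incidents)
counted_down = sum(i.minutes for i in incidents if i.cause not in EXCLUDED_CAUSES)

print(f"What users experienced: {availability_percent(MINUTES_IN_MONTH, raw_down):.3f}%")
print(f"What the SLA counts:    {availability_percent(MINUTES_IN_MONTH, counted_down):.3f}%")
```

With these made-up numbers, users saw roughly 99.55% while the contractual figure stays above 99.9%, so no credit would be due.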
Measurement method
- Probe-based — Provider pings your endpoint from their locations. Can miss regional or routing issues.
- Synthetic monitoring — Scripts simulate user flows. Closer to real experience.
- Real user monitoring (RUM) — Actual user sessions. Most accurate but harder to contract on.
Your own monitoring (e.g. from multiple regions) may show more downtime than the provider's SLA measurement. That's why you need independent monitoring to verify SLAs and improve reliability.
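Here is a small sketch of why the two views can disagree. It assumes you already collect one pass/fail check per minute from several regions; the region names and results are invented. A single-region probe reports a clean record, while counting a minute as "up" only when every region succeeded gives a much lower figure.

```python
# Hypothetical probe results: True = check passed, one entry per minute.
results = {
    "us-east": [True, True, True, True, True, True],
    "eu-west": [True, False, False, True, True, True],
    "ap-south": [True, True, False, True, True, True],
}


def availability_from_checks(checks: list[bool]) -> float:
    """Share of passed checks, as a percentage."""
    return sum(checks) / len(checks) * 100


def all_regions_up(per_region: dict[str, list[bool]]) -> list[bool]:
    """A minute counts as up only if every region's check passed."""
    return [all(minute) for minute in zip(*per_region.values())]


print(f"Single-region view (us-east): {availability_from_checks(results['us-east']):.1f}%")
print(f"Multi-region view:            {availability_from_checks(all_regions_up(results)):.1f}%")
```

Whether a regional failure should count as full downtime is a design choice; a stricter or looser rule (for example, "down when a majority of regions fail") would change the result.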
Choosing the Right Uptime Target
When 99.9% is enough
- Internal tools, staging, non-critical APIs
- Blogs, marketing sites, low-transaction sites
- Early-stage products where speed of iteration matters more than perfect uptime
When to aim for 99.99%
- Payment, billing, or checkout flows
- Auth and login
- Core product used by paying customers
- APIs that other businesses depend on
When 99.999% is considered
- Banking, healthcare, safety-critical systems
- Regulated industries with strict availability requirements
- Extremely high revenue per minute of downtime
For most SaaS and web apps, 99.9% is the baseline and 99.99% is the goal for critical paths.
Monitoring Your Own SLA
If you promise 99.9% or 99.99%, you need to:
- Measure continuously — With checks at least every 1–5 minutes so you don't miss short outages.
- Measure from the user's perspective — HTTP(s), key pages, critical APIs.
- Track cumulative downtime — Know how many minutes you've "spent" each month/year.
- Alert before you breach — If you're approaching your allowed downtime, escalate.
Tools like Webalert give you uptime percentages, incident history, and status pages so you can report honestly to customers and improve over time.
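Tracking cumulative downtime and alerting before a breach can be sketched as a simple budget check. This is not tied to any specific tool; the function name, the status strings, and the 75% warning threshold are assumptions you would tune to your own process.

```python
def budget_status(
    sla_percent: float,
    period_minutes: float,
    downtime_so_far: float,
    warn_fraction: float = 0.75,  # assumption: escalate at 75% of the budget
) -> str:
    """Classify how much of the downtime budget has been spent this period."""
    budget = period_minutes * (1 - sla_percent / 100)
    used = downtime_so_far / budget
    if used >= 1:
        return "breached"
    if used >= warn_fraction:
        return "escalate"
    return "ok"


MINUTES_IN_MONTH = 30 * 24 * 60  # assumption: a 30-day month, budget of ~43.2 minutes at 99.9%

print(budget_status(99.9, MINUTES_IN_MONTH, 10))  # ok
print(budget_status(99.9, MINUTES_IN_MONTH, 35))  # escalate
print(budget_status(99.9, MINUTES_IN_MONTH, 50))  # breached
```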
SLA Credits: What They're Really Worth
Many providers offer service credits when they miss the SLA (e.g. 10% off next invoice). That can sound good until you do the math:
- Lost revenue or trust from an 8-hour outage often far exceeds 10% of your hosting bill.
- Credits don't fix the incident; they only soften the bill.
Treat credits as the minimum remedy, not as real compensation. The real goal is fewer and shorter outages through better architecture and monitoring.
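The math is simple enough to sketch. All figures below (the monthly bill, the credit percentage, and the revenue per hour) are hypothetical placeholders; substitute your own.

```python
def credit_vs_loss(
    monthly_bill: float,
    credit_percent: float,
    outage_hours: float,
    revenue_per_hour: float,
) -> tuple[float, float]:
    """Return (service credit, estimated lost revenue) for one outage."""
    credit = monthly_bill * credit_percent / 100
    loss = outage_hours * revenue_per_hour
    return credit, loss


# Hypothetical: $2,000/month hosting bill, 10% credit, 8-hour outage, $500/hour revenue.
credit, loss = credit_vs_loss(2000, 10, 8, 500)
print(f"Credit: ${credit:,.0f}  Lost revenue: ${loss:,.0f}")
```

With these placeholder numbers the credit covers about 5% of the lost revenue, and that is before counting lost customer trust.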
Summary: Key Takeaways
- 99.9% ≈ 8.76 hours downtime/year. 99.99% ≈ 53 minutes/year.
- The jump from 99.9% to 99.99% demands much faster detection and recovery.
- Read how the SLA is measured and what's excluded.
- Match the target to the business impact: 99.9% as the baseline, 99.99% for critical paths.
- Measure your own uptime independently and aim to beat your stated SLA.