How to Evaluate a Vendor SLA: What to Look For

Webalert Team
June 9, 2026

"They offer a 99.9% uptime SLA" is the line that ends most vendor reliability discussions. It shouldn't even start one. A service level agreement is a legal-commercial document with a dozen levers that matter far more than the headline percentage — how "available" is defined, what's excluded, what you actually get when it's breached, and whether the burden of proving the breach falls on you. Two vendors can both advertise "99.9%" and offer wildly different real protection.

This guide is a buyer's checklist: how to read and evaluate a vendor SLA before you sign, the fine print that quietly guts the guarantee, and the questions to ask. It's the procurement-side companion to "Cloud SLAs compared" (what AWS/Azure/GCP actually promise) and "uptime SLA explained" (the percentage-to-downtime math).


First: An SLA Is a Refund Policy, Not a Guarantee

The most important reframe before you evaluate anything. An SLA does not promise the service won't go down. It defines a target and what the vendor owes you if they miss it — almost always service credits, not cash and never your lost revenue.

So you're not evaluating "will this stay up?" You're evaluating two separate things:

  1. How confident is the vendor? — expressed as the target and the strength of its definitions.
  2. What protection do I actually have when they miss? — the credits, the exclusions, and the claim process.

A generous-sounding percentage wrapped in broad exclusions and a one-sided claim process is worth less than a lower number with honest definitions and automatic credits. Read for the whole machine, not the number.


The Headline Number — and Why It's the Least Important Part

Start by translating the percentage into concrete allowed downtime, because "99.9%" doesn't feel like anything until you do:

SLA                      Downtime / month   Downtime / year
99% ("two nines")        ~7h 18m            ~3d 15h
99.9% ("three nines")    ~43m 50s           ~8h 46m
99.95%                   ~21m 54s           ~4h 23m
99.99% ("four nines")    ~4m 23s            ~52m 36s
99.999% ("five nines")   ~26s               ~5m 15s

Use the downtime calculator guide for any value in between. Note that the jump from 99% to 99.9% cuts allowed downtime tenfold: a vendor quoting a bare "99%" is promising more than three and a half days of allowed downtime a year, against under nine hours at 99.9%. Once you've translated the number, set it aside. The definitions decide whether it means anything.
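For targets that are not in the table, the translation is a one-line calculation. The sketch below assumes an average year of 365.25 days (8,766 hours) and an average month of one twelfth of that, which is what the rounded table values correspond to. A real SLA may count the actual minutes in each calendar month instead, so treat the function name and constants here as illustrative.

```python
from datetime import timedelta

HOURS_PER_YEAR = 365.25 * 24            # 8766 h, average year (assumption)
HOURS_PER_MONTH = HOURS_PER_YEAR / 12   # 730.5 h, average month (assumption)


def allowed_downtime(sla_percent: float, period_hours: float) -> timedelta:
    """Return the downtime an SLA target permits over a period of the given length."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    unavailable_fraction = 1 - sla_percent / 100
    return timedelta(hours=period_hours * unavailable_fraction)


for target in (99.0, 99.9, 99.95, 99.99, 99.999):
    monthly = allowed_downtime(target, HOURS_PER_MONTH)
    yearly = allowed_downtime(target, HOURS_PER_YEAR)
    print(f"{target}%  per month: {monthly}  per year: {yearly}")
```

Passing the contract's own period length (for example, the hours in a 30-day billing month) gives the figure that vendor would actually be held to.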


What to Actually Look For

1) How is "available" defined?

This is the single highest-leverage clause. A vendor measures their definition of up, which can be much narrower than your users' experience:

  • What counts as downtime? Total unreachability only, or also severe degradation, elevated error rates, and slow responses? Many SLAs only count a complete outage, so a service throwing 30% errors or running 10x slow is "available."
  • Measured from where? Their internal monitoring, a single region, or true outside-in checks? Self-measured availability flatters the vendor.
  • What's the measurement window? Monthly is standard and friendlier to you; an annual window lets a long outage hide inside a big denominator.
  • Per-service or whole-platform? A platform-wide number can stay green while the specific API you depend on is down.
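To see how much the definition of downtime moves the number, the same month of monitoring data can be scored twice. This is a minimal sketch with made-up data: the Minute record, the 5% error-rate threshold, and the 2,000 ms latency threshold are all hypothetical, and a real SLA would state its own conditions.

```python
from dataclasses import dataclass


@dataclass
class Minute:
    reachable: bool        # did the probe get any response at all?
    error_rate: float      # fraction of failed requests in that minute
    p95_latency_ms: float  # 95th percentile response time in that minute


def availability(minutes, is_down) -> float:
    """Percentage of minutes NOT counted as downtime under a given definition."""
    down = sum(1 for m in minutes if is_down(m))
    return 100 * (1 - down / len(minutes))


def outage_only(m: Minute) -> bool:
    # Narrow definition: only total unreachability counts.
    return not m.reachable


def outage_or_degraded(m: Minute) -> bool:
    # Broader definition with hypothetical thresholds for errors and latency.
    return (not m.reachable) or m.error_rate > 0.05 or m.p95_latency_ms > 2000


# Made-up 30-day month: 10 minutes fully down, 120 minutes at a 30% error rate.
month = (
    [Minute(True, 0.0, 150.0)] * 43_070
    + [Minute(True, 0.30, 150.0)] * 120
    + [Minute(False, 1.0, 0.0)] * 10
)

print(f"outage only:        {availability(month, outage_only):.3f}%")
print(f"outage or degraded: {availability(month, outage_or_degraded):.3f}%")
```

With this sample data the narrow definition stays above a 99.9% target while the broader one falls below it, even though the underlying month is identical.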

2) What do you get when they miss — and is it automatic?

  • Credit size and tiers. Usually a percentage of the affected fee, scaling with how badly the target was missed. Model the worst case: a catastrophic month often caps at 25–50% of that service's monthly fee — typically pennies against your real loss.
  • Credits vs. cash. Almost always credits against future bills. If you're planning to leave, a credit is worthless.
  • Automatic or claim-based? The best SLAs apply credits automatically. Most make you detect the breach, document it, and file a claim within a window (often 30–60 days) — miss it and you forfeit a legitimate credit.
  • Is there an exit right? Strong SLAs include a termination-for-chronic-failure clause (e.g. three breaches in a quarter lets you leave without penalty). That's worth more than the credits.
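Modelling the worst case in real dollars takes only a few lines. The credit schedule, the monthly fee, and the hourly outage cost below are all hypothetical placeholders; substitute the tiers from the actual SLA and your own cost estimate.

```python
# Hypothetical tiered schedule, worst tier first: (availability floor %, credit %).
CREDIT_TIERS = [
    (95.0, 50.0),  # below 95.0% -> 50% of the monthly fee (the cap)
    (99.0, 25.0),  # below 99.0% -> 25%
    (99.9, 10.0),  # below 99.9% -> 10%
]

HOURS_PER_MONTH = 730.5  # average month, assuming a 365.25-day year


def service_credit(availability_pct: float, monthly_fee: float) -> float:
    """Credit owed for one month under the hypothetical schedule above."""
    for floor, credit_pct in CREDIT_TIERS:
        if availability_pct < floor:
            return monthly_fee * credit_pct / 100
    return 0.0  # target met, no credit


def availability_after_outage(outage_hours: float) -> float:
    """Monthly availability percentage after a given number of outage hours."""
    return 100 * (1 - outage_hours / HOURS_PER_MONTH)


# Illustrative inputs only: a $2,000/month service and a $5,000/hour outage cost.
monthly_fee = 2_000.0
loss_per_hour = 5_000.0
outage_hours = 48.0

measured = availability_after_outage(outage_hours)
credit = service_credit(measured, monthly_fee)
loss = outage_hours * loss_per_hour
print(f"availability {measured:.2f}%  credit ${credit:,.0f}  loss ${loss:,.0f}")
```

Running the same function across a range of outage lengths shows quickly where the credit caps out while the loss keeps growing.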

3) The exclusions — where guarantees go to die

Every SLA carves out categories that don't count against the vendor. Read these first; they define the real guarantee:

  • Scheduled maintenance. How much notice? Is there a cap on maintenance hours? Unbounded "scheduled" windows can swallow your uptime legitimately.
  • Beta / preview features — usually no SLA at all. Know which parts of what you're buying are uncovered.
  • Third-party and "outside our control" — internet, DNS, upstream providers, force majeure. Reasonable, but watch how broadly it's drawn.
  • Your fault — misconfiguration, exceeding limits, your own integration. Fair, but make sure it's not written so broadly it excuses the vendor's own errors.
  • Suspended/unpaid accounts — no credits while in dispute or arrears.

A short, specific exclusions list is a good sign. A long, vague one ("including but not limited to…") means the number is mostly decorative.
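The gap between what users experienced and what the contract counts can be estimated by tagging each incident with a cause and scoring the month both ways. The sketch below is illustrative: the cause labels, the excluded set, and the incident durations are invented, and it assumes excluded time is simply not counted as downtime (some SLAs also remove it from the measurement period).

```python
from dataclasses import dataclass


@dataclass
class Incident:
    minutes: int
    cause: str  # hypothetical labels, e.g. "unplanned", "scheduled_maintenance"


# Hypothetical exclusion list; read the real one from the SLA.
EXCLUDED_CAUSES = {"scheduled_maintenance", "third_party"}


def experienced_availability(incidents, period_minutes: int) -> float:
    """Availability as users saw it: every minute of downtime counts."""
    down = sum(i.minutes for i in incidents)
    return 100 * (1 - down / period_minutes)


def contractual_availability(incidents, period_minutes: int, excluded) -> float:
    """Availability as the SLA scores it: excluded causes do not count."""
    down = sum(i.minutes for i in incidents if i.cause not in excluded)
    return 100 * (1 - down / period_minutes)


period = 30 * 24 * 60  # a 30-day month in minutes
incidents = [
    Incident(20, "unplanned"),
    Incident(180, "scheduled_maintenance"),
    Incident(45, "third_party"),
]

print(f"experienced: {experienced_availability(incidents, period):.3f}%")
print(f"contractual: {contractual_availability(incidents, period, EXCLUDED_CAUSES):.3f}%")
```

In this invented month the vendor meets a 99.9% target on paper while users saw well under it, which is the effect a long exclusions list produces.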

4) Support, response, and resolution times

Availability isn't the only commitment. Look for:

  • Response-time targets by severity (time to first human response) — and whether they're guaranteed or "targets."
  • Resolution-time / restoration expectations, and how severity levels are defined (who decides a P1?).
  • Support coverage — 24/7 vs business hours, and in whose timezone.
  • Incident communication — do they commit to status-page updates and post-incident reviews? See incident communication.

5) Definitions you should map to your own metrics

If the SLA mentions MTTR, RTO, or RPO, make sure they mean what you think; see "reliability metrics explained" and "RTO vs RPO". For anything involving data, the recovery objectives (how much data loss, how long to restore) often matter more than the uptime percentage.


Composite SLAs: Your Real Target Is Lower Than Any One Vendor's

The calculation almost no one runs. When your product depends on several vendors in series, meaning a failure in any one of them takes you down, their availabilities multiply (assuming their failures are independent):

Effective SLA = SLA_vendor_A × SLA_vendor_B × SLA_vendor_C × ...

Three "99.9%" dependencies in your critical path:

0.999 × 0.999 × 0.999 = 0.997  →  99.7%

Three three-nines vendors chained together yield 99.7%, which is roughly 26 hours of allowed annual downtime, not 8h 46m. Add an auth provider, a payment gateway, a CDN, and an email service, and your real externally imposed ceiling erodes further. Evaluate the SLA of every vendor in your request path, then compute the product. That number, not any single vendor's promise, is your reliability ceiling, so set your own SLO and error budget at or below it.
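The product is easy to compute for your own dependency list. This sketch assumes every vendor sits in series on the critical path and that their failures are independent; it does not model redundancy, where a fallback vendor would raise the ceiling rather than lower it. The function name and the vendor list are illustrative.

```python
from math import prod

HOURS_PER_YEAR = 365.25 * 24  # 8766 h, average year (assumption)


def composite_sla(percentages) -> float:
    """Combined availability (%) of serial, independent dependencies."""
    return 100 * prod(p / 100 for p in percentages)


def annual_downtime_hours(sla_percent: float) -> float:
    """Allowed downtime per year, in hours, for a given availability target."""
    return HOURS_PER_YEAR * (1 - sla_percent / 100)


# Illustrative path: three core vendors, then the same three plus four more.
three_vendors = [99.9, 99.9, 99.9]
seven_vendors = three_vendors + [99.9, 99.9, 99.9, 99.9]

for name, path in (("three vendors", three_vendors), ("seven vendors", seven_vendors)):
    combined = composite_sla(path)
    print(f"{name}: {combined:.2f}%  ~{annual_downtime_hours(combined):.1f} h/year")
```

Mixing targets works the same way: pass each vendor's own percentage and the lowest ones will dominate the result.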


Questions to Ask the Vendor

Bring these to the procurement call; the answers (and how readily they're given) tell you more than the document:

  • Exactly what conditions count as "downtime," and is degraded performance included?
  • Is availability measured per-service or platform-wide, and from where?
  • Are credits applied automatically, or must we file a claim — and what's the window?
  • What's the maximum credit in a catastrophic month, as a real dollar figure?
  • Can we terminate without penalty after repeated breaches?
  • How much scheduled-maintenance time is excluded, and how much notice do we get?
  • What are your response and restoration targets by severity, and are they guaranteed?
  • Can you share the last 12 months of actual uptime and a recent post-incident review?
  • Will you measure against our monitoring as well as your own?

That last point matters: a vendor confident in their reliability won't object to you holding independent evidence.


Quick Reference

  • An SLA is a service-credit refund policy, not a guarantee of uptime or your losses.
  • The definition of "available" and the exclusions matter more than the headline percentage.
  • Prefer automatic credits, a monthly window, and a termination-for-chronic-failure clause.
  • Model the worst-case credit in real dollars — it's usually tiny next to outage cost.
  • Composite SLAs multiply down — your real ceiling is the product of every vendor in the path.
  • You can only claim what you can prove — independent monitoring is the evidence.

How Webalert Helps

An SLA is only worth what you can detect and prove. Webalert is the independent witness that turns a vendor's promise into something enforceable:

  • Outside-in uptime monitoring that measures what your users (and the vendor) actually experience — not the vendor's self-graded metric.
  • Multi-region checks to catch regional and last-mile failures a vendor's internal monitoring ignores — see multi-region monitoring.
  • Timestamped incident history you can attach to a service-credit claim before the window closes.
  • Response-time and error monitoring so you can hold a vendor to a degradation clause, not just total outages.
  • Status pages and SLA reporting to track each vendor's real delivered availability against the number they promised.

Pair it with the downtime cost calculator guide so you can size redundancy — and judge any SLA's credits — against what an outage actually costs you.


Summary

A vendor SLA is a refund policy dressed as a guarantee. Evaluating one well means looking past the percentage to how "available" is defined, what's excluded, what you actually receive on a breach, and whether you or the vendor bears the burden of proof. Translate the number into real downtime, model the worst-case credit in dollars, multiply the SLAs of every vendor in your path to find your true ceiling, and insist on definitions you can independently verify.

Sign the SLA for the definitions and the exit rights, not the headline number — and monitor independently, because the only availability figure that protects you is the one you can prove.

