Nginx Monitoring: Uptime, Errors, and Performance

Nginx handles the first connection for millions of websites. It serves static files, terminates SSL, balances load across backends, and proxies requests to application servers. When Nginx has a problem, everything behind it is unreachable.

A misconfigured upstream block, an expired SSL certificate, a full disk preventing log writes, or a worker process crash can take down your entire stack — even if the application server is perfectly healthy.

This guide covers how to monitor Nginx from the outside (what users experience) and what to watch for so you catch problems before they cascade.

Why Nginx Monitoring Matters

Nginx sits at the edge of your infrastructure. It is the first thing a user's browser connects to and the last thing between your application and the internet. This position makes it both critical and a single point of failure.

Common Nginx failure scenarios:

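Configuration error after reload — A syntax error makes the reload fail and leaves the old configuration running, while a valid but wrong directive (for example, a mistyped upstream port) makes Nginx serve the wrong content or return 502 errors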
Upstream server unreachable — Nginx cannot connect to the backend application, returning 502 Bad Gateway
SSL certificate expired — Browsers refuse to connect, showing security warnings
Worker process crash — Under high load or memory pressure, worker processes die and connections are dropped
Disk full — Nginx cannot write logs or cache files, causing unpredictable behavior
Rate limiting misconfigured — Legitimate users are blocked by overly aggressive rate limits
DNS resolution failure — Nginx cannot resolve upstream hostnames at startup or during reload

Each of these can happen while the underlying application is running fine. Monitoring only the application misses the entire Nginx layer.

What to Monitor

1) HTTP Endpoint Availability

The most important check: can a user reach your site through Nginx?

HTTPS check on your domain — Verify Nginx returns a 200 status on your main URL
Content validation — Confirm the response contains expected content, not an Nginx error page
Multiple endpoints — Check both static assets and proxied paths to verify both Nginx and the upstream

An HTTP check catches the majority of Nginx failures because Nginx is the component serving the response. If Nginx is down, misconfigured, or cannot reach the upstream, the check fails.
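As a rough illustration, a status check combined with content validation can be sketched in Python with only the standard library. The URL and marker text are placeholders, and `evaluate_response` and `check_url` are hypothetical helper names, not part of any monitoring product.

```python
import urllib.error
import urllib.request


def evaluate_response(status, body, expected_text):
    """Return (ok, reason) for one HTTP response."""
    if status != 200:
        return False, f"unexpected status {status}"
    if expected_text not in body:
        return False, "expected content missing"
    return True, "ok"


def check_url(url, expected_text, timeout=10.0):
    """Fetch url once and evaluate it; network errors count as failures."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return evaluate_response(resp.status, body, expected_text)
    except urllib.error.HTTPError as exc:
        # urlopen raises for 4xx/5xx; the status code is still available.
        return False, f"unexpected status {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, DNS failure, TLS error, or timeout.
        return False, f"request failed: {exc}"
```

Run it against one static asset and one proxied path, for example `check_url("https://example.com/", "Welcome")`, so that a failure in either Nginx or the upstream shows up.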

2) SSL Certificate Health

Nginx typically handles SSL termination. Monitor:

Certificate expiry — Alert at least 14 days before expiry so you have time to renew
Certificate chain — Intermediate certificates must be correctly configured or some browsers will reject the connection
Certificate mismatch — The certificate must match the domain being served
Protocol and cipher support — Outdated TLS versions can be exploited or rejected by modern browsers

Expired certificates are among the most common Nginx-related outages, and among the easiest to catch early. Expiry alerts should fire days in advance. If a certificate still expires at 3 AM on a Saturday, a 1-minute check detects it within a check interval or two.
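A minimal sketch of an expiry check in Python, assuming the standard `ssl` module and a 14-day warning window. The function names are hypothetical, and the hostname you pass in is your own.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's notAfter timestamp.

    not_after uses the format returned by SSLSocket.getpeercert(),
    for example "Jun  1 12:00:00 2030 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    now = now or datetime.now(timezone.utc)
    return (expires - now).total_seconds() / 86400


def needs_renewal(not_after, warn_days=14, now=None):
    """True when the certificate expires in fewer than warn_days days."""
    return days_until_expiry(not_after, now) < warn_days


def fetch_not_after(host, port=443, timeout=10.0):
    """Open a verified TLS connection and return the leaf certificate's notAfter."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Because the default context verifies the chain and the hostname, `fetch_not_after` raises `ssl.SSLCertVerificationError` for an already expired certificate, an incomplete chain, or a mismatched name. Treat that exception as a failed check in its own right.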

3) Response Time

Nginx should add minimal latency to requests. Track response times to detect:

Upstream slowness — If the application server is slow, Nginx passes that latency through to users
Proxy buffer issues — Misconfigured proxy buffers cause Nginx to spool to disk, adding latency
Connection queue buildup — When worker_connections is exhausted, new connections wait
Cache misses — If you use Nginx caching, a cache invalidation can cause a sudden spike in response times as the upstream is hit directly

Set response time alerts at a threshold derived from your normal baseline, not an arbitrary number. If your site normally responds in 200ms and suddenly takes 2 seconds, something changed.
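One way to turn a baseline into a threshold is to compare each new sample with a multiple of the recent median. This is a sketch under assumptions: the factor of 3 and the 100 ms floor are illustrative values, not recommendations, and `is_slow` is a hypothetical helper.

```python
from statistics import median


def is_slow(baseline_ms, latest_ms, factor=3.0, floor_ms=100.0):
    """Flag a sample that exceeds factor x the baseline median.

    floor_ms avoids alerting on tiny absolute changes for very fast sites.
    """
    if not baseline_ms:
        return False  # no baseline yet, nothing to compare against
    threshold = max(median(baseline_ms) * factor, floor_ms)
    return latest_ms > threshold
```

With a baseline around 200 ms, the threshold lands at 600 ms, so a 2-second response is flagged while ordinary jitter is not.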

4) HTTP Status Codes

Monitor for specific error codes that indicate Nginx-level problems:

Status Code	What It Means in Nginx Context
502 Bad Gateway	Nginx cannot connect to the upstream server. App is down or unreachable.
503 Service Unavailable	Nginx is rejecting requests (limit_req and limit_conn return 503 by default), or the upstream itself returned 503.
504 Gateway Timeout	The upstream server took too long to respond. Nginx gave up waiting.
499	Client closed the connection before Nginx finished. Often indicates slow responses. Nginx-specific: recorded in the access log only, never sent to the client.
413 Request Entity Too Large	`client_max_body_size` is too small for the request.
444	Nginx closed the connection without sending a response (used to drop malicious requests). Nginx-specific: the client sees a closed connection, not a status code.

Content validation on your monitoring checks should verify you are getting the expected response, not an Nginx error page that still returns 200 (which happens with custom error pages).
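Codes such as 499 and 444 are Nginx-specific and never reach the client, so an external check cannot see them. If you also have access to the server, a short Python sketch can count status codes in the access log. It assumes Nginx's predefined `combined` log format; a custom `log_format` would need a different pattern, and `count_statuses` is a hypothetical helper.

```python
import re
from collections import Counter

# Matches the status field that follows the quoted request line in
# Nginx's predefined "combined" log format.
STATUS_RE = re.compile(r'"[^"]*" (\d{3}) ')


def count_statuses(lines):
    """Count HTTP status codes across access log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

A sudden rise in the 499 or 502 counts between two runs is a useful complement to what the external checks report.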

5) Port Availability

Monitor that Nginx is listening on the expected ports:

Port 80 (HTTP) — Should either serve content or redirect to HTTPS
Port 443 (HTTPS) — Primary SSL-terminated port
Custom ports — If you run Nginx on non-standard ports for internal services

A TCP port check detects Nginx process crashes, bind failures, and firewall changes faster than an HTTP check because it does not wait for a full response.
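A TCP port check needs very little code. The sketch below uses Python's standard `socket` module; `port_open` is a hypothetical helper name and the 3-second timeout is an assumption.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port completes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

An open port only shows that something is accepting connections. Pair it with an HTTP check to confirm that Nginx is actually serving the right response.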

6) DNS Resolution

If your domain points to the server running Nginx, monitor DNS:

A/AAAA records — Verify the domain resolves to the correct IP
Multiple DNS providers — If you use DNS failover, verify both resolve correctly
TTL changes — Unexpected TTL changes may indicate DNS hijacking
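The record check can be sketched in Python as a comparison between what the resolver returns and the addresses you expect. The function names are hypothetical and the addresses in the usage note are documentation-range placeholders. This version goes through the system resolver, so it does not expose TTLs; checking those would need a dedicated DNS library.

```python
import socket


def resolve_addresses(hostname):
    """Return the set of IPv4/IPv6 addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def dns_drift(resolved, expected):
    """Compare resolved addresses with the expected set.

    Returns (unexpected, missing) as two sets.
    """
    resolved, expected = set(resolved), set(expected)
    return resolved - expected, expected - resolved
```

For example, `dns_drift(resolve_addresses("example.com"), {"203.0.113.10"})` returns two empty sets when the domain resolves exactly as expected. Anything in the first set is an address you did not plan for.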

Common Nginx Configurations and What to Monitor

Nginx as Reverse Proxy

The most common setup — Nginx in front of Node.js, Python, Ruby, PHP, or Go applications:

upstream backend {
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
}

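server {
    listen 443 ssl;
    server_name example.com;

    # Required with "listen ... ssl"; placeholder paths, use your own files
    ssl_certificate     /etc/nginx/ssl/example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/example.com.key;

    location / {
        proxy_pass http://backend;
    }
}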

Monitor:

HTTP check on the public URL (catches both Nginx and upstream failures)
TCP port check on 443 (catches Nginx process failures)
Response time (catches upstream slowness proxied through Nginx)
SSL certificate (catches expiry and misconfiguration)

Nginx as Load Balancer

When Nginx distributes traffic across multiple backends:

upstream app_servers {
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
}

Monitor:

HTTP check from multiple regions (verify load balancing works globally)
Content validation (ensure all backends serve correct content — a misconfigured backend in the pool serves wrong data for some requests)
Response time (a slow backend in the pool increases average latency)
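A single request reaches only one backend, so one bad server in the pool can slip past a single check. A hedged Python sketch: sample the same URL several times per check and validate every response. With the default round-robin balancing this raises the chance of reaching each server, though it does not guarantee it. `sample_pool` is a hypothetical helper and 9 attempts (three per backend in the pool above) is an assumption.

```python
def sample_pool(fetch, expected_text, attempts=9):
    """Call fetch() repeatedly and collect responses that fail validation.

    fetch is any zero-argument callable returning (status, body).
    Returns a list of (attempt_index, status) for the failed attempts.
    """
    failures = []
    for i in range(attempts):
        status, body = fetch()
        if status != 200 or expected_text not in body:
            failures.append((i, status))
    return failures
```

Failures that recur at a regular interval, such as every third attempt, suggest that one specific backend is serving the wrong content.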

Nginx Serving Static Files

When Nginx serves a static site directly:

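server {
    listen 443 ssl;
    # Required with "listen ... ssl"; placeholder paths, use your own files
    ssl_certificate     /etc/nginx/ssl/example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/example.com.key;

    root /var/www/html;
    index index.html;
}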

Monitor:

HTTP check on key pages (homepage, important landing pages)
Content validation (verify files are served correctly, not 403 or directory listing)
Disk space indirectly — if the disk is full, Nginx cannot write temp files and may fail

Nginx with Caching

When Nginx caches upstream responses:

proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=cache:10m;

location / {
    proxy_cache cache;
    proxy_pass http://backend;
}

Monitor:

Content freshness — Verify cached content is not stale beyond acceptable limits
Response time — Cache misses cause latency spikes
Post-deploy content validation — After deploying new content, verify the cache updates

Nginx Failure Modes and Detection

Failure Mode	User Impact	Detection Method
Nginx process not running	Site completely down	TCP port check + HTTP check
Configuration syntax error after reload	Previous config still active (if Nginx caught it) or partial failure	HTTP check + content validation
SSL certificate expired	Browser security warning, site inaccessible over HTTPS	SSL monitoring
Upstream server down	502 Bad Gateway errors	HTTP check + status code validation
Upstream server slow	Slow page loads, potential 504 timeouts	Response time monitoring
Worker connections exhausted	New connections rejected or queued	Response time monitoring + availability check
Disk full	Log write failures, cache failures, unpredictable behavior	HTTP check + content validation
DNS misconfiguration	Domain resolves to wrong IP	DNS monitoring
Rate limiting too aggressive	Legitimate users get 429 or 503	Multi-region HTTP checks (detect if one region is being limited)
Proxy buffer misconfiguration	Large responses truncated or very slow	Content validation on pages with dynamic content
`client_max_body_size` too small	File uploads fail with 413	API endpoint monitoring with payload validation

Monitoring Nginx Across Environments

Production

1-minute HTTP checks on all public endpoints
SSL certificate monitoring with 14-day expiry alerts
TCP port checks on 80 and 443
Response time alerts with tight thresholds
DNS monitoring on all production domains
Multi-region checks to verify global availability

Staging

5-minute HTTP checks on primary endpoints
SSL monitoring (staging certs expire too)
Content validation to catch configuration drift between staging and production

Development / Preview

Basic HTTP check to verify the environment is accessible
Useful for catching Nginx misconfigurations before they reach production

Troubleshooting with Monitoring Data

When monitoring detects an Nginx issue, the alert context points you to the right place:

HTTP check fails with connection refused: → Nginx process is not running. Check systemctl status nginx or your container orchestrator.

HTTP check returns 502: → Nginx is running but cannot reach the upstream. Check the application server, verify the upstream block in nginx.conf, and check network connectivity.

HTTP check returns 504: → Upstream is too slow. Check application performance, database queries, and proxy_read_timeout setting.

SSL check fails: → Certificate expired, chain incomplete, or wrong certificate served. Check ssl_certificate and ssl_certificate_key paths in the server block.

Response time suddenly doubled: → Possible upstream degradation, cache invalidation, increased traffic, or Nginx configuration change. Check recent deployments and upstream health.

Content validation fails but status is 200: → Nginx is serving a custom error page or fallback content. The upstream may be down but Nginx is returning a friendly error page. Check the upstream and any error_page directives.
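The mapping above can be written down as a small lookup so that alerts carry a first hint. This Python sketch is illustrative only: the `triage` function and its `error` labels ("refused", "tls") are assumptions about how your check reports results.

```python
def triage(error=None, status=None, content_ok=None):
    """Map one check result to a first place to look.

    error: "refused", "tls", or None; status: HTTP status or None;
    content_ok: result of content validation, or None if not run.
    """
    if error == "refused":
        return "Nginx not running: check `systemctl status nginx` or the orchestrator"
    if error == "tls":
        return "Certificate problem: check ssl_certificate and ssl_certificate_key"
    if status == 502:
        return "Upstream unreachable: check the app server and the upstream block"
    if status == 504:
        return "Upstream too slow: check app performance and proxy_read_timeout"
    if status == 200 and content_ok is False:
        return "Wrong content with 200: check the upstream and error_page directives"
    if status == 200:
        return "ok"
    return f"Unmapped result (error={error!r}, status={status!r})"
```

Anything the function does not recognize falls through to the last line, which keeps unknown failures visible instead of silently reporting them as healthy.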

How Webalert Helps

Webalert monitors your Nginx-served endpoints the way users experience them:

60-second HTTP checks from global regions — detect Nginx failures within 2 minutes
SSL monitoring — alerts before certificates expire, catches chain and mismatch issues
TCP port monitoring — detect Nginx process crashes independent of HTTP
Response time tracking — catch upstream slowness and proxy configuration issues
Content validation — verify Nginx serves correct content, not error pages
DNS monitoring — detect resolution issues before they affect users
Multi-region checks — verify Nginx serves correctly from every geography
Multi-channel alerts — Email, SMS, Slack, Discord, Teams, webhooks

See features and pricing for details.

Summary

Nginx is the front door to your infrastructure. When it fails, everything behind it is unreachable.
Monitor HTTP endpoints through Nginx, not just the application behind it.
SSL certificate monitoring prevents the most predictable Nginx outage: an expired certificate.
Response time tracking catches upstream problems that Nginx proxies to users.
TCP port checks detect Nginx process crashes faster than HTTP checks.
Content validation catches cases where Nginx returns 200 with wrong content.
Monitor across environments — production, staging, and preview.

Nginx handles the connection. Monitoring proves it is handling it correctly.

