How to Monitor a Django App in Production

Webalert Team
April 12, 2026

Your Django app deploys successfully. Gunicorn is running. The admin panel loads. Your homepage returns 200.

But the Celery worker died 30 minutes ago. User signup emails are queued and not sending. The periodic task that syncs data from your third-party API stopped running yesterday. Nobody noticed because the web layer is completely healthy.

Django applications have the same problem as every other framework-based app: the request-response cycle is only one part of what needs to work. Celery workers, beat schedulers, database connections, cache layers, and static file serving can all fail independently while your Django views return 200.

This guide covers everything to monitor in a production Django app so you catch failures across every layer.


What Makes Django Monitoring Different

A production Django app typically has:

  • WSGI/ASGI server — Gunicorn, uWSGI, or Daphne serving Django views
  • Celery workers — Processing background tasks (emails, notifications, data jobs)
  • Celery Beat — Scheduling periodic tasks
  • Database — PostgreSQL, MySQL, or SQLite
  • Cache — Redis or Memcached for sessions and caching; Redis often doubles as the Celery broker
  • Static and media files — Served via Nginx, WhiteNoise, or a CDN
  • Django management commands — Custom manage.py commands run on schedule

A basic HTTP uptime check on your homepage only validates that Gunicorn is running and Django can render a view. Everything else can be broken.


What to Monitor

1) Web Endpoints and Health Check

The starting point — verify your app responds correctly:

  • Homepage or primary landing page — Content validation, not just status code
  • Login page — Verify the login form renders correctly
  • API endpoints — Test the routes that power your frontend or integrations
  • Django admin (/admin/) — Should load the login form for unauthenticated requests

Create a dedicated health check view that tests internal dependencies:

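# urls.py
from django.core.cache import cache
from django.db import connection
from django.http import JsonResponse
from django.urls import path

def health_check(request):
    try:
        # Test database: a trivial query proves a connection can be opened
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")

        # Test cache: write a value and confirm it reads back
        cache.set("health_check", "ok", timeout=10)
        if cache.get("health_check") != "ok":
            raise RuntimeError("cache read-back failed")

        return JsonResponse({"status": "healthy"})
    except Exception as e:
        # str(e) can expose internals; on a public endpoint, log it instead
        return JsonResponse(
            {"status": "unhealthy", "error": str(e)},
            status=503
        )

# Add this route to your existing urlpatterns
urlpatterns = [
    path("health/", health_check),
]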

Monitor /health/ with response body validation — check for "status": "healthy", not just a 200 status code.
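
To make "response body validation" concrete, here is a minimal sketch of the check a monitor might perform against /health/. The function names is_healthy and check_health are hypothetical and use only the Python standard library; this is an illustration, not how any particular monitoring service implements it.

```python
import json
from urllib import error, request


def is_healthy(status_code: int, body: str) -> bool:
    """Return True only for a 200 response whose JSON body reports healthy."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # An HTML error page or empty body is not a healthy response
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(url: str, timeout: float = 10.0) -> bool:
    """Fetch the health URL and validate both the status code and the body."""
    try:
        with request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.status, resp.read().decode("utf-8", "replace"))
    except error.HTTPError as exc:
        # urlopen raises on 4xx/5xx responses; validate them the same way
        return is_healthy(exc.code, exc.read().decode("utf-8", "replace"))
    except OSError:
        # Connection refused, DNS failure, timeout, and similar errors
        return False
```

Parsing the JSON rather than searching for a substring avoids a false pass when an error page happens to contain the word "healthy".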

2) Celery Worker Monitoring

Celery workers fail silently. When they stop:

  • Password reset emails are not sent
  • User notifications queue up forever
  • Background data processing halts
  • Webhooks from third-party services are not handled

Monitor Celery workers with a heartbeat task:

# tasks.py
from celery import shared_task
import requests

@shared_task
def celery_heartbeat():
    # Runs on a worker, so the ping only goes out if a worker picked up the task
    requests.get(
        "https://heartbeat.web-alert.io/your-celery-worker-id",
        timeout=10
    )

# celery.py or settings/celery.py
# `app` is the Celery application instance defined in this module
app.conf.beat_schedule = {
    "celery-heartbeat": {
        "task": "myapp.tasks.celery_heartbeat",
        "schedule": 300.0,  # Every 5 minutes
    },
}

If the heartbeat does not arrive within the expected interval, the task was not scheduled, delivered, or executed: the worker is down or stuck, the broker is unreachable, or Beat has stopped.
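
On the receiving side, "within the expected interval" usually means the interval plus some tolerance for scheduling jitter and network delay. The sketch below shows that decision as a pure function; heartbeat_overdue and the 60-second grace period are assumptions for illustration, not a description of any monitoring service's internals.

```python
from datetime import datetime, timedelta, timezone


def heartbeat_overdue(
    last_ping: datetime,
    now: datetime,
    interval_s: float = 300.0,  # matches the 5-minute Beat schedule
    grace_s: float = 60.0,      # assumed tolerance for jitter and network delay
) -> bool:
    """Return True when the last ping is older than interval plus grace."""
    return now - last_ping > timedelta(seconds=interval_s + grace_s)


# Example: last ping at 12:00:00 UTC, checked at 12:05:30 and again at 12:07:00
last = datetime(2026, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
on_time = heartbeat_overdue(last, last + timedelta(seconds=330))
late = heartbeat_overdue(last, last + timedelta(seconds=420))
```

A grace period that is too short produces false alarms whenever the worker queue is briefly backed up; one that is too long delays detection.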

3) Celery Beat Monitoring

Celery Beat is the scheduler that triggers periodic tasks. It runs as a separate process and can fail independently of the workers. If Beat stops:

  • Scheduled tasks no longer fire
  • Data sync jobs stop running
  • Report generation halts
  • Maintenance tasks (session cleanup, cache warming) stop

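Give Beat its own heartbeat task and monitor ID. Note that any Beat-scheduled task also needs a running worker to execute it, so this heartbeat goes silent when either process fails. To tell the two apart, enqueue the worker heartbeat from outside Beat (for example, from a system cron job) so it keeps arriving when only the scheduler is down: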

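# tasks.py
from celery import shared_task
import requests

@shared_task
def beat_heartbeat():
    requests.get(
        "https://heartbeat.web-alert.io/your-celery-beat-id",
        timeout=10
    )

# celery.py: add this entry to the Beat schedule
app.conf.beat_schedule = {
    # ...existing entries such as "celery-heartbeat"...
    "beat-alive": {
        "task": "myapp.tasks.beat_heartbeat",
        "schedule": 300.0,  # Every 5 minutes
    },
}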

4) Database Connectivity

Django's database layer can fail from:

  • Connection pool exhaustion under concurrent load
  • Database server running out of disk space
  • Slow queries degrading all page performance
  • A migration leaving the schema in an inconsistent state
  • Replica lag if using read replicas

The health check endpoint above covers basic connectivity. Additionally:

  • Response time monitoring — Database slowness shows as increased HTTP response times
  • Monitor data-driven endpoints — An endpoint that queries the database fails or slows when the database has issues
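
One way to surface database slowness before it shows up as user-facing latency is to time the health check's own query and classify the result. The helpers below are a hypothetical sketch; the 100 ms and 500 ms thresholds are placeholder assumptions to tune against your own baseline.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed_call(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn and return its result together with the elapsed milliseconds."""
    start = time.perf_counter()
    result = fn()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def classify_latency(
    elapsed_ms: float,
    warn_ms: float = 100.0,   # assumed warning threshold
    crit_ms: float = 500.0,   # assumed critical threshold
) -> str:
    """Map a latency measurement to 'ok', 'slow', or 'critical'."""
    if elapsed_ms >= crit_ms:
        return "critical"
    if elapsed_ms >= warn_ms:
        return "slow"
    return "ok"
```

Inside the health view, the SELECT 1 call could be wrapped in timed_call and the measured value returned in the JSON body, so the monitor can alert on it separately from overall response time.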

5) Static and Media Files

Django's static file serving breaks in common ways:

  • collectstatic not run after deploy — static files are missing or stale
  • WhiteNoise not configured correctly — /static/ returns 404
  • Nginx misconfigured — media files at /media/ are not served
  • S3 or CDN permissions — uploaded files return 403

Monitor:

  • HTTP check on a known static asset — e.g., https://yourapp.com/static/css/main.css
  • Content validation on pages — Verify pages load correctly (broken static files affect rendering)
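
A status code alone can mislead for static assets, because a misrouted request may return 200 with an HTML page instead of the stylesheet. A minimal sketch of a stricter check follows; static_asset_ok is a hypothetical helper and the expected media type is an assumption for a CSS file.

```python
def static_asset_ok(
    status_code: int,
    content_type: str,
    body: bytes,
    expected_type: str = "text/css",
) -> bool:
    """Return True when the asset was served with the expected type and a body."""
    if status_code != 200:
        return False
    if not body:
        # A zero-byte file usually means a broken build or upload
        return False
    # Drop parameters such as "; charset=utf-8" before comparing
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == expected_type
```

The inputs would come from the response to a request for a known asset such as /static/css/main.css.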

6) SSL and Domain

  • SSL certificate monitoring — Alert before expiry; critical with Let's Encrypt, where certificates are short-lived and automated renewal can fail silently
  • DNS monitoring — Verify domain resolution
  • HTTPS redirect — Confirm HTTP redirects to HTTPS
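
For certificate expiry, the core calculation is small. The sketch below uses only the standard library; days_until_expiry and fetch_not_after are hypothetical names. Note that fetch_not_after validates the certificate during the handshake, so an already expired certificate raises an ssl error instead of returning a date.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> int:
    """Whole days between now and a certificate's notAfter timestamp."""
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return int((expires - current) // 86400)


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 10.0) -> str:
    """Open a TLS connection and return the peer certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

A check could call days_until_expiry(fetch_not_after("yourapp.com")) and alert when the result drops below a threshold such as 14 days; that threshold is an assumption, not a recommendation from the source.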

7) Post-Deployment Validation

Django deployments commonly break due to:

  • Missing environment variable in production
  • Migration not run after deploy
  • collectstatic not run
  • Gunicorn/uWSGI not restarted after code update
  • Celery workers not restarted, still running old code

After every deployment:

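#!/bin/bash
# deploy.sh
set -euo pipefail  # Stop at the first failed step instead of continuing

python manage.py migrate --noinput
python manage.py collectstatic --noinput
supervisorctl restart gunicorn celery celerybeat

# Validate the app is healthy (retry while Gunicorn finishes starting)
curl -sf --retry 5 --retry-delay 2 --retry-connrefused https://yourapp.com/health/ || exit 1

# Signal deploy completed
curl -fsS https://heartbeat.web-alert.io/your-deploy-id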
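
If the deploy is driven from Python rather than a shell script, the validation step might be sketched as a small retry loop. wait_until_healthy is a hypothetical helper; the probe argument is any zero-argument callable that returns True when the app is healthy, and the attempt count and delay are assumptions.

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to `attempts` times, pausing between failed tries."""
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            # No pause after the final attempt
            sleep(delay_s)
    return False
```

Retrying matters here because the app server needs a moment to come back after a restart; a single immediate check can fail a deploy that is actually fine.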

Common Django Failure Modes

Failure | User Impact | Detection Method
Celery worker crashed | Emails, jobs, notifications stop | Heartbeat from worker task
Celery Beat stopped | Periodic tasks stop running | Heartbeat from Beat schedule
Database connection exhausted | 500 errors on data-driven pages | Health endpoint + HTTP monitoring
Cache (Redis) down | Slow pages, session loss | Health endpoint + response time
Gunicorn not restarted after deploy | Old code still running | Post-deploy content validation
Missing env variable after deploy | Partial functionality, 500 errors | HTTP check + content validation
collectstatic not run | Broken CSS/JS, missing images | Content validation
Migration not applied | 500 errors on changed schema | Post-deploy health check
SSL certificate expired | Browser blocks the site | SSL monitoring
Static storage permissions wrong | 403 on static/media files | HTTP check on static asset URL
Disk full | Log write failures, Gunicorn crashes | HTTP check fails with 503

Monitoring by Deployment Setup

Gunicorn + Nginx (Most Common)

Internet → Nginx (port 80/443) → Gunicorn (port 8000) → Django

  • HTTP check on public URL — Tests entire stack
  • TCP port check on port 8000 — Tests Gunicorn directly (if accessible)
  • HTTP check on /health/ — Tests database and cache
  • HTTP check on a static asset — Tests Nginx static file serving
  • Heartbeat for Celery worker and Beat

Docker / Container Deployments

  • HTTP check on public URL — Tests the exposed container
  • Health check endpoint — Included in Docker HEALTHCHECK directive
  • Heartbeat for Celery containers
  • Response time monitoring — Container restarts cause brief latency spikes
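
A container health check needs a command that exits 0 when the app is healthy and non-zero otherwise. The sketch below is one possible shape for such a script; exit_code is a hypothetical helper, and the URL and port in the note are assumptions about how the container is set up.

```python
from urllib import request


def exit_code(url: str, timeout: float = 5.0, opener=request.urlopen) -> int:
    """Return 0 when the URL answers with HTTP 200, otherwise 1."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 0 if resp.status == 200 else 1
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses,
        # which urlopen raises as HTTPError (a subclass of OSError)
        return 1
```

A script that ends with sys.exit(exit_code("http://localhost:8000/health/")) could then be used as the HEALTHCHECK command. Using Python for this avoids installing curl in a slim image.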

Heroku / PaaS

  • HTTP check on app URL — Tests dyno health
  • Heartbeat for worker dynos (Celery)
  • Response time alerts — Heroku puts idle dynos on some plans (such as Eco) to sleep, causing cold starts
  • SSL monitoring — Heroku's certificate covers the default herokuapp.com domain; certificates for custom domains still need monitoring

Cloud (AWS, GCP, Azure) with Auto-Scaling

  • Multi-region HTTP checks — Verify the load balancer distributes correctly
  • Health endpoint — Load balancer health checks should use the /health/ endpoint
  • Heartbeat for SQS/Pub-Sub-based Celery workers
  • Response time — Auto-scaling lag causes temporary performance degradation

Practical Setup

Minimum for every Django app

  1. HTTP check on homepage — 1-minute interval, content validation
  2. HTTP check on /health/ — Validates DB and cache connectivity
  3. Celery worker heartbeat — Every 5 minutes
  4. Celery Beat heartbeat — Every 5 minutes (separate from worker)
  5. SSL monitoring on all domains

Comprehensive setup

All of the above, plus:

  1. HTTP checks on critical API routes — With response body validation
  2. HTTP check on a static asset — Catch collectstatic failures
  3. Post-deploy validation heartbeat — Confirms deploy completed cleanly
  4. Response time alerts — Detect database and cache performance regressions
  5. Multi-region checks — Verify the app works globally
  6. DNS monitoring — Catch domain misconfiguration

How Webalert Helps

Webalert monitors your Django application across every layer:

  • 60-second HTTP checks from global regions — catch Gunicorn failures fast
  • Content validation — verify pages return correct content, not Django error pages
  • Heartbeat monitoring — track Celery workers, Beat scheduler, and management commands
  • SSL monitoring — catch certificate issues before they block users
  • Response time tracking — detect database and cache performance regressions
  • DNS monitoring — verify domain resolution
  • Multi-channel alerts — Email, SMS, Slack, Discord, Teams, webhooks

See features and pricing for details.


Summary

  • Django apps have multiple layers beyond the web — Celery workers, Beat scheduler, database, cache, and static files.
  • HTTP uptime checks only cover Gunicorn and the view layer. Use heartbeat monitoring for Celery.
  • A /health/ endpoint should test database and cache connectivity, not just that Django boots.
  • Run collectstatic and migrations as part of every deployment, then validate with a post-deploy check.
  • Monitor Celery worker and Beat scheduler separately — both can fail independently.
  • Start with homepage + health endpoint + worker heartbeat + Beat heartbeat + SSL.

Your views handle requests. Monitoring proves the entire application is working.


