How to Monitor a Django App in Production

Your Django app deploys successfully. Gunicorn is running. The admin panel loads. Your homepage returns 200.

But the Celery worker died 30 minutes ago. User signup emails are queued and not sending. The periodic task that syncs data from your third-party API stopped running yesterday. Nobody noticed because the web layer is completely healthy.

Django applications have the same problem as every other framework-based app: the request-response cycle is only one part of what needs to work. Celery workers, beat schedulers, database connections, cache layers, and static file serving can all fail independently while your Django views return 200.

This guide covers everything to monitor in a production Django app so you catch failures across every layer.

What Makes Django Monitoring Different

A production Django app typically has:

WSGI/ASGI server — Gunicorn, uWSGI, or Daphne serving Django views
Celery workers — Processing background tasks (emails, notifications, data jobs)
Celery Beat — Scheduling periodic tasks
Database — PostgreSQL, MySQL, or SQLite
Cache — Redis or Memcached for sessions and caching; Redis often doubles as the Celery broker
Static and media files — Served via Nginx, WhiteNoise, or a CDN
Django management commands — Custom manage.py commands run on schedule

A basic HTTP uptime check on your homepage only validates that Gunicorn is running and Django can render a view. Everything else can be broken.

What to Monitor

1) Web Endpoints and Health Check

The starting point — verify your app responds correctly:

Homepage or primary landing page — Content validation, not just status code
Login page — Verify authentication renders correctly
API endpoints — Test the routes that power your frontend or integrations
Django admin (/admin/) — Should load the login form for unauthenticated requests

Create a dedicated health check view that tests internal dependencies:

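# views.py
import logging

from django.core.cache import cache
from django.db import connection
from django.http import JsonResponse

logger = logging.getLogger(__name__)

def health_check(request):
    try:
        # Test database
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            cursor.fetchone()

        # Test cache: confirm the value round-trips, since some backends fail silently
        cache.set("health_check", "ok", timeout=10)
        if cache.get("health_check") != "ok":
            raise RuntimeError("cache round-trip failed")

        return JsonResponse({"status": "healthy"})
    except Exception:
        # Log details server-side rather than exposing internals on a public endpoint
        logger.exception("Health check failed")
        return JsonResponse({"status": "unhealthy"}, status=503)

# urls.py: route it with path("health/", health_check)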

Monitor /health/ with response body validation — check for "status": "healthy", not just a 200 status code.
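
On the monitoring side, body validation can be as small as the sketch below. The function name is_healthy is hypothetical, and it assumes the JSON shape returned by the health check view in this guide; a hosted monitor would normally do this through its own content-match setting instead.

```python
import json


def is_healthy(status_code, body):
    """Return True only for a 200 response whose JSON body says "healthy"."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Not JSON at all, e.g. an HTML error page served with a 200
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"
```

A plain status-code check would accept an HTML maintenance page that happens to return 200; this one rejects it.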

2) Celery Worker Monitoring

Celery workers fail silently. When they stop:

Password reset emails are not sent
User notifications queue up forever
Background data processing halts
Webhooks from third-party services are not handled

Monitor Celery workers with a heartbeat task:

# tasks.py
from celery import shared_task
import requests

@shared_task
def celery_heartbeat():
    response = requests.get(
        "https://heartbeat.web-alert.io/your-celery-worker-id",
        timeout=10
    )
    # Surface a failed ping in the worker logs instead of ignoring it
    response.raise_for_status()

# celery.py or settings/celery.py
# `app` is your project's Celery application instance
app.conf.beat_schedule = {
    "celery-heartbeat": {
        "task": "myapp.tasks.celery_heartbeat",
        "schedule": 300.0,  # Seconds, i.e. every 5 minutes
    },
}

If the heartbeat does not arrive within the expected interval, something in the chain has failed: the worker is down or stuck, Beat has stopped scheduling the task, or the broker is unreachable.
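
The "expected interval" rule is simple to state precisely. The sketch below is not how any particular monitoring service implements it; the function name and the 60-second grace period are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def heartbeat_overdue(last_seen, now, interval_s=300, grace_s=60):
    """Return True if no heartbeat arrived within interval + grace.

    last_seen is the datetime of the most recent ping, or None if no
    ping has ever been received.
    """
    if last_seen is None:
        return True
    return now - last_seen > timedelta(seconds=interval_s + grace_s)
```

The grace period absorbs normal jitter (a busy queue, a slow network call) so that a ping arriving a few seconds late does not raise an alert.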

3) Celery Beat Monitoring

Celery Beat is the scheduler that triggers periodic tasks. It runs as a separate process and can fail independently of the workers. If Beat stops:

Scheduled tasks no longer fire
Data sync jobs stop running
Report generation halts
Cleanup tasks (session expiry, cache warming) stop

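Monitor Beat with its own heartbeat task. Note that every Beat-scheduled task also needs a running worker to execute it, so this heartbeat and the worker heartbeat both go silent when either process fails. To tell the two apart, pair the heartbeats with a worker check that does not go through Beat, such as the celery inspect ping command. The Beat heartbeat itself looks like this: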

# tasks.py
@shared_task
def beat_heartbeat():
    requests.get(
        "https://heartbeat.web-alert.io/your-celery-beat-id",
        timeout=10
    )

# celery.py: add it to the existing Beat schedule
app.conf.beat_schedule["beat-alive"] = {
    "task": "myapp.tasks.beat_heartbeat",
    "schedule": 300.0,
}
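
To narrow down which process failed, a missed heartbeat can be combined with a direct worker ping, since any Beat-scheduled task also needs a worker to run it. The sketch below assumes the replies come from Celery's app.control.ping(timeout=1.0), which to my understanding returns a list of {hostname: {"ok": "pong"}} dicts. The function names and result labels are hypothetical.

```python
def responding_workers(ping_replies):
    """Return the names of workers that answered the ping with 'pong'."""
    names = []
    for reply in ping_replies or []:
        for name, answer in reply.items():
            if isinstance(answer, dict) and answer.get("ok") == "pong":
                names.append(name)
    return names


def diagnose(ping_replies, heartbeat_overdue):
    """Roughly classify a failure from a worker ping and heartbeat state."""
    if not responding_workers(ping_replies):
        # No worker answered, so scheduled tasks cannot run either
        return "worker_down"
    if heartbeat_overdue:
        # Workers are alive but the scheduled task never arrived:
        # Beat is the likely cause (a blocked queue is also possible)
        return "beat_suspected"
    return "ok"
```

In a real check, ping_replies would be the result of app.control.ping(timeout=1.0) called on the project's Celery app. A "beat_suspected" result is a hint rather than a verdict.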

4) Database Connectivity

Django's database layer can fail from:

Connection pool exhaustion under concurrent load
Database server running out of disk space
Slow queries degrading all page performance
A migration leaving the schema in an inconsistent state
Replica lag if using read replicas

The health check endpoint above covers basic connectivity. Additionally:

Response time monitoring — Database slowness shows as increased HTTP response times
Monitor data-driven endpoints — An endpoint that queries the database fails or slows when the database has issues
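
Response time alerts usually work on a percentile rather than an average, so a few slow outliers are not hidden by many fast requests. Here is a minimal sketch using the nearest-rank method; the 800 ms threshold and the function names are assumptions, not recommended values.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def is_degraded(samples_ms, threshold_ms=800, pct=95):
    """Return True if the chosen percentile exceeds the threshold."""
    return percentile(samples_ms, pct) > threshold_ms
```

Fed with recent response times from a database-backed endpoint, is_degraded flags a slowdown before requests start failing outright.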

5) Static and Media Files

Django's static file serving breaks in common ways:

collectstatic not run after deploy — static files are missing or stale
WhiteNoise not configured correctly — /static/ returns 404
Nginx misconfigured — media files at /media/ are not served
S3 or CDN permissions — uploaded files return 403

Monitor:

HTTP check on a known static asset — e.g., https://yourapp.com/static/css/main.css
Content validation on pages — Verify pages load correctly (broken static files affect rendering)
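
A static asset check is more reliable when it looks at more than the status code, because a misrouted /static/ URL can return an HTML page with a 200. The sketch below assumes a CSS file is being checked; the function name and the default expected type are illustrative.

```python
def static_asset_ok(status_code, content_type, body, expected_type="text/css"):
    """Return True if the asset was served with the right type and has content."""
    if status_code != 200:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    return media_type == expected_type and len(body) > 0
```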

6) SSL and Domain

SSL certificate monitoring — Alert before expiry; this matters most with Let's Encrypt, where short-lived certificates depend on automated renewal that can fail silently
DNS monitoring — Verify domain resolution
HTTPS redirect — Confirm HTTP redirects to HTTPS
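
For certificate expiry, the Python standard library is enough for a rough check. The sketch below is an illustration, not a replacement for a monitoring service: the function names are hypothetical, and fetch_not_after needs outbound network access to the host being checked.

```python
import socket
import ssl


def days_until_expiry(not_after, now_epoch):
    """Whole days between now_epoch and a certificate's notAfter string."""
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - now_epoch) // 86400)


def fetch_not_after(hostname, port=443, timeout=10):
    """Connect over TLS and return the server certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call days_until_expiry(fetch_not_after("yourapp.com"), time.time()) and alert when the result drops below a chosen number of days. An already expired certificate fails the TLS handshake in fetch_not_after, which is itself a signal to alert on.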

7) Post-Deployment Validation

Django deployments commonly break due to:

Missing environment variable in production
Migration not run after deploy
collectstatic not run
Gunicorn/uWSGI not restarted after code update
Celery workers not restarted, still running old code

After every deployment:

#!/bin/bash
# deploy.sh
set -euo pipefail  # Stop at the first failed step

python manage.py migrate --noinput
python manage.py collectstatic --noinput
supervisorctl restart gunicorn celery celerybeat

# Validate the app is healthy (retry while Gunicorn starts up)
curl -sf --retry 5 --retry-delay 2 --retry-connrefused https://yourapp.com/health/ > /dev/null

# Signal deploy completed
curl -fsS https://heartbeat.web-alert.io/your-deploy-id
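
If the deploy pipeline already runs Python, the validation step can also poll the health endpoint until it reports healthy, which tolerates the few seconds Gunicorn needs to restart. This is a sketch under assumptions: the function names, attempt count, and delay are illustrative, and it expects the JSON body returned by the health check view in this guide.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_health(url, timeout=5):
    """Return (status_code, body_text); network errors become (None, "")."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses still carry a status code and a body
        return exc.code, exc.read().decode("utf-8", "replace")
    except (urllib.error.URLError, OSError):
        return None, ""


def wait_until_healthy(fetch, attempts=10, delay_s=3, sleep=time.sleep):
    """Call fetch() until it reports a healthy app or attempts run out."""
    for attempt in range(attempts):
        status, body = fetch()
        if status == 200:
            try:
                if json.loads(body).get("status") == "healthy":
                    return True
            except (ValueError, AttributeError):
                pass  # Not the expected JSON object; treat as unhealthy
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

A deploy script could end with sys.exit(0 if wait_until_healthy(lambda: fetch_health("https://yourapp.com/health/")) else 1), so that a failed validation fails the pipeline.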

Common Django Failure Modes

Failure	User Impact	Detection Method
Celery worker crashed	Emails, jobs, notifications stop	Heartbeat from worker task
Celery Beat stopped	Periodic tasks stop running	Heartbeat from Beat schedule
Database connection exhausted	500 errors on data-driven pages	Health endpoint + HTTP monitoring
Cache (Redis) down	Slow pages, session loss	Health endpoint + response time
Gunicorn not restarted after deploy	Old code still running	Post-deploy content validation
Missing env variable after deploy	Partial functionality, 500 errors	HTTP check + content validation
collectstatic not run	Broken CSS/JS, missing images	Content validation
Migration not applied	500 errors on changed schema	Post-deploy health check
SSL certificate expired	Browser blocks the site	SSL monitoring
Static storage permissions wrong	403 on static/media files	HTTP check on static asset URL
Disk full	Log write failures, Gunicorn crashes	HTTP check fails (502/503 from the proxy, or a timeout)

Monitoring by Deployment Setup

Gunicorn + Nginx (Most Common)

Internet → Nginx (port 80/443) → Gunicorn (port 8000) → Django

HTTP check on public URL — Tests entire stack
TCP port check on port 8000 — Tests Gunicorn directly (if accessible)
HTTP check on /health/ — Tests database and cache
HTTP check on a static asset — Tests Nginx static file serving
Heartbeat for Celery worker and Beat

Docker / Container Deployments

HTTP check on public URL — Tests the exposed container
Health check endpoint — Included in Docker HEALTHCHECK directive
Heartbeat for Celery containers
Response time monitoring — Container restarts cause brief latency spikes

Heroku / PaaS

HTTP check on app URL — Tests dyno health
Heartbeat for worker dynos (Celery)
Response time alerts — Heroku Eco dynos sleep after a period of inactivity, so the first request afterwards is slow
SSL monitoring — Heroku's own certificate covers the herokuapp.com domain; certificates for custom domains still need monitoring

Cloud (AWS, GCP, Azure) with Auto-Scaling

Multi-region HTTP checks — Verify the load balancer distributes correctly
Health endpoint — Load balancer health checks should use the /health/ endpoint
Heartbeat for SQS/Pub-Sub-based Celery workers
Response time — Auto-scaling lag causes temporary performance degradation

Practical Setup

Minimum for every Django app

HTTP check on homepage — 1-minute interval, content validation
HTTP check on /health/ — Validates DB and cache connectivity
Celery worker heartbeat — Every 5 minutes
Celery Beat heartbeat — Every 5 minutes (separate from worker)
SSL monitoring on all domains

Comprehensive setup

All of the above, plus:

HTTP checks on critical API routes — With response body validation
HTTP check on a static asset — Catch collectstatic failures
Post-deploy validation heartbeat — Confirms deploy completed cleanly
Response time alerts — Detect database and cache performance regressions
Multi-region checks — Verify the app works globally
DNS monitoring — Catch domain misconfiguration

How Webalert Helps

Webalert monitors your Django application across every layer:

60-second HTTP checks from global regions — catch Gunicorn failures fast
Content validation — verify pages return correct content, not Django error pages
Heartbeat monitoring — track Celery workers, Beat scheduler, and management commands
SSL monitoring — catch certificate issues before they block users
Response time tracking — detect database and cache performance regressions
DNS monitoring — verify domain resolution
Multi-channel alerts — Email, SMS, Slack, Discord, Teams, webhooks

See features and pricing for details.

Summary

Django apps have multiple layers beyond the web — Celery workers, Beat scheduler, database, cache, and static files.
HTTP uptime checks only cover Gunicorn and the view layer. Use heartbeat monitoring for Celery.
A /health/ endpoint should test database and cache connectivity, not just that Django boots.
Run collectstatic and migrations as part of every deployment, then validate with a post-deploy check.
Monitor Celery worker and Beat scheduler separately — both can fail independently.
Start with homepage + health endpoint + worker heartbeat + Beat heartbeat + SSL.

Your views handle requests. Monitoring proves the entire application is working.
