Cron Job Monitoring: Never Miss a Failed Background Task

Webalert Team
January 10, 2026

Your website is up. Your API responds. Everything looks green.

But somewhere in the background, a cron job failed silently three days ago. Your email queue stopped processing. Your database backup didn't run. Your analytics aggregation is a week behind.

You won't know until someone asks why they never got that email. Or worse — until you need that backup.

Background tasks are the silent killers of reliability. In this guide, we'll cover how to monitor them properly.


Why Cron Jobs Fail Silently

Unlike your website (which users immediately complain about), cron jobs fail in silence:

No one is watching

When a web request fails, the user sees an error. When a cron job fails, no one sees anything. The job just... doesn't happen.

Logs get buried

Your cron output probably goes to a log file that nobody checks. Or worse, to /dev/null. Errors vanish into the void.

Failures aren't always obvious

A job might "succeed" (exit code 0) but produce wrong results. It processed 0 records instead of 10,000. Technically it ran. Actually it failed.

Dependencies change

The cron job worked for months. Then someone updated a library, changed a config, or moved a file. The job started failing, but nothing alerted anyone.


The Real Cost of Missed Cron Jobs

Silent cron failures cause expensive problems:

Data processing gaps

Your daily report aggregation stops running. A week later, someone notices the dashboard is stale. Now you need to backfill a week of data — if you can.

Missed communications

Email digests, notification batches, reminder emails — all powered by background jobs. When they fail, customers think you've gone silent.

Backup failures

The worst-case scenario: your backup cron has been failing for weeks. You only discover this when you need to restore.

Stale cache/data

Cache warmers, data syncs, and cleanup jobs keep your system healthy. When they stop, performance degrades slowly — hard to trace back to a failed cron.

Compliance violations

Scheduled compliance reports, data retention jobs, audit log exports — missing these isn't just inconvenient, it's potentially illegal.


The Three Types of Cron Monitoring

There are fundamentally three ways to monitor background tasks:

1. Heartbeat monitoring (ping-based)

How it works: Your cron job pings a URL when it completes. If the ping doesn't arrive within the expected window, you get alerted.

Best for:

  • Scheduled tasks that run on a predictable schedule
  • Jobs where completion is more important than exit status
  • Simple "did it run?" monitoring

Example:

# In your crontab: ping only if the backup script exits successfully
0 * * * * /path/to/backup.sh && curl -s https://your-monitor.com/ping/abc123

2. Exit code monitoring

How it works: A wrapper captures the job's exit code and reports success/failure.

Best for:

  • Distinguishing between "didn't run" and "ran but failed"
  • Jobs where you need to know why it failed

Example:

0 * * * * /path/to/wrapper.sh /path/to/backup.sh
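
The source does not show the wrapper itself, so here is a minimal sketch of what one might look like. It assumes a hypothetical monitor that accepts the exit code as a path suffix (PING_URL/<exit code>); the names PING_URL, report_status, and run_and_report are made up for illustration, so check your monitoring service's documentation for the real endpoint shape.

```shell
#!/bin/bash
# wrapper.sh (sketch): run a job, then report its exit code.
# PING_URL and the "/<exit code>" suffix are assumptions, not a real API.
PING_URL="${PING_URL:-https://your-monitor.com/ping/abc123}"

# Hypothetical reporter: sends the exit code to the monitor.
report_status() {
    curl -fsS -m 10 --retry 3 "$PING_URL/$1" > /dev/null
}

run_and_report() {
    local rc=0
    "$@" || rc=$?             # run the job exactly as given, keep its exit code
    report_status "$rc" || echo "warning: could not report status" >&2
    return "$rc"              # preserve the job's exit code for cron
}

# When invoked as "wrapper.sh /path/to/job args...", run the job.
if [ "$#" -gt 0 ]; then
    run_and_report "$@"
    exit $?
fi
```

Because the wrapper returns the job's own exit code, anything else that inspects the cron job's status (for example cron's mail on failure) keeps working unchanged.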

3. Output/result monitoring

How it works: Monitor the actual output or side effects of the job.

Best for:

  • Jobs that can "succeed" but produce wrong results
  • Critical jobs where you need full visibility

Example: Check that the backup file exists and is larger than 0 bytes.
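
As a rough sketch, that check could be written as a small shell function. The function name check_backup and its thresholds are illustrative assumptions; the freshness test goes one step beyond the example above and relies on find's -mmin option, which is available in GNU and BSD find but is not part of POSIX.

```shell
#!/bin/bash
# Sketch: accept a backup only if the file exists, is at least
# min_bytes large, and was modified within the last max_age_min minutes.
# check_backup and its default thresholds are illustrative.
check_backup() {
    local file=$1
    local min_bytes=${2:-1}
    local max_age_min=${3:-1500}   # 25 hours, for a daily backup
    local size

    [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }

    # $(( ... )) strips the padding some wc implementations add
    size=$(( $(wc -c < "$file") ))
    [ "$size" -ge "$min_bytes" ] || { echo "too small: $size bytes" >&2; return 1; }

    # find prints the path only if the file is newer than max_age_min minutes
    [ -n "$(find "$file" -mmin "-$max_age_min")" ] || { echo "stale: $file" >&2; return 1; }
}
```

A call such as check_backup /backups/daily.sql 1048576 would reject any dump smaller than 1 MiB, which catches the "ran but wrote almost nothing" case that an exit code alone misses.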


Setting Up Heartbeat Monitoring

The most practical approach for most teams is heartbeat monitoring. Here's how to implement it:

Step 1: Create a monitor for each critical job

You need one monitoring endpoint per cron job. Each has:

  • A unique URL to ping
  • An expected schedule (e.g., "every hour")
  • A grace period for late runs
  • Alert settings

Step 2: Add the ping to your cron jobs

At the end of your script, ping the monitoring URL:

#!/bin/bash
# backup.sh

# Run the backup and ping only if pg_dump exits successfully
if pg_dump mydb > /backups/daily.sql; then
    curl -fsS --retry 3 https://your-monitor.com/ping/backup-123
fi

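The -fsS flags make curl exit with a non-zero status on HTTP errors (-f), hide the progress meter (-s), and still print an error message if the request fails (-S). The --retry 3 option retries up to three times on transient network errors.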

Step 3: Ping on start (optional)

For long-running jobs, ping when starting AND completing:

#!/bin/bash

# Signal start
curl -fsS https://your-monitor.com/ping/job-123/start

# Your job; signal completion only if it exits successfully
if python long_running_task.py; then
    curl -fsS https://your-monitor.com/ping/job-123
fi

This lets you detect jobs that started but never finished.

Step 4: Handle failures explicitly

Don't let failures go unreported:

#!/bin/bash

run_backup() {
    pg_dump mydb > /backups/daily.sql
}

if run_backup; then
    curl -fsS https://your-monitor.com/ping/backup-123
else
    curl -fsS https://your-monitor.com/ping/backup-123/fail
    exit 1
fi

Cron Monitoring Best Practices

Set realistic grace periods

Your hourly job probably doesn't run at exactly :00. Account for:

  • System load variations
  • Network latency
  • Previous job overruns

For an hourly job, a grace period of 5-10 minutes is usually enough to prevent false alarms.
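
To make the arithmetic concrete, here is a sketch of the comparison a heartbeat monitor has to make. The helper is_overdue is hypothetical and works on plain epoch seconds; real services implement this on their side.

```shell
#!/bin/bash
# Sketch of a heartbeat deadline check (hypothetical helper).
# Arguments: last ping time, current time (both epoch seconds),
# expected interval and grace period (both in seconds).
is_overdue() {
    local last_ping=$1 now=$2 interval=$3 grace=$4
    # Overdue only once the interval AND the grace period have both elapsed
    [ "$now" -gt $(( last_ping + interval + grace )) ]
}
```

For an hourly job (3600 seconds) with a 10-minute grace period (600 seconds), a ping that arrives 65 minutes after the previous one raises no alert, while silence beyond 70 minutes does.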

Monitor job duration

A job that usually takes 5 minutes suddenly taking 2 hours is a problem — even if it eventually completes. Track duration trends.
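
One low-tech way to start tracking duration is to time the job yourself and log the result. The sketch below assumes a hypothetical helper named run_timed; the threshold is whatever "too long" means for your job.

```shell
#!/bin/bash
# Sketch: time a job and warn if it runs longer than max_seconds.
# run_timed is an illustrative helper, not part of any tool.
run_timed() {
    local max_seconds=$1
    shift
    local start rc=0 elapsed
    start=$(date +%s)
    "$@" || rc=$?                          # keep the job's exit code
    elapsed=$(( $(date +%s) - start ))
    echo "job took ${elapsed}s (exit $rc)"
    if [ "$elapsed" -gt "$max_seconds" ]; then
        echo "warning: exceeded ${max_seconds}s" >&2
    fi
    return "$rc"
}
```

Calling run_timed 600 /path/to/backup.sh would log the runtime on every run and print a warning to stderr whenever the backup takes more than ten minutes, even if it eventually succeeds.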

Alert the right people

The person who gets the "backup failed" alert should be someone who can:

  • Access the server
  • Understand the job
  • Fix the problem

Don't send all alerts to a generic inbox.

Don't ignore "flapping" jobs

A job that fails, then succeeds, then fails again is telling you something. Investigate intermittent failures before they become permanent.
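
If you keep a simple history of recent results, flapping is easy to quantify. This sketch assumes you record each run as the word "ok" or "fail" (an invented convention) and counts how often the state changes; count_flaps is a hypothetical helper.

```shell
#!/bin/bash
# Sketch: count state changes in a job's recent results.
# Input: one result per argument ("ok" or "fail"), oldest first.
count_flaps() {
    local flaps=0 prev="" result
    for result in "$@"; do
        # A flap is any run whose result differs from the previous run
        if [ -n "$prev" ] && [ "$result" != "$prev" ]; then
            flaps=$(( flaps + 1 ))
        fi
        prev=$result
    done
    echo "$flaps"
}
```

A history of ok, fail, ok, fail yields 3 state changes, while a single outage followed by a recovery yields 2, so a high count over a short window is a reasonable trigger for investigation.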

Test your monitoring

Deliberately fail a job to verify:

  • The alert fires
  • It goes to the right people
  • Someone knows how to fix it

Document your cron jobs

Maintain a list of all scheduled jobs with:

  • What they do
  • How often they run
  • What happens if they fail
  • How to fix common failures

Common Cron Job Failure Patterns

The dependency update break

Symptom: Job worked for months, suddenly starts failing
Cause: Library update, changed API, moved file
Fix: Pin dependencies, add validation checks
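
One simple form of validation check is a pre-flight test that every command the job depends on is still on the PATH. The helper name require_commands is an assumption for illustration.

```shell
#!/bin/bash
# Sketch: fail fast if any required command is missing from PATH.
# require_commands is an illustrative helper.
require_commands() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" > /dev/null 2>&1; then
            echo "missing command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Placing something like require_commands pg_dump curl || exit 1 at the top of a backup script turns a confusing mid-run failure into an immediate, clearly labeled one.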

The silent timeout

Symptom: Job never completes, no error logged
Cause: HTTP timeout, database lock, infinite loop
Fix: Add timeouts, monitor job duration
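
On systems with GNU coreutils, the timeout command is one way to add such a limit. The sketch below wraps it in a hypothetical helper, run_with_timeout; timeout exits with status 124 when it has to kill the command, and it is not installed by default on every platform (macOS, for example).

```shell
#!/bin/bash
# Sketch: cap a job's runtime with GNU coreutils `timeout`.
# Exit status 124 means the command was stopped for running too long.
run_with_timeout() {
    local limit=$1
    shift
    local rc=0
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "job timed out after $limit" >&2
    fi
    return "$rc"
}
```

For HTTP calls inside a job, curl's own -m (--max-time) option serves the same purpose for a single request.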

The disk space failure

Symptom: Random jobs start failing
Cause: Disk full, can't write temp files
Fix: Monitor disk space, clean up old files
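
A job can also guard itself by checking free space before it starts writing. This sketch parses the POSIX-format output of df -P; the helper names and the 90% default are assumptions, and the parsing assumes the filesystem name contains no spaces.

```shell
#!/bin/bash
# Sketch: refuse to start a job when the target filesystem is too full.
# disk_used_percent and require_disk_space are illustrative helpers.
disk_used_percent() {
    # df -P prints a header, then one line per filesystem;
    # column 5 is the used capacity, e.g. "45%"
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

require_disk_space() {
    local path=$1
    local max_used=${2:-90}
    local used
    used=$(disk_used_percent "$path")
    if [ "$used" -ge "$max_used" ]; then
        echo "disk ${used}% full at $path (limit ${max_used}%)" >&2
        return 1
    fi
}
```

Running require_disk_space /backups 90 || exit 1 before a dump means a full disk produces an explicit failure, which the monitoring described above can then report.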

The permission change

Symptom: "Permission denied" errors
Cause: File permissions changed, user modified
Fix: Document required permissions, test after changes

The environment variable problem

Symptom: Works manually, fails in cron
Cause: Cron has minimal environment, missing PATH/vars
Fix: Set full paths, explicitly set variables in script

#!/bin/bash
export PATH=/usr/local/bin:/usr/bin:/bin
export DATABASE_URL="postgres://..."
# Now your script works in cron
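
To reproduce this class of failure before it happens, you can run a command under a stripped-down environment similar to the one cron provides. The sketch below is an approximation: the exact variables and PATH differ between cron implementations, and run_like_cron is a made-up helper name.

```shell
#!/bin/bash
# Sketch: run a command string with a minimal, cron-like environment.
# The variable set below approximates common cron defaults; your
# cron implementation may differ.
run_like_cron() {
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="${LOGNAME:-$(id -un)}" \
        PATH=/usr/bin:/bin \
        SHELL=/bin/sh \
        /bin/sh -c "$1"
}
```

If run_like_cron '/path/to/script.sh' fails while the same script works in your interactive shell, the script is probably relying on a variable or PATH entry that cron will not supply.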

What to Monitor: A Checklist

Review your scheduled tasks. Which of these do you have?

Data jobs

  • Database backups
  • Log rotation
  • Data exports/imports
  • Analytics aggregation
  • Search index updates
  • Cache warming/clearing

Communication jobs

  • Email queue processing
  • Notification dispatching
  • Report generation
  • Newsletter sending
  • Webhook retries

Maintenance jobs

  • Temp file cleanup
  • Session cleanup
  • Old data archival
  • Certificate renewal
  • Health checks

Business logic jobs

  • Subscription billing
  • Trial expiration
  • Scheduled posts/releases
  • Price updates
  • Inventory sync

If it runs on a schedule and matters to your business, it needs monitoring.


Implementing with Webalert

Webalert makes cron monitoring straightforward:

Create a heartbeat monitor

  1. Add a new monitor
  2. Select "Heartbeat" type
  3. Set your expected schedule (hourly, daily, custom)
  4. Get your unique ping URL

Add to your cron job

# Your actual job; ping only if it exits successfully
/path/to/script.sh && curl -fsS https://ping.web-alert.io/your-unique-id

Get alerted on failure

If the ping doesn't arrive within your grace period, you get notified via:

  • Email
  • SMS
  • Slack
  • Discord
  • Webhooks

No ping within the grace period means an alert. It's that simple.

See features for full details and pricing for plans.


The Backup Cron That Saved the Day

A pattern we hear often:

"We set up monitoring for our database backup cron. Two weeks later, we got an alert — backup failed due to disk space. Fixed it in 10 minutes. Three days after that, our database corrupted. The backup from 10 minutes before the corruption saved us. Without that alert, we would have had no backup."

This is why you monitor cron jobs. Not because failures are common, but because when they matter, they really matter.


Final Thoughts

Your website might be up, but your business runs on background tasks. Backups, emails, data processing, cleanup — all invisible until they break.

The scariest failures are the ones you don't know about.

Add monitoring to every critical cron job. It takes 5 minutes to set up and could save your business when it counts.

Don't wait until you need that backup to find out it hasn't run in weeks.

