
Your website is up. Your API responds. Everything looks green.
But somewhere in the background, a cron job failed silently three days ago. Your email queue stopped processing. Your database backup didn't run. Your analytics aggregation is a week behind.
You won't know until someone asks why they never got that email. Or worse — until you need that backup.
Background tasks are the silent killers of reliability. In this guide, we'll cover how to monitor them properly.
Why Cron Jobs Fail Silently
Unlike your website (which users immediately complain about), cron jobs fail in silence:
No one is watching
When a web request fails, the user sees an error. When a cron job fails, no one sees anything. The job just... doesn't happen.
Logs get buried
Your cron output probably goes to a log file that nobody checks. Or worse, to /dev/null. Errors vanish into the void.
Failures aren't always obvious
A job might "succeed" (exit code 0) but produce wrong results. It processed 0 records instead of 10,000. Technically it ran. Actually it failed.
Dependencies change
The cron job worked for months. Then someone updated a library, changed a config, or moved a file. The job started failing, but nothing alerted anyone.
The Real Cost of Missed Cron Jobs
Silent cron failures cause expensive problems:
Data processing gaps
Your daily report aggregation stops running. A week later, someone notices the dashboard is stale. Now you need to backfill a week of data — if you can.
Missed communications
Email digests, notification batches, reminder emails — all powered by background jobs. When they fail, customers think you've gone silent.
Backup failures
The worst-case scenario: your backup cron has been failing for weeks. You only discover this when you need to restore.
Stale cache/data
Cache warmers, data syncs, and cleanup jobs keep your system healthy. When they stop, performance degrades slowly — hard to trace back to a failed cron.
Compliance violations
Scheduled compliance reports, data retention jobs, audit log exports — missing these isn't just inconvenient, it's potentially illegal.
The Three Types of Cron Monitoring
There are fundamentally three ways to monitor background tasks:
1. Heartbeat monitoring (ping-based)
How it works: Your cron job pings a URL when it completes. If the ping doesn't arrive within the expected window, you get alerted.
Best for:
- Scheduled tasks that run on a predictable schedule
- Jobs where completion is more important than exit status
- Simple "did it run?" monitoring
Example:
# Crontab entry: ping only if the backup script exits successfully
0 * * * * /path/to/backup.sh && curl -s https://your-monitor.com/ping/abc123
2. Exit code monitoring
How it works: A wrapper captures the job's exit code and reports success/failure.
Best for:
- Distinguishing between "didn't run" and "ran but failed"
- Jobs where the exit code helps you work out why a run failed
Example:
0 * * * * /path/to/wrapper.sh /path/to/backup.sh
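The wrapper itself is not shown here, so the following is only a minimal sketch of what one might look like. It assumes a monitor that accepts a success ping at a base URL and a failure ping at the same URL with /fail appended (the pattern used later in this guide). The names PING_URL, ping_url_for, and run_and_report are hypothetical.

```shell
#!/bin/bash
# wrapper.sh (hypothetical): run a command, then report its exit code.
# PING_URL is an assumed placeholder; substitute your monitor's real URL.
PING_URL="${PING_URL:-https://your-monitor.com/ping/abc123}"

# Map an exit code to the URL that should be pinged.
ping_url_for() {
    if [ "$1" -eq 0 ]; then
        echo "$PING_URL"
    else
        echo "$PING_URL/fail"
    fi
}

# Run the wrapped command, ping the matching URL, pass the exit code through.
run_and_report() {
    local status=0
    "$@" || status=$?
    # A failed ping must not mask the job's own exit code.
    curl -fsS --max-time 10 --retry 3 "$(ping_url_for "$status")" >/dev/null 2>&1 || true
    return "$status"
}

# Only act when a command was given, e.g. wrapper.sh /path/to/backup.sh
if [ "$#" -gt 0 ]; then
    run_and_report "$@"
    exit $?
fi
```

Because the wrapper returns the job's own exit code, cron and any calling script still see the original result.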
3. Output/result monitoring
How it works: Monitor the actual output or side effects of the job.
Best for:
- Jobs that can "succeed" but produce wrong results
- Critical jobs where you need full visibility
Example: Check that the backup file exists and is larger than 0 bytes.
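As a rough sketch of that check, the helper below treats a backup as valid only if the file exists and meets a minimum size. The function name backup_looks_valid and the optional minimum-size argument are assumptions added for illustration. The default of 1 byte matches the "larger than 0 bytes" rule above.

```shell
#!/bin/bash
# Hypothetical output check: succeed only if the file exists and
# contains at least min_bytes bytes (default: 1, i.e. non-empty).
backup_looks_valid() {
    local file="$1"
    local min_bytes="${2:-1}"
    local size
    # A missing file is always a failure.
    [ -f "$file" ] || return 1
    # Arithmetic expansion strips any padding that wc adds on some systems.
    size=$(($(wc -c < "$file")))
    [ "$size" -ge "$min_bytes" ]
}
```

A backup script could then send its success ping only when a call such as `backup_looks_valid /backups/daily.sql` returns true.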
Setting Up Heartbeat Monitoring
The most practical approach for most teams is heartbeat monitoring. Here's how to implement it:
Step 1: Create a monitor for each critical job
You need one monitoring endpoint per cron job. Each has:
- A unique URL to ping
- An expected schedule (e.g., "every hour")
- A grace period for late runs
- Alert settings
Step 2: Add the ping to your cron jobs
At the end of your script, ping the monitoring URL:
#!/bin/bash
# backup.sh
# Run the actual backup logic; ping only if it succeeded
if pg_dump mydb > /backups/daily.sql; then
    curl -fsS --retry 3 https://your-monitor.com/ping/backup-123
fi
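The -f flag makes curl exit with a non-zero status on HTTP errors, -s hides the progress meter, and -S still prints an error message if the request fails. The --retry 3 option retries up to three times on transient errors such as temporary network issues.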
Step 3: Ping on start (optional)
For long-running jobs, ping when starting AND completing:
#!/bin/bash
# Signal start
curl -fsS https://your-monitor.com/ping/job-123/start
# Your job; signal completion only if it exits successfully
python long_running_task.py && curl -fsS https://your-monitor.com/ping/job-123
This lets you detect jobs that started but never finished.
Step 4: Handle failures explicitly
Don't let failures go unreported:
#!/bin/bash
run_backup() {
    pg_dump mydb > /backups/daily.sql
}
if run_backup; then
    curl -fsS https://your-monitor.com/ping/backup-123
else
    curl -fsS https://your-monitor.com/ping/backup-123/fail
    exit 1
fi
Cron Monitoring Best Practices
Set realistic grace periods
Your hourly job probably doesn't run at exactly :00. Account for:
- System load variations
- Network latency
- Previous job overruns
For an hourly job, a 5-10 minute grace period is usually enough to prevent false alarms. Longer or less predictable jobs may need more.
Monitor job duration
A job that usually takes 5 minutes suddenly taking 2 hours is a problem — even if it eventually completes. Track duration trends.
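One lightweight way to start tracking duration is to time the job inside the script itself. The sketch below assumes a hypothetical helper, run_timed, that prints the elapsed seconds and returns non-zero if the command ran longer than a threshold you choose. How that number reaches your monitoring tool depends on the tool and is not shown.

```shell
#!/bin/bash
# Hypothetical helper: run a command, print how long it took, and
# return non-zero if it failed or exceeded max_seconds.
# Usage: run_timed MAX_SECONDS COMMAND [ARGS...]
run_timed() {
    local max_seconds="$1"
    shift
    local start end elapsed
    local status=0
    start=$(date +%s)
    "$@" || status=$?
    end=$(date +%s)
    elapsed=$((end - start))
    echo "duration_seconds=$elapsed exit_code=$status"
    # Preserve a real failure; otherwise flag an overrun.
    if [ "$status" -ne 0 ]; then
        return "$status"
    fi
    [ "$elapsed" -le "$max_seconds" ]
}
```

For example, `run_timed 600 /path/to/backup.sh` would flag a backup that took more than ten minutes even though it finished successfully.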
Alert the right people
The person who gets the "backup failed" alert should be someone who can:
- Access the server
- Understand the job
- Fix the problem
Don't send all alerts to a generic inbox.
Don't ignore "flapping" jobs
A job that fails, then succeeds, then fails again is telling you something. Investigate intermittent failures before they become permanent.
Test your monitoring
Deliberately fail a job to verify:
- The alert fires
- It goes to the right people
- Someone knows how to fix it
Document your cron jobs
Maintain a list of all scheduled jobs with:
- What they do
- How often they run
- What happens if they fail
- How to fix common failures
Common Cron Job Failure Patterns
The dependency update break
Symptom: Job worked for months, suddenly starts failing
Cause: Library update, changed API, moved file
Fix: Pin dependencies, add validation checks
The silent timeout
Symptom: Job never completes, no error logged
Cause: HTTP timeout, database lock, infinite loop
Fix: Add timeouts, monitor job duration
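Adding a timeout can be as simple as running the job under the timeout command. This sketch assumes GNU coreutils is installed, since its timeout exits with status 124 when the limit is hit. The helper name run_with_limit is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: abort a command that runs longer than the limit.
# Relies on GNU coreutils `timeout`, which exits with 124 on timeout.
# Usage: run_with_limit DURATION COMMAND [ARGS...]
run_with_limit() {
    local limit="$1"
    shift
    local status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after $limit" >&2
    fi
    return "$status"
}
```

A call such as `run_with_limit 30m python long_running_task.py` turns a hung job into an ordinary failure that can be reported. The pings themselves can be bounded in the same spirit with curl's --max-time option.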
The disk space failure
Symptom: Random jobs start failing
Cause: Disk full, can't write temp files
Fix: Monitor disk space, clean up old files
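A job can also check disk usage itself before it starts writing. The following is a sketch under assumptions: the helper name disk_has_room and the choice of threshold are illustrative, and the parsing relies on the POSIX output format of `df -P`.

```shell
#!/bin/bash
# Hypothetical pre-flight check: fail early if a filesystem is too full.
# Usage: disk_has_room PATH MAX_USED_PERCENT
disk_has_room() {
    local path="$1"
    local max_used="$2"
    local used
    # `df -P` prints one line per filesystem; field 5 of line 2 is "NN%".
    used=$(df -P "$path" | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
    [ -n "$used" ] && [ "$used" -le "$max_used" ]
}
```

For instance, a backup script could begin with `disk_has_room /backups 90 || exit 1`, so a nearly full disk shows up as an explicit failure rather than a truncated file.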
The permission change
Symptom: "Permission denied" errors
Cause: File permissions changed, user modified
Fix: Document required permissions, test after changes
The environment variable problem
Symptom: Works manually, fails in cron
Cause: Cron has minimal environment, missing PATH/vars
Fix: Set full paths, explicitly set variables in script
#!/bin/bash
export PATH=/usr/local/bin:/usr/bin:/bin
export DATABASE_URL="postgres://..."
# Now your script works in cron
What to Monitor: A Checklist
Review your scheduled tasks. Which of these do you have?
Data jobs
- Database backups
- Log rotation
- Data exports/imports
- Analytics aggregation
- Search index updates
- Cache warming/clearing
Communication jobs
- Email queue processing
- Notification dispatching
- Report generation
- Newsletter sending
- Webhook retries
Maintenance jobs
- Temp file cleanup
- Session cleanup
- Old data archival
- Certificate renewal
- Health checks
Business logic jobs
- Subscription billing
- Trial expiration
- Scheduled posts/releases
- Price updates
- Inventory sync
If it runs on a schedule and matters to your business, it needs monitoring.
Implementing with Webalert
Webalert makes cron monitoring straightforward:
Create a heartbeat monitor
- Add a new monitor
- Select "Heartbeat" type
- Set your expected schedule (hourly, daily, custom)
- Get your unique ping URL
Add to your cron job
# Your actual job, followed by a ping only if it succeeds
/path/to/script.sh && curl -fsS https://ping.web-alert.io/your-unique-id
Get alerted on failure
If the ping doesn't arrive within your grace period, you get notified via:
- SMS
- Slack
- Discord
- Webhooks
No ping within the grace period means an alert. It's that simple.
See features for full details and pricing for plans.
The Backup Cron That Saved the Day
A story we hear often:
"We set up monitoring for our database backup cron. Two weeks later, we got an alert — backup failed due to disk space. Fixed it in 10 minutes. Three days after that, our database corrupted. The backup from 10 minutes before the corruption saved us. Without that alert, we would have had no backup."
This is why you monitor cron jobs. Not because failures are common, but because when they matter, they really matter.
Final Thoughts
Your website might be up, but your business runs on background tasks. Backups, emails, data processing, cleanup — all invisible until they break.
The scariest failures are the ones you don't know about.
Add monitoring to every critical cron job. It takes 5 minutes to set up and could save your business when it counts.
Don't wait until you need that backup to find out it hasn't run in weeks.