December 20, 2025
Paging Panic vs Error Etiquette
Log level 'error' should mean that something needs to be fixed
Stop crying wolf in logs—save ERROR for real breakages
TLDR: A sysadmin called out a tool for labeling routine events as “error,” sparking a fight over what ERROR should mean. Most commenters want ERROR reserved for fix-now failures, because false alarms make real problems harder to spot, while others say the right level depends on how the system is deployed.
The internet’s ops crowd just staged an intervention for the word ERROR. After a grumpy post on the Fediverse roasted Prometheus Blackbox 0.28.0 for labeling routine checks as “error,” sysadmins rallied with pitchforks and pagers. The hottest take: if it doesn’t need fixing, it isn’t an error. Think of log levels like urgency labels: “error” means “drop everything and fix this,” not “FYI, the world is fine.” One commenter said if a program logs an error, it’s crash-worthy—only spared because it’s inside a bigger system that refuses to go down. Drama meter: high.
Then came the split. The “zero tolerance” camp declared that any error should page them at 3 a.m. (“cry wolf and you’re off my servers”). Another camp, led by folks listing examples like database timeouts and downstream service meltdowns, argued those aren’t local defects—log them as warnings or info, track metrics, and move on. A third group added nuance: sometimes a “can’t reach the remote server” is absolutely your problem, depending on how your system is run. Memes flew: “ERROR is for fire alarms, not weather reports,” “Pager PTSD,” and “stop turning logs into DEFCON 1.” Verdict? Keep ERROR sacred or watch real issues drown in noise.
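To make the second and third camps' positions concrete, here is a minimal Python sketch of picking a log level by who can fix the failure. Everything in it is an assumption for illustration: the `DownstreamUnavailable` exception, the `remote_is_ours` flag, and the in-memory `failure_counts` counter (a stand-in for a real metrics system) come from this sketch, not from the post or from any real tool.

```python
import logging
from collections import Counter

logger = logging.getLogger("probe")
failure_counts = Counter()  # stand-in for a real metrics counter


class DownstreamUnavailable(Exception):
    """Hypothetical: a remote dependency timed out or refused the request."""


def level_for(exc, remote_is_ours=False):
    """Pick a log level based on whether someone here needs to fix something."""
    if isinstance(exc, DownstreamUnavailable):
        # External failure: only an ERROR if this deployment owns the remote side.
        return logging.ERROR if remote_is_ours else logging.WARNING
    # Anything else is treated as a local defect that needs fixing.
    return logging.ERROR


def report(exc, remote_is_ours=False):
    """Count the failure for metrics, then log it at the chosen level."""
    level = level_for(exc, remote_is_ours)
    failure_counts[type(exc).__name__] += 1
    logger.log(level, "operation failed: %s", exc)
    return level
```

With this shape, a database timeout in someone else's service still shows up in the counts and as a warning, while the same timeout in a deployment that owns the database is logged as an error.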
Key Points
- Error-level logs should indicate fixable, program-level faults affecting the local system’s operation.
- Routine or external operation-level failures should not be logged as errors; use warning or info instead.
- The author declines to upgrade Prometheus Blackbox from v0.27.0 to v0.28.0 due to error-level logging of routine events and has reported the issue.
- Programs working as designed should not emit error-level messages during normal operation.
- If logs are primarily for debugging, provide options to disable them to avoid polluting system logs.
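The last point, making debugging output opt-in, might look like the following Python sketch. The `APP_DEBUG_LOGS` environment variable and the `app` logger name are made up for this example; a real tool would more likely expose a flag or config setting of its own.

```python
import logging
import os


def configure_logging(env=None):
    """Keep debug chatter off unless the operator explicitly asks for it."""
    env = os.environ if env is None else env
    # Hypothetical switch: debug output is enabled only when set to "1".
    debug_on = env.get("APP_DEBUG_LOGS", "0") == "1"
    logger = logging.getLogger("app")
    # Default to WARNING so routine events stay out of system logs.
    logger.setLevel(logging.DEBUG if debug_on else logging.WARNING)
    return logger
```

Calling `configure_logging()` with the variable unset leaves only warnings and errors enabled, so routine events never reach the system log unless someone turns them on.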