Alerts
dockmesh evaluates alert rules against the metrics pipeline every 30 seconds. When a rule fires, notifications go out via the configured channels.
Rule anatomy
A rule has:
| Field | Example or description |
|---|---|
| Name | Prod CPU > 80% |
| Metric | cpu_percent, memory_percent, container_restarts, disk_used_percent, … |
| Scope | Container / Stack / Host / Host tag |
| Condition | > 80, < 10, == 0, increased by 5 in 10m |
| Window | 5m, 15m, 1h |
| Severity | info, warning, critical |
| Cooldown | Time between re-alerts on same resource |
| Channels | Where to send |
Creating a rule
Alerts → Rules → New rule walks through the fields. A live preview shows how many resources currently match the scope and would trigger.
Example: “Alert if any container in stacks tagged prod restarts more than 3 times in 10 minutes, notify Slack + PagerDuty, cooldown 30m, severity critical.”
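The example rule can be modelled roughly as follows. This is a minimal sketch, not dockmesh's actual schema: the `Rule` dataclass, its field names, and the `counter_increase` and `fires` helpers are hypothetical, and it assumes `container_restarts` is a counter that only grows and is sampled with timestamps.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Rule:
    # Hypothetical field names that mirror the rule anatomy table.
    name: str
    metric: str
    scope: str
    threshold: float
    window_s: int
    severity: str
    cooldown_s: int
    channels: List[str] = field(default_factory=list)


def counter_increase(samples: List[Tuple[float, float]], now: float, window_s: int) -> float:
    """Return how much a counter grew during the last `window_s` seconds.

    `samples` holds (unix_timestamp, counter_value) pairs, oldest first.
    """
    in_window = [value for ts, value in samples if now - window_s <= ts <= now]
    if len(in_window) < 2:
        return 0.0
    return in_window[-1] - in_window[0]


def fires(rule: Rule, samples: List[Tuple[float, float]], now: float) -> bool:
    """True if the counter grew by more than the rule's threshold inside the window."""
    return counter_increase(samples, now, rule.window_s) > rule.threshold


prod_restarts = Rule(
    name="Prod container restarts",
    metric="container_restarts",
    scope="stacks tagged prod",
    threshold=3,
    window_s=10 * 60,
    severity="critical",
    cooldown_s=30 * 60,
    channels=["slack", "pagerduty"],
)

# Restart counter sampled at t=0s, 300s and 600s: it grew from 1 to 5.
samples = [(0.0, 1.0), (300.0, 3.0), (600.0, 5.0)]
print(fires(prod_restarts, samples, now=600.0))  # True: 4 restarts > 3
```

Measuring the increase of a counter, rather than its absolute value, is what lets a condition like "increased by 5 in 10m" ignore restarts that happened before the window.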
Severity levels
Each severity has its own icon, color, and default cooldown:
| Level | Color | Default cooldown |
|---|---|---|
| Info | Blue | 4h |
| Warning | Amber | 30m |
| Critical | Red | 5m |
Channels can be filtered by severity — e.g. Slack gets all, PagerDuty only critical.
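One way to picture severity filtering is a minimum-severity threshold per channel. The sketch below assumes that model; the real filter may instead be an arbitrary set of severities, and `channels_for` and the channel names are hypothetical.

```python
from typing import Dict, List

SEVERITY_ORDER = {"info": 0, "warning": 1, "critical": 2}

# Default cooldowns from the table above, in seconds.
DEFAULT_COOLDOWN_S = {"info": 4 * 3600, "warning": 30 * 60, "critical": 5 * 60}


def channels_for(severity: str, channel_min_severity: Dict[str, str]) -> List[str]:
    """Return the channels whose minimum severity is at or below `severity`."""
    level = SEVERITY_ORDER[severity]
    return sorted(
        name
        for name, minimum in channel_min_severity.items()
        if SEVERITY_ORDER[minimum] <= level
    )


# Slack gets everything, PagerDuty only critical alerts.
filters = {"slack": "info", "pagerduty": "critical"}
print(channels_for("warning", filters))   # ['slack']
print(channels_for("critical", filters))  # ['pagerduty', 'slack']
```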
Notification channels
Built-in channels (Settings → Channels):
- Email — SMTP host + credentials, supports STARTTLS
- Slack — incoming-webhook URL
- Discord — webhook URL
- Microsoft Teams — Incoming Webhook connector URL
- ntfy.sh — topic URL, optional auth
- Gotify — server URL + app token
- Generic webhook — POST JSON to any URL
- PagerDuty — Events API v2 integration key. Dedup-key is derived from rule+container so repeated fires fold into one PD incident.
- Pushover — app_token + user_key, optional device + sound. Critical alerts map to priority 1 (visual alert); emergency priority 2 is intentionally not exposed (would need ack-handling the UI doesn’t have yet).
See individual integration guides for per-channel setup. Telegram is not built-in — use the generic webhook against the Telegram Bot API if you need it.
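To illustrate the PagerDuty dedup behaviour mentioned in the list above, here is a sketch of an Events API v2 trigger event. The top-level fields (`routing_key`, `event_action`, `dedup_key`, `payload`) are part of the Events API v2; how dockmesh actually derives the key is not documented here, so hashing `rule_id:container_id` is an assumption, as are the function name and the identifiers.

```python
import hashlib
import json


def pagerduty_event(routing_key: str, rule_id: str, container_id: str,
                    summary: str, severity: str) -> dict:
    """Build a trigger event for POST https://events.pagerduty.com/v2/enqueue."""
    # Assumption: the dedup key is a stable hash of rule and container identifiers.
    dedup_key = hashlib.sha256(f"{rule_id}:{container_id}".encode()).hexdigest()
    return {
        "routing_key": routing_key,  # the integration key configured on the channel
        "event_action": "trigger",
        "dedup_key": dedup_key,
        "payload": {
            "summary": summary,
            "source": container_id,
            "severity": severity,  # Events API v2 accepts critical, error, warning, info
        },
    }


first = pagerduty_event("KEY", "rule-restarts", "web-1", "web-1 restarted 4 times", "critical")
again = pagerduty_event("KEY", "rule-restarts", "web-1", "web-1 restarted 6 times", "critical")
print(first["dedup_key"] == again["dedup_key"])  # True
print(json.dumps(first, indent=2))
```

Because both events carry the same `dedup_key`, PagerDuty treats the second as an update to the open incident instead of opening a new one.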
Cooldown
Without cooldown, a container that keeps crashing would spam alerts every 30 seconds. Cooldown suppresses duplicate alerts on the same resource for the configured window. If the underlying condition clears and later fires again, you get a new alert.
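The cooldown behaviour can be sketched as a small tracker keyed by rule and resource. This is an illustration under assumptions, not dockmesh's implementation: `CooldownTracker` and its method names are hypothetical, and it assumes that clearing the condition resets the cooldown for that resource.

```python
from typing import Dict, Tuple


class CooldownTracker:
    def __init__(self) -> None:
        # (rule_id, resource_id) -> timestamp of the last notification sent
        self._last_sent: Dict[Tuple[str, str], float] = {}

    def should_notify(self, rule_id: str, resource_id: str, now: float, cooldown_s: float) -> bool:
        """Return True and record the send time unless the pair is still cooling down."""
        key = (rule_id, resource_id)
        last = self._last_sent.get(key)
        if last is not None and now - last < cooldown_s:
            return False
        self._last_sent[key] = now
        return True

    def clear(self, rule_id: str, resource_id: str) -> None:
        """Forget the pair once its condition clears, so the next fire notifies again."""
        self._last_sent.pop((rule_id, resource_id), None)


tracker = CooldownTracker()
print(tracker.should_notify("restarts", "web-1", now=0, cooldown_s=1800))   # True
print(tracker.should_notify("restarts", "web-1", now=30, cooldown_s=1800))  # False
tracker.clear("restarts", "web-1")
print(tracker.should_notify("restarts", "web-1", now=60, cooldown_s=1800))  # True
```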
Mute rules
Alerts → Mute rules lets you temporarily silence alerts matching a filter:
- Mute everything on host prod-01 for 2 hours (during maintenance)
- Mute warnings from stack experimental-app until Friday
- Mute a specific rule permanently (no notifications go out, as with disabling it, but the rule keeps evaluating)
Mutes are a separate concept from disabling — disabled rules don’t evaluate at all; muted rules evaluate but don’t notify.
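The difference between muting and disabling can be shown with a small model. Everything here is a hypothetical sketch: the `Mute` dataclass, its filter fields, and `handle` are illustrative names, and alerts are represented as plain dictionaries.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Mute:
    expires_at: Optional[float]  # None means the mute is permanent
    host: Optional[str] = None
    stack: Optional[str] = None
    severity: Optional[str] = None
    rule: Optional[str] = None

    def matches(self, alert: dict, now: float) -> bool:
        """True if the mute is still active and every filter that is set matches."""
        if self.expires_at is not None and now >= self.expires_at:
            return False
        for key in ("host", "stack", "severity", "rule"):
            wanted = getattr(self, key)
            if wanted is not None and alert.get(key) != wanted:
                return False
        return True


def handle(rule_enabled: bool, condition_met: bool, alert: dict,
           mutes: List[Mute], now: float) -> Tuple[bool, bool]:
    """Return (evaluated, notified) for one rule on one resource."""
    if not rule_enabled:
        return (False, False)  # disabled rules are never evaluated
    if not condition_met:
        return (True, False)
    muted = any(m.matches(alert, now) for m in mutes)
    return (True, not muted)   # muted rules evaluate but do not notify


maintenance = Mute(expires_at=2 * 3600, host="prod-01")
alert = {"host": "prod-01", "stack": "shop", "severity": "critical", "rule": "Prod CPU > 80%"}
print(handle(True, True, alert, [maintenance], now=600))       # (True, False)
print(handle(True, True, alert, [maintenance], now=3 * 3600))  # (True, True)
print(handle(False, True, alert, [], now=600))                 # (False, False)
```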
Alert history
Alerts → History shows every alert fired, with:
- Timestamp
- Rule name
- Resource (container / stack / host)
- Severity
- Value that tripped the threshold
- Which channels received it
- Resolution time (if auto-resolved)
Export to CSV for compliance or post-mortems.
Built-in rules
dockmesh ships with four container-level defaults on first install so every new deployment has coverage from day one:
| Rule | Metric | Threshold | Duration | Severity |
|---|---|---|---|---|
| Container CPU > 90% (sustained) | cpu_percent | gt 90 | 5 min | warning |
| Container CPU > 95% (critical) | cpu_percent | gt 95 | 15 min | critical |
| Container memory > 90% | mem_percent | gt 90 | 5 min | warning |
| Container memory > 98% (near-OOM) | mem_percent | gt 98 | 60s | critical |
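The Duration column can be read together with the 30-second evaluation interval. The sketch below assumes "sustained" means every sample inside the duration exceeds the threshold; the real evaluator might average or use another aggregate. `sustained_above` is a hypothetical helper.

```python
from typing import Sequence

EVAL_INTERVAL_S = 30  # evaluation cadence stated at the top of this page


def sustained_above(values: Sequence[float], threshold: float, duration_s: int) -> bool:
    """True if the most recent samples covering `duration_s` all exceed `threshold`.

    `values` holds one sample per evaluation, oldest first.
    """
    needed = max(1, duration_s // EVAL_INTERVAL_S)
    if len(values) < needed:
        return False  # not enough history yet to cover the duration
    return all(v > threshold for v in values[-needed:])


# "Container CPU > 90% (sustained)": gt 90 for 5 min, i.e. 10 consecutive samples.
print(sustained_above([93.0] * 10, threshold=90, duration_s=5 * 60))           # True
print(sustained_above([93.0] * 9 + [88.0], threshold=90, duration_s=5 * 60))   # False

# "Container memory > 98% (near-OOM)": gt 98 for 60s, i.e. 2 consecutive samples.
print(sustained_above([97.0, 99.0, 99.5], threshold=98, duration_s=60))        # True
```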
Built-in rules are flagged with a “built-in” badge in the Alerts table. They can be edited (change threshold, duration, mute, attach channels) and disabled, but not deleted — disabling them is the supported way to opt out. Deletion returns 409 Conflict from the API.
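The delete behaviour can be modelled as below. Only the 409 for built-in rules comes from this page; the 404 and 204 status codes, the `builtin` and `enabled` keys, and the function names are assumptions made for the sketch, not the documented API.

```python
from typing import Dict


def delete_rule(rules: Dict[str, dict], rule_id: str) -> int:
    """Return the HTTP status a delete request would produce in this model."""
    rule = rules.get(rule_id)
    if rule is None:
        return 404  # assumption: unknown rule
    if rule.get("builtin"):
        return 409  # built-in rules cannot be deleted
    del rules[rule_id]
    return 204      # assumption: successful delete returns no content


def opt_out(rules: Dict[str, dict], rule_id: str) -> None:
    """Delete a rule, or disable it when deletion is refused with 409."""
    if delete_rule(rules, rule_id) == 409:
        rules[rule_id]["enabled"] = False


rules = {
    "cpu-90": {"builtin": True, "enabled": True},
    "prod-restarts": {"builtin": False, "enabled": True},
}
opt_out(rules, "cpu-90")         # stays in the table, now disabled
opt_out(rules, "prod-restarts")  # removed entirely
print(rules)  # {'cpu-90': {'builtin': True, 'enabled': False}}
```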
Host-level rules (disk, agent-offline, backup-job-failed) need per-host metrics that aren’t emitted yet — they’ll ship with follow-up slices that add the collectors.
See also
- Integrations · Slack — webhook setup
- Integrations · Discord — webhook setup
- Integrations · Telegram — bot setup
- Integrations · ntfy.sh — self-hosted push