
Hardening Guide

This is a checklist for running dockmesh in production or any multi-user environment. Work through it once at setup, then revisit quarterly.

The installer creates a dedicated system user dockmesh and runs the service as that user, with docker-group membership so the daemon socket is reachable. An exploit in the HTTP or agent handlers therefore lands on a non-root account, not on uid 0.

Running the installer (curl … | sudo bash) and then the first-boot Setup Wizard sets this up automatically. Older root-owned installs are migrated to the non-root layout the next time the installer runs.

To verify:

systemctl show -p User --value dockmesh
# → dockmesh (good)
# → root (pre-migration — re-run the installer)
id dockmesh
# → uid=N(dockmesh) gid=N(dockmesh) groups=N(dockmesh),M(docker)

Docker-group membership is effectively root-equivalent on the host: a user in the docker group can launch a privileged container that mounts / and escape to the host. This is unavoidable when the service has to talk to the Docker daemon. Don’t treat the dockmesh account as a low-trust user; treat it as “root for this one narrow purpose”.

The dockmesh binary stays root-owned (it is a system binary, not something the service account should be able to modify):

chown root:root /usr/local/bin/dockmesh
chmod 755 /usr/local/bin/dockmesh

The data directory (DOCKMESH_DB_PATH parent) should be dockmesh-owned with 700 permissions:

chown -R dockmesh:dockmesh /var/lib/dockmesh
chmod 700 /var/lib/dockmesh
chmod 600 /var/lib/dockmesh/data/*.db

The SQLite database contains the CA private key, session tokens, and encrypted secrets. Do not let it be world-readable.
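The ownership and mode rules above are easy to drift from after a restore or a manual copy. A minimal sketch of a check, assuming GNU stat (the -c flag) and the default /var/lib/dockmesh layout; the helper name check_mode is hypothetical and not part of dockmesh:

```shell
# Print a warning for every path whose octal mode differs from the expected one.
# Usage: check_mode <expected-mode> <path>...
# Returns non-zero if any path is missing or has a different mode.
check_mode() {
  local expected="$1"; shift
  local path actual status=0
  for path in "$@"; do
    # GNU stat: %a prints the permission bits in octal (e.g. 700)
    actual="$(stat -c '%a' "$path")" || { status=1; continue; }
    if [ "$actual" != "$expected" ]; then
      echo "WARN: $path is $actual, expected $expected"
      status=1
    fi
  done
  return "$status"
}
```

For example, running check_mode 700 /var/lib/dockmesh and check_mode 600 /var/lib/dockmesh/data/*.db as root prints nothing when the permissions match the values above.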

The unit the installer writes already includes:

[Service]
User=dockmesh
Group=docker
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
ReadWritePaths=/var/lib/dockmesh /var/run/docker.sock
RestrictNamespaces=true
LockPersonality=true

Older units can be upgraded by re-running curl -fsSL https://get.dockmesh.dev | sudo bash — the installer patches the unit in-place and restarts the service.
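To confirm that a running unit actually carries these directives, the unit text can be compared against a short expected list. This is a sketch only: the helper name missing_directives is hypothetical, it checks a subset of the directives shown above, and it expects each directive on its own line with no trailing whitespace.

```shell
# Read unit text on stdin and print each expected hardening directive
# that does not appear as a whole line. Prints nothing if all are present.
missing_directives() {
  local unit directive
  unit="$(cat)"
  for directive in NoNewPrivileges=true ProtectSystem=strict ProtectHome=true \
                   PrivateTmp=true RestrictNamespaces=true LockPersonality=true; do
    # -x: match the whole line, -F: fixed string, -q: no output
    printf '%s\n' "$unit" | grep -qxF "$directive" || echo "$directive"
  done
}
```

Typical use is systemctl cat dockmesh | missing_directives. For a broader view, systemd-analyze security dockmesh prints systemd's own exposure assessment of the unit.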

Set a strong admin password during the Setup Wizard — there is no shipped admin/admin default. Better still: configure SSO and delete the local admin once SSO works.

Enforce 2FA for the admin role under Authentication → Sessions & sign-in flow → Require 2FA for admin role. The setting persists as auth.require_tfa_for_admin and applies from the next sign-in.

Enable the embedded Caddy reverse proxy (see Reverse Proxy) and bind dockmesh to 127.0.0.1:8080. Caddy terminates TLS on 443.

If you use an external load balancer (Cloudflare, AWS ALB, nginx), set DOCKMESH_HTTP_ADDR=127.0.0.1:8080 so dockmesh never listens on a public IP directly.
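Whether the bind address took effect can be checked from the socket table. The following sketch assumes iproute2's ss and its default column order (local address in the fourth column); the helper name non_loopback_listeners is hypothetical.

```shell
# Read `ss -H -tln` output on stdin and print every listener on the given
# port that is bound to something other than loopback.
# Usage: ss -H -tln | non_loopback_listeners 8080
non_loopback_listeners() {
  awk -v port="$1" '
    {
      addr = $4                          # "Local Address:Port" column
      n = split(addr, parts, ":")
      if (parts[n] != port) next         # last ":"-separated field is the port
      host = substr(addr, 1, length(addr) - length(parts[n]) - 1)
      if (host != "127.0.0.1" && host != "[::1]") print addr
    }'
}
```

Empty output for port 8080 means dockmesh is reachable only through the local proxy. Any printed address (for example 0.0.0.0:8080) means the service is still listening on a non-loopback interface.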

The agent protocol is mTLS by default and cannot be disabled. Rotate the CA if you suspect compromise:

# Snapshot the DB first — the CA private key lives inside it
sudo install -m 0600 -o dockmesh -g dockmesh /var/lib/dockmesh/data/dockmesh.db /var/lib/dockmesh/data/dockmesh.db.bak
# Rotate
sudo dockmesh ca rotate --reissue-all-agents

All agents re-enroll on next connect. Old certs are revoked.

On the server host:

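| Port | Direction | Source | Purpose |
|------|-----------|--------|---------|
| 443 | in | Public (if UI is internet-facing) or VPN subnet | HTTPS UI |
| 8443 | in | Agent IPs only | Agent mTLS |
| 80 | in | Public | ACME challenge (only if using auto-TLS) |
| 22 | in | Admin IPs only | SSH |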

Block everything else. Example ufw:

ufw default deny incoming
ufw allow from <VPN-subnet> to any port 443
ufw allow from <agent-subnet> to any port 8443
ufw allow from <admin-ip> to any port 22
ufw enable

Agents make outbound-only connections, so agent hosts need no inbound ports. If you have an inbound firewall rule for them, remove it.

If stacks are Git-backed, don’t commit plaintext passwords. Options:

  • SOPS with age encryption — files are encrypted in Git, decrypted at deploy time
  • Docker native secrets with an external store (Vault, AWS Secrets Manager)
  • dockmesh env var secrets — encrypted at rest in the dockmesh DB, never in Git
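Whichever option you pick, a crude scan of the stack repository can catch plaintext values before they are committed. The sketch below is a heuristic, not a dockmesh feature: the helper name find_plaintext_secrets is hypothetical, the key patterns are illustrative, and it assumes encrypted values are written in the ENC[...] form that SOPS uses for dotenv files.

```shell
# Print "file:line" for assignments whose key looks secret-like and whose
# value is non-empty and not an ENC[...] (SOPS-style) ciphertext.
# Usage: find_plaintext_secrets <directory>
find_plaintext_secrets() {
  grep -rnEi '^[A-Za-z0-9_]*(PASSWORD|SECRET|TOKEN)[A-Za-z0-9_]*=.+' "$1" \
    | grep -v '=ENC\[' \
    | cut -d: -f1,2
}
```

Running it from a pre-commit hook and failing the commit on any output is one way to use it. Expect false positives and false negatives, so treat it as a safety net, not a guarantee.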

Rotate these on a schedule:

| Secret | Real lifetime | Notes |
|--------|---------------|-------|
| Agent mTLS certs | 1 year, manual rotation | No auto-renewal yet. Drift is visible in the Hosts page; re-enrol the agent before expiry. Auto-renewal is on the roadmap. |
| API tokens | Operator-defined expiry per token | No org-wide rotation forcer today. Set realistic TTLs at creation. |
| Refresh tokens | Configurable in Authentication → Sessions (default 24h absolute, 60min idle) | Access tokens are 15min JWTs auto-refreshed by the UI / dmctl. |
| SSO client secrets | Per-IdP policy | Update the value in the provider record under Authentication. |
| SMTP password | On staff change | Notification-channel credentials are encrypted at rest with the same age key as stack .env. |
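Because agent certificates are rotated manually, it can help to compute the remaining lifetime of a certificate file from a cron job. This is a sketch under stated assumptions: the certificate path is passed in by the caller (this guide does not say where the agent stores it), the helper names days_left and cert_days_left are hypothetical, and date -d requires GNU date.

```shell
# Whole days between two epoch timestamps (negative once expired).
# Usage: days_left <not-after-epoch> <now-epoch>
days_left() {
  echo $(( ($1 - $2) / 86400 ))
}

# Days until the PEM certificate at the given path expires.
# Assumes openssl and GNU date are installed.
cert_days_left() {
  local not_after
  # openssl prints "notAfter=<date>"; keep only the date part
  not_after="$(openssl x509 -enddate -noout -in "$1" | cut -d= -f2)"
  days_left "$(date -d "$not_after" +%s)" "$(date +%s)"
}
```

A job that alerts when cert_days_left falls below 30, for example, leaves time to re-enrol the agent before the one-year lifetime runs out.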

Use age-encrypted backups (see the Backup docs). Store the passphrase in a password manager separate from the dockmesh instance.

A backup you haven’t restored isn’t a backup. Schedule a quarterly test restore to a scratch host.

At least one backup target should be off the same host (and ideally off the same provider). Local backup + S3/B2 is the common pattern.

Enrollment tokens are powerful. Treat them like root SSH keys:

  • One-time use (dockmesh enforces this)
  • Transmit over encrypted channel (SSH, 1Password shared vault, Signal)
  • Never commit to Git or pipeline config
  • Rotate the server’s enrollment-signing key annually

dockmesh doesn’t replace Docker security best practices:

  • Run containers as non-root where possible (USER in Dockerfile)
  • Drop capabilities (cap_drop: [ALL], add only what’s needed)
  • Use read-only root filesystem (read_only: true + explicit write volumes)
  • Set resource limits (cpus, mem_limit) on every container
  • Use no-new-privileges: true in compose
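For a one-off container started outside compose, the same settings map onto docker run flags. The wrapper below is an illustrative sketch: the function name hardened_run, the uid 1000:1000, and the 1 CPU / 512m limits are placeholder assumptions to adjust per workload.

```shell
# Start a container with the hardening options from the list above.
# Usage: hardened_run <image> [command...]
hardened_run() {
  local image="$1"; shift
  # --user: non-root uid:gid          --cap-drop ALL: drop every capability
  # --read-only + --tmpfs: read-only root with one explicit writable path
  # --security-opt: forbid privilege escalation via setuid binaries
  # --cpus / --memory: resource limits
  docker run --rm \
    --user 1000:1000 \
    --cap-drop ALL \
    --read-only --tmpfs /tmp \
    --security-opt no-new-privileges \
    --cpus 1 --memory 512m \
    "$image" "$@"
}
```

For example, hardened_run alpine:3 id should report the non-root uid. Images that need to write outside /tmp or need a specific capability will fail until you add a volume or a matching --cap-add.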

The image scanner in dockmesh catches known CVEs but not runtime misconfigurations. Use a separate runtime scanner (Falco, Tracee) for those.

  • Enable the Audit Log webhook to ship events to your SIEM
  • Alert on: failed SSO attempts, mTLS handshake failures, audit chain breaks
  • Review the audit log weekly for unexpected admin actions

Quarterly:

  • Who has Admin? Why?
  • Which API tokens exist? Who owns each? Still needed?
  • Expired or unused role assignments
  • Hosts that haven’t connected in 30 days (stale agents)
  • Stacks running on EOL’d base images