Agent Protocol

Most users never need to understand the wire protocol. But when debugging connectivity, planning network architecture, or evaluating the security model, knowing how it works matters.

  • Transport: WebSocket over TLS (wss://)
  • Authentication: Mutual TLS (both sides present certs)
  • Direction: Outbound-only from the agent
  • Wire format: JSON frames inside the WebSocket — one frame envelope, request/response correlated by ID

Every dockmesh server has its own internal CA. On first boot it generates an ECDSA P-256 CA keypair and a matching server leaf cert. Both land as PEM files in the data directory next to the SQLite DB:

| File | Purpose |
| --- | --- |
| agents-ca.crt | CA public certificate, 10-year validity |
| agents-ca.key | CA private key (mode 0400) |
| agents-server.crt | Server cert for the :8443 listener, 5-year validity |
| agents-server.key | Matching server private key (mode 0400) |

No separate database encryption layer protects the CA key — its security relies on filesystem permissions (0700 on the data dir + 0400 on the key file). Back up the data directory to back up the CA.

When you add a host:

  1. Server generates a one-time bootstrap token (cryptographically random, hashed before storage) and stores it on the new agent row in pending status.
  2. The UI shows the install command — curl -fsSL https://<server>/install/agent.sh?token=<token> | sudo bash — which you paste on the target host.
  3. The install script fetches the agent binary from the same server, drops a systemd unit, and starts the agent.
  4. The agent calls the enrolment endpoint with the token. The server matches the hash, sees pending, generates an ECDSA P-256 client keypair, signs a cert with the CA (1-year validity, agent’s UUID as Common Name), and returns cert + key + CA bundle.
  5. The agent writes those into its data dir as agent.crt, agent.key, ca.crt, plus the dial URL into agent.url.
  6. The token is invalidated server-side. The agent now uses its client cert for every subsequent connection.

Token TTL: one-time use, no time-based expiry. If you don’t use it in the next 10 minutes you can still use it next week. Rotate it manually if you suspect it leaked (Agents page → host → Rotate enrolment token).

The agent dials wss://<server>:8443/connect as soon as its service starts:

  1. TLS handshake — agent presents agent.crt, server verifies against the CA, server presents agents-server.crt which the agent verifies against the pinned ca.crt.
  2. WebSocket upgrade — HTTP/1.1 Upgrade.
  3. agent.hello — agent announces itself with its UUID, version, hostname, Docker version, OS info.
  4. server.welcome — server records the connection, marks the host online.
  5. Ready — request frames flow.
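Step 1 of the handshake corresponds to a client-side TLS configuration along these lines. This sketch uses only the standard library; the function name `agentTLSConfig` and the TLS 1.2 minimum are assumptions. Pinning is achieved by putting only the enrolment CA into `RootCAs`, so system trust roots are not consulted.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// agentTLSConfig builds a mutual-TLS client config from the PEM contents of
// agent.crt, agent.key and ca.crt.
func agentTLSConfig(certPEM, keyPEM, caPEM []byte) (*tls.Config, error) {
	pair, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return nil, fmt.Errorf("load client keypair: %w", err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("ca.crt contains no usable certificate")
	}
	return &tls.Config{
		Certificates: []tls.Certificate{pair}, // presented to the server
		RootCAs:      pool,                    // only the pinned CA is trusted
		MinVersion:   tls.VersionTLS12,        // assumed floor
	}, nil
}

func main() {
	// Invalid input is rejected before any connection is attempted.
	_, err := agentTLSConfig([]byte("not pem"), []byte("not pem"), nil)
	fmt.Println("rejected:", err != nil)
}
```

A WebSocket client library would then be handed this config when dialing `wss://<server>:8443/connect`.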

On any disconnect the agent reconnects with exponential backoff. Server-side, a pingInterval of 30 seconds + a heartbeatGrace of 60 seconds mean the host flips to offline ~60s after the agent stops responding.

The wire format is JSON, not binary. Every frame looks like:

{
  "type": "req.containers.list",
  "id": "8d21e3f0…",
  "payload": { /* type-specific JSON */ }
}

  • type — string identifier (e.g. agent.hello, req.containers.list, res.containers.list).
  • id — opaque correlation id; responses echo the matching request’s id so the server’s waiting goroutine can route the reply.
  • payload — JSON value; shape depends on type.
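In Go, an envelope like this maps naturally onto a struct with a deferred-decoding payload. The sketch below is an assumption about shape only: the real definitions live in `internal/agents/protocol.go`, and the names `Frame`, `decodeFrame`, and `replyTo` are hypothetical. The `req.` to `res.` naming rule is inferred from the example types listed above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Frame mirrors the three envelope fields. Payload stays raw so it can be
// decoded later, once the type is known.
type Frame struct {
	Type    string          `json:"type"`
	ID      string          `json:"id"`
	Payload json.RawMessage `json:"payload,omitempty"`
}

// decodeFrame parses one JSON frame and rejects frames without a type.
func decodeFrame(data []byte) (Frame, error) {
	var f Frame
	if err := json.Unmarshal(data, &f); err != nil {
		return Frame{}, err
	}
	if f.Type == "" {
		return Frame{}, fmt.Errorf("frame has no type")
	}
	return f, nil
}

// replyTo builds a response that echoes the request's id.
func replyTo(req Frame, payload any) (Frame, error) {
	if !strings.HasPrefix(req.Type, "req.") {
		return Frame{}, fmt.Errorf("%q is not a request frame", req.Type)
	}
	body, err := json.Marshal(payload)
	if err != nil {
		return Frame{}, err
	}
	return Frame{
		Type:    "res." + strings.TrimPrefix(req.Type, "req."),
		ID:      req.ID,
		Payload: body,
	}, nil
}

func main() {
	req, err := decodeFrame([]byte(`{"type":"req.containers.list","id":"abc123","payload":{}}`))
	if err != nil {
		panic(err)
	}
	res, err := replyTo(req, []string{"web", "db"})
	if err != nil {
		panic(err)
	}
	out, _ := json.Marshal(res)
	fmt.Println(string(out))
}
```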

The choice of JSON-over-WebSocket (rather than gRPC or a binary protocol) is deliberate: easier to inspect with Wireshark or a proxy, simpler to debug from journalctl, no codegen step. The per-message overhead is negligible for the kind of small control-plane traffic dockmesh sends.

A single WebSocket carries many concurrent operations. There is no explicit “stream id” channel — multiplexing happens at the frame level: request/response frames share the connection, and longer-running streams (log tails, exec sessions, stats subscriptions) are layered on by sending frames with the same id until the requester sends a close frame.

When you close a log viewer tab in the UI the server sends a close frame; the agent stops tailing the container’s log.
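One way to picture frame-level multiplexing is a router that keeps a channel per correlation id and delivers each incoming frame to the matching channel. The sketch below is illustrative only: the `router` type, the buffer size of 16, the drop-when-full policy, and the `example.log.line` frame type are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// Frame is a trimmed-down envelope; only the fields needed for routing.
type Frame struct {
	Type string
	ID   string
}

// router delivers frames read from one connection to per-id channels.
type router struct {
	mu      sync.Mutex
	streams map[string]chan Frame
}

func newRouter() *router {
	return &router{streams: make(map[string]chan Frame)}
}

// open registers interest in frames carrying the given id.
func (r *router) open(id string) <-chan Frame {
	ch := make(chan Frame, 16)
	r.mu.Lock()
	r.streams[id] = ch
	r.mu.Unlock()
	return ch
}

// dispatch routes a frame by id. It reports false if nobody is waiting for
// that id or the receiver's buffer is full (the frame is dropped).
func (r *router) dispatch(f Frame) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	ch, ok := r.streams[f.ID]
	if !ok {
		return false
	}
	select {
	case ch <- f:
		return true
	default:
		return false
	}
}

// closeStream ends a stream, as when a close frame is sent for that id.
func (r *router) closeStream(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if ch, ok := r.streams[id]; ok {
		delete(r.streams, id)
		close(ch)
	}
}

func main() {
	r := newRouter()
	logs := r.open("log-1")
	list := r.open("list-1")

	// Frames for different operations interleave on the same connection.
	r.dispatch(Frame{Type: "example.log.line", ID: "log-1"})
	r.dispatch(Frame{Type: "res.containers.list", ID: "list-1"})
	r.dispatch(Frame{Type: "example.log.line", ID: "log-1"})

	fmt.Println("list reply:", (<-list).Type)
	fmt.Println("buffered log frames:", len(logs))

	r.closeStream("log-1")
	fmt.Println("delivered after close:", r.dispatch(Frame{Type: "example.log.line", ID: "log-1"}))
}
```

A one-shot request is the same pattern with a stream that is closed after the first reply.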

dockmesh has ~40 frame types covering lifecycle + every operation. A few representative ones:

| Frame | Direction | Purpose |
| --- | --- | --- |
| agent.hello | agent → server | Initial announcement on connect |
| server.welcome | server → agent | Acknowledgement + initial config |
| agent.heartbeat | agent → server | Sent in response to server.ping |
| server.ping | server → agent | Every 30s; missing replies for 60s flip the host offline |
| req.containers.list | server → agent | Ask the agent for its container list |
| req.containers.start | server → agent | Lifecycle action |
| req.images.list | server → agent | Resource listing |
| req.volume.browse | server → agent | List one directory inside a named volume |
| req.deploy.stack | server → agent | Apply a compose project |
| req.agent.upgrade | server → agent | Hot-swap the agent binary |

The full list lives in internal/agents/protocol.go in the source tree — that file is the single source of truth.

Client certs are issued for 1 year. There is no auto-renewal: the agent does not currently send a renewal request as expiry approaches. Long-running agents need to be re-enrolled manually before their cert expires — rotate the token, run the install one-liner again, the cert is replaced. An agent with an expired cert sees TLS handshake failures and is marked offline; no silent breakage. Auto-renewal is on the roadmap.

When you remove a host from dockmesh, the agent row is deleted (or marked revoked). On the next handshake the server checks the row in the database and refuses the connection. The agent logs the rejection and exits. No formal CRL or OCSP — the check is a single DB lookup per connection attempt, takes effect immediately.

The upgrade controller (see Upgrade Guide) dispatches a req.agent.upgrade frame to drifted agents with the download URL + checksum. The agent downloads the new binary from <server>/install/dockmesh-agent-linux-<arch>, verifies the SHA, writes it atomically, and re-executes. Open operations (log tails, exec) are interrupted but the UI’s WebSocket clients reconnect automatically and resume.

Traditionally, a central management server dials in to its agents. dockmesh flips this, so every connection is opened by the agent, for practical reasons:

  • No inbound firewall holes on agent hosts (NAT, corporate firewalls, home routers all work unchanged)
  • Works behind CGNAT (common in home labs)
  • Works from anywhere the agent has internet access (coffee shop, traveling laptop, edge locations)
  • Agent knows when it has Docker running — no need for server to poll

The tradeoff: the server must be reachable by every agent, so protecting it matters. See Hardening.

dockmesh deliberately picked JSON-over-WebSocket over gRPC:

  • WebSocket survives corporate proxies that strip HTTP/2 — gRPC does not always
  • Easier debugging — journalctl, Wireshark, browser DevTools, anything that reads text speaks the protocol
  • Smaller agent binary (no gRPC runtime)
  • Performance is more than sufficient for the small control-plane messages dockmesh sends; binary serialisation would be over-engineering at this scale

See Troubleshooting → Agent won’t connect.

For deep debugging, enable debug logging:

# On the agent
systemctl set-environment DOCKMESH_LOG_LEVEL=debug
systemctl restart dockmesh-agent
journalctl -u dockmesh-agent -f