josh 9bb4b09a04
Initial commit: full Phases 1-6 implementation
Post-repair hardware validation pipeline for Proxmox cluster hosts.
Go orchestrator + in-image agent + mkosi live image + bundled dnsmasq
PXE + SQLite + HTMX/SSE UI + notify registry + janitor + full docs.
2026-04-17 21:32:10 -04:00

# Architecture
A single Go binary runs the orchestrator. A second Go binary runs
inside a custom Debian live image (built with mkosi) and becomes the
per-run test agent. The two talk over HTTP + SSE.
```
           Operator browser (HTMX + SSE, admin login)
                                │ HTTPS
┌───────────────────────────────────────────────────────────────┐
│ Orchestrator LXC — single Go binary `vetting`                 │
│                                                               │
│ UI (Templ) ─┬─ Agent API ─┬─ SSE hub                          │
│             │             │                                   │
│ Orchestrator core (state machine, dispatcher sem=3,           │
│   stage executors, WoL sender, token issuer)                  │
│             │                                                 │
│       ┌─────┴─────┬──────────┐                                │
│       ▼           ▼          ▼                                │
│     SQLite  flat-file logs  dnsmasq subprocess                │
│                                (DHCP+TFTP+HTTP, MAC allowlist)│
│                                                               │
│ Janitor goroutine (retention-based cleanup)                   │
│ Notifier registry (ntfy/discord/smtp)                         │
└─────────────────────────────────────────┬─────────────────────┘
                                          │ LAN
                                Host under test (×23)
                                PXE → iPXE → Linux live image
                                  └─ vetting-agent (HTTP+SSE back)
```
## Packages
| Package | Purpose |
|---|---|
| `cmd/vetting` | Orchestrator entrypoint. Wires config, stores, runner, dispatcher, iperf supervisor, PXE supervisor, janitor, HTTP router. |
| `cmd/vetting-agent` | In-image agent entrypoint. Reads kernel cmdline params, starts the agent loop. |
| `internal/config` | YAML loader + types. |
| `internal/db` | SQLite open + embedded migrations. Pure Go via modernc.org/sqlite. |
| `internal/model` | Plain structs: `Host`, `Run`, `Stage`, `Measurement`, `SpecDiff`, `Artifact`. |
| `internal/store` | Repository layer; SQL is hand-written. |
| `internal/orchestrator` | State machine, dispatcher, per-run runner, WoL sender, HMAC run tokens, iperf supervisor. |
| `internal/api` | HTTP handlers: `agent_handlers.go` (the agent-facing API) and `ui_handlers.go` (HTMX fragments + SSE). |
| `internal/httpserver` | chi router assembly — lives here to avoid `api ↔ orchestrator` cyclic imports. |
| `internal/web` | Embedded static assets + compiled Templ templates. |
| `internal/auth` | Single-admin bcrypt + signed-cookie sessions. |
| `internal/pxe` | dnsmasq subprocess supervisor + per-MAC iPXE script generator. |
| `internal/events` | In-process SSE hub (fan-out to live browser clients). |
| `internal/logs` | Per-run flat-file writer + SSE fan-out of live log tail. |
| `internal/spec` | Expected-vs-actual diff engine with severity classification. |
| `internal/notify` | Pluggable notifier registry (ntfy, Discord webhook, SMTP). |
| `internal/report` | HTML + JSON report generation (html/template, self-contained). |
| `internal/hold` | Per-run SSH key issuance for `FailedHolding`. |
| `internal/janitor` | Retention-based cleanup of old artifact files + log files. |
| `agent/` | In-image agent: claim loop, stage dispatch, heartbeat, log forwarder, thermal sidecar. |
| `agent/probes` | lshw, dmidecode, smartctl, lspci, hwmon, nvidia-smi wrappers. |
| `agent/tests` | Per-stage test implementations (SMART, CPUStress, Storage, Network, GPU, PSU). |
| `live-image/` | mkosi config + postinst for the Debian live image. |
| `deploy/` | systemd unit + example config + install.sh. |
| `test/e2e/` | Build-tagged (`-tags=e2e`) QEMU + PXE full-stack test. |
## State machine
Per-run state is the single source of truth; the UI is a pure
projection of DB + event stream.
```
Registered → Queued → WaitingWoL → Booting → InventoryCheck
→ SpecValidate → SMART → CPUStress → Storage → Network
→ GPU → PSU → Reporting → Completed
any stage → Failed → FailedHolding → Released
```
Key points:
- **Transitions are table-driven** (`internal/orchestrator/statemachine.go`).
Each `(state, event) → (next, action)` is encoded once.
- **Orchestrator-owned stages resolve inside `/result`:** `SpecValidate`
and `Reporting` flip state forward as part of the preceding stage's
result handler, so the agent never sees them as "its turn".
- **Stage rows persist before SSE fan-out** — the UI can re-derive
state by reading SQLite, and an SSE reconnect mid-run just fetches
fresh tile fragments.
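
A minimal sketch of what a table-driven transition lookup can look like, assuming a plain map keyed on `(state, event)`. The event names and all Go identifiers here are hypothetical, only a few of the states are included, and the per-transition action is left out for brevity; the real table in `statemachine.go` may be shaped differently.

```go
package main

import "fmt"

// State values mirror the diagram above; Event values are invented for
// this sketch and are not taken from the codebase.
type State string
type Event string

const (
	Queued         State = "Queued"
	WaitingWoL     State = "WaitingWoL"
	Booting        State = "Booting"
	InventoryCheck State = "InventoryCheck"
	Failed         State = "Failed"
)

const (
	EvDispatched  Event = "dispatched"
	EvWoLSent     Event = "wol_sent"
	EvHello       Event = "hello"
	EvStageFailed Event = "stage_failed"
)

type key struct {
	from State
	ev   Event
}

// transitions encodes each (state, event) -> next pair exactly once.
var transitions = map[key]State{
	{Queued, EvDispatched}:          WaitingWoL,
	{WaitingWoL, EvWoLSent}:         Booting,
	{Booting, EvHello}:              InventoryCheck,
	{InventoryCheck, EvStageFailed}: Failed,
}

// Next returns the successor state, or an error when the pair is not
// in the table. The current state is returned unchanged on error.
func Next(s State, e Event) (State, error) {
	n, ok := transitions[key{s, e}]
	if !ok {
		return s, fmt.Errorf("no transition from %s on %s", s, e)
	}
	return n, nil
}

func main() {
	s, err := Next(Booting, EvHello)
	fmt.Println(s, err)
	_, err = Next(Queued, EvHello)
	fmt.Println(err)
}
```

Keeping the table as data means an undefined pair is rejected in one place instead of being scattered across handler code.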
## Agent ↔ orchestrator protocol
```
GET /ipxe/{MAC} → per-MAC iPXE script
POST /api/v1/runs/{id}/hello → "I booted; here's my address"
POST /api/v1/runs/{id}/claim → validate token, receive stage list
POST /api/v1/runs/{id}/heartbeat → liveness ping; response carries cmd
POST /api/v1/runs/{id}/log → batch of log lines
POST /api/v1/runs/{id}/sensor → batch of measurements (thermals, throughput)
POST /api/v1/runs/{id}/result → stage result; response says next_state
POST /api/v1/runs/{id}/hold → on FailedHolding, receive authorized_key
```
Auth on every `/api/v1/*` call: the bearer token is stored as a bcrypt
hash in `runs.agent_token_hash` and compared in constant time. The
plaintext is delivered in the kernel cmdline, so nobody off the trusted
bridge can obtain it: the iPXE script is issued per-MAC, and the MAC
must already be in the dnsmasq allowlist.
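
As an illustration of the agent side, the sketch below builds one authenticated call. It assumes the token travels in a standard `Authorization: Bearer` header and that bodies are JSON; neither detail is confirmed above, and `NewAgentRequest` is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// NewAgentRequest builds a POST to /api/v1/runs/{id}/{endpoint} carrying
// the run token as a bearer credential. The header format and the JSON
// content type are assumptions of this sketch.
func NewAgentRequest(base, runID, endpoint, token string, body []byte) (*http.Request, error) {
	url := fmt.Sprintf("%s/api/v1/runs/%s/%s", base, runID, endpoint)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// The base URL and token below are placeholders.
	req, err := NewAgentRequest("http://orchestrator.example", "42", "heartbeat", "token-from-cmdline", []byte(`{}`))
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.Path, req.Header.Get("Authorization"))
}
```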
### Heartbeat control channel
The heartbeat response carries a `cmd` field the agent acts on:
| cmd | When fired | Agent action |
|---|---|---|
| `continue` | Normal case | No-op; keep running current stage |
| `shutdown` | Run reached `Completed` | `systemctl poweroff` |
| `abort` | Run in `FailedHolding` or `Released` | Stop heartbeat loop; let the operator drive |
| `retry_stage` | Operator pressed "Override wipe" | Re-enter the named stage with `override_flags` armed |
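
The table above could translate into agent code roughly as follows. This is a sketch: the JSON field names other than `cmd` and `override_flags` are assumptions, and the `Action` type is invented here to keep the example free of side effects.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HeartbeatResponse is a hypothetical shape for the heartbeat reply.
// Only `cmd` and `override_flags` are named in the text; `stage` is a guess.
type HeartbeatResponse struct {
	Cmd           string   `json:"cmd"`
	Stage         string   `json:"stage,omitempty"`
	OverrideFlags []string `json:"override_flags,omitempty"`
}

// Action says what the agent loop should do next.
type Action int

const (
	KeepRunning   Action = iota // continue: no-op
	PowerOff                    // shutdown: the agent would run `systemctl poweroff`
	StopHeartbeat               // abort: stop the loop, operator takes over
	RerunStage                  // retry_stage: re-enter the named stage
)

// Decide parses a heartbeat reply and maps its cmd onto an Action.
// Unknown commands are reported as errors rather than silently ignored.
func Decide(body []byte) (Action, HeartbeatResponse, error) {
	var r HeartbeatResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return KeepRunning, r, err
	}
	switch r.Cmd {
	case "continue":
		return KeepRunning, r, nil
	case "shutdown":
		return PowerOff, r, nil
	case "abort":
		return StopHeartbeat, r, nil
	case "retry_stage":
		return RerunStage, r, nil
	default:
		return KeepRunning, r, fmt.Errorf("unknown cmd %q", r.Cmd)
	}
}

func main() {
	a, r, err := Decide([]byte(`{"cmd":"retry_stage","stage":"Storage","override_flags":["wipe_probe"]}`))
	fmt.Println(a == RerunStage, r.Stage, r.OverrideFlags, err)
}
```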
## Safety: destructive disk tests
Four layered gates:
1. **MAC allowlist** — dnsmasq only answers DHCP for registered MACs.
2. **Signed run token** — orchestrator issues a per-run HMAC token in
the iPXE kernel cmdline; the agent submits it on `/claim` and the
orchestrator verifies before handing back the stage list.
3. **Wipe probe** — before `badblocks`, the agent scans for filesystem
signatures / LVM metadata / partition tables. Anything found →
`FailedHolding` on `Storage`. The operator explicitly clicks
**Override wipe-probe** to proceed.
4. **Device allowlist** — the agent only targets block devices matching
the inventory's `expected_disks`. USB sticks and surprise disks are
skipped.
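
To make gate 2 concrete, here is one way a per-run HMAC token can be issued and checked, using only the standard library. It is a sketch under stated assumptions: the token is derived from the run ID alone, it is hex-encoded, and it is verified by recomputation. The orchestrator described here stores a hash of the token instead, so treat `IssueToken` and `VerifyToken` as hypothetical names for illustration only.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// IssueToken derives a per-run token from a server-side secret and the
// run ID. Binding to the run ID alone is an assumption of this sketch.
func IssueToken(secret []byte, runID string) string {
	m := hmac.New(sha256.New, secret)
	m.Write([]byte(runID))
	return hex.EncodeToString(m.Sum(nil))
}

// VerifyToken recomputes the expected token and compares it with the
// presented one using hmac.Equal, which does not leak timing information.
func VerifyToken(secret []byte, runID, presented string) bool {
	want := IssueToken(secret, runID)
	return hmac.Equal([]byte(want), []byte(presented))
}

func main() {
	secret := []byte("example-secret") // placeholder, not a real key
	tok := IssueToken(secret, "run-17")
	fmt.Println(VerifyToken(secret, "run-17", tok)) // token matches its own run
	fmt.Println(VerifyToken(secret, "run-18", tok)) // token replayed on another run
}
```

Because the token is bound to the run ID, a token lifted from one run's cmdline does not validate for a different run.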
## Notifications
Fire-and-forget. The orchestrator fires four event kinds:
| Kind | Severity | When |
|---|---|---|
| `StageFailed` | critical | Any stage returns `passed=false` |
| `SpecMismatch` | critical | `SpecValidate` finds critical diffs |
| `HoldingOpened` | critical | Agent POSTs `/hold` (operator can SSH in) |
| `RunCompleted` | info | Pipeline reaches `Completed` |
The config maps event kinds and severities to one or more notifiers
(ntfy, Discord webhook, SMTP). Each notifier gets one attempt per
event with a 10s timeout; delivery failures are logged and nothing is
persisted.
## Why a separate notify package?
A separate package keeps the `/result` and `/hold` handlers
non-blocking. Each dispatch starts a goroutine per target, so a slow
ntfy server doesn't back up an SMTP notifier or delay the HTTP response
to the agent.
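
The dispatch pattern described above can be sketched as below. The `Notifier` interface, `Dispatch`, and the `fake` target are hypothetical names, not the package's actual API; the demo uses a short timeout in place of the 10s mentioned earlier so that it finishes quickly.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Notifier is a hypothetical interface for one delivery target.
type Notifier interface {
	Name() string
	Send(ctx context.Context, msg string) error
}

// Dispatch starts one goroutine per target and returns immediately.
// Each target gets a single attempt bounded by timeout; failures are only
// passed to logf. The WaitGroup is returned for tests and clean shutdown.
func Dispatch(targets []Notifier, msg string, timeout time.Duration, logf func(string, ...interface{})) *sync.WaitGroup {
	wg := &sync.WaitGroup{}
	for _, t := range targets {
		wg.Add(1)
		go func(t Notifier) {
			defer wg.Done()
			ctx, cancel := context.WithTimeout(context.Background(), timeout)
			defer cancel()
			if err := t.Send(ctx, msg); err != nil {
				logf("notify %s: %v", t.Name(), err)
			}
		}(t)
	}
	return wg
}

// fake simulates a target that takes `delay` to answer.
type fake struct {
	name  string
	delay time.Duration
}

func (f fake) Name() string { return f.name }
func (f fake) Send(ctx context.Context, _ string) error {
	select {
	case <-time.After(f.delay):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runDemo dispatches to one slow and one fast target and returns the
// failure lines that were logged.
func runDemo() []string {
	var mu sync.Mutex
	var lines []string
	logf := func(format string, args ...interface{}) {
		mu.Lock()
		defer mu.Unlock()
		lines = append(lines, fmt.Sprintf(format, args...))
	}
	targets := []Notifier{fake{"slow", 2 * time.Second}, fake{"fast", 0}}
	Dispatch(targets, "RunCompleted", 20*time.Millisecond, logf).Wait()
	return lines
}

func main() {
	for _, l := range runDemo() {
		fmt.Println(l)
	}
}
```

In a request handler the caller would not wait on the returned `WaitGroup`; the demo waits only so that the logged failure can be observed.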
## Data retention
The janitor goroutine (`internal/janitor`) runs a sweep every
`janitor.interval_minutes` (default 60) and deletes:
- artifact files older than `artifacts.retention_days`, plus their
`artifacts` table rows
- log files older than `logs.retention_days`
`runs`, `hosts`, `stages`, `measurements`, `spec_diffs` rows are
**never** deleted by the janitor — host histories and aggregate
metrics survive cleanups.
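
A minimal sketch of the file half of such a sweep, assuming retention is judged by file modification time and that files sit directly in one directory. `Sweep` is a hypothetical name; the accompanying deletion of `artifacts` rows and the periodic ticker are omitted.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// Sweep deletes regular files directly under dir whose modification time
// is older than retention, measured against now. It returns the names of
// the files it removed.
func Sweep(dir string, retention time.Duration, now time.Time) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	cutoff := now.Add(-retention)
	var removed []string
	for _, e := range entries {
		if !e.Type().IsRegular() {
			continue // skip directories, symlinks and other special files
		}
		info, err := e.Info()
		if err != nil {
			return removed, err
		}
		if info.ModTime().Before(cutoff) {
			if err := os.Remove(filepath.Join(dir, e.Name())); err != nil {
				return removed, err
			}
			removed = append(removed, e.Name())
		}
	}
	return removed, nil
}

// runDemo creates a throwaway directory holding one stale and one fresh
// file, then sweeps it with a 30-day retention window.
func runDemo() ([]string, error) {
	dir, err := os.MkdirTemp("", "janitor-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	for _, name := range []string{"old.log", "fresh.log"} {
		if err := os.WriteFile(filepath.Join(dir, name), []byte("x"), 0o644); err != nil {
			return nil, err
		}
	}
	now := time.Now()
	stale := now.Add(-40 * 24 * time.Hour)
	if err := os.Chtimes(filepath.Join(dir, "old.log"), stale, stale); err != nil {
		return nil, err
	}
	return Sweep(dir, 30*24*time.Hour, now)
}

func main() {
	removed, err := runDemo()
	fmt.Println(removed, err)
}
```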
## Reproducible builds
The orchestrator and agent are pure Go; `make orchestrator-linux`
cross-compiles to `linux/amd64` from Windows or macOS.
The live image requires Linux-side tooling (mkosi, debootstrap,
squashfs-tools), so `make live-image` fails loudly on Windows and
points to `wsl make live-image` instead. Pinning to snapshot.debian.org
in `live-image/mkosi.conf` keeps the image contents stable over time
for a given git SHA.