docs: comprehensive documentation expansion

Add 4 new doc files (configuration reference, development guide, API reference with full request/response schemas, database schema), expand the README with a feature list and how-it-works walkthrough, fix missing Firmware and Burn stages in architecture.md and test-suite.md, add threshold engine and host-mode agent sections, and add godoc comments to 11 packages and 6 model types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -0,0 +1,193 @@
# Development guide

How to build, test, and contribute to the vetting orchestrator and
agent.

## Prerequisites

| Tool | Version | Notes |
|------|---------|-------|
| Go | 1.22+ | Pure Go; no cgo required. |
| templ | latest | `go install github.com/a-h/templ/cmd/templ@latest` |
| make | any | GNU Make on Linux/macOS/WSL; `make` ships with Git for Windows. |
| mkosi | 25.3+ | Only needed for `make live-image`. Linux/WSL only. |

Windows hosts can build and test everything except `live-image` and
`e2e`. Those targets require a real Linux userspace, so use WSL:
`wsl make live-image`.

## Repository structure

```
cmd/
  vetting/          orchestrator binary: HTTP server, dispatcher, runner
  vetting-agent/    agent binary: dual-mode (live-image + host-mode)
internal/
  config/           YAML loader, ProfileRegistry (quick/deep/soak)
  db/               SQLite open + embedded migrations (pure Go via modernc.org/sqlite)
  model/            Plain structs: Host, Run, Stage, SubStep, Measurement, SpecDiff
  store/            Repository layer: hand-written SQL, no ORM
  orchestrator/     State machine, dispatcher, runner, WoL, HMAC tokens, iperf supervisor
  api/              HTTP handlers: agent_handlers.go + ui_handlers.go
  httpserver/       chi router assembly (exists to break api ↔ orchestrator import cycle)
  web/              Embedded static assets + compiled Templ templates
  pxe/              dnsmasq subprocess supervisor + per-MAC iPXE script generator
  events/           In-process SSE hub (fan-out to browser clients)
  logs/             Per-run flat-file writer + SSE fan-out
  spec/             Expected-vs-actual hardware diff engine
  notify/           Pluggable notifier registry (ntfy, Discord, SMTP)
  report/           HTML + JSON report generation
  hold/             Per-run SSH key issuance for FailedHolding
  janitor/          Retention-based cleanup (artifact + log files)
agent/
  runner.go         In-image agent: claim loop, stage dispatch, heartbeat, log forwarder
  client.go         HTTP client for orchestrator API
  sensor_mux.go     Thermal + performance metric sidecar
  bootstate/        Kernel cmdline parser (run_id, mac, orchestrator_url, token)
  hostmode/         Persistent host-mode reporter (systemd service)
  probes/           Hardware interrogation (lshw, dmidecode, smartctl, etc.)
  tests/            Per-stage test implementations
live-image/         mkosi config + scripts for Debian live image
deploy/             systemd unit, install.sh, pxe-setup.sh, example config
docs/               You are here
test/e2e/           Build-tagged QEMU + PXE full-stack integration test
```

**Key architectural insight:** `internal/httpserver` exists solely to
break the `api ↔ orchestrator` import cycle. The `internal/` tree is
the orchestrator binary's code; the `agent/` tree is the agent
binary's code. They share only `internal/model` (plain structs) and
`internal/spec` (diff engine, used by the agent's inventory probe and
the orchestrator's SpecValidate resolver).

## Building

| Target | Command | Description |
|--------|---------|-------------|
| Everything | `make all` | Build orchestrator + agent for host OS. |
| Orchestrator | `make orchestrator` | Host OS binary (`bin/vetting`). |
| Orchestrator (Linux) | `make orchestrator-linux` | Cross-compile to `bin/vetting-linux-amd64`. |
| Agent | `make agent` | Host OS binary (dev/testing only). |
| Agent (Linux) | `make agent-linux` | Cross-compile to `bin/vetting-agent.linux-amd64`. |
| Templates | `make templ` | Regenerate `.templ` → `.go` files. Run before build if templates changed. |
| Live image | `make live-image` | Build Debian live image via mkosi (Linux/WSL only). |
| Release bundle | `make release` | Slim tarball: binaries + deploy scripts + VERSION pointer. |
| Tidy | `make tidy` | `go mod tidy`. |
| Format | `make fmt` | `go fmt ./...`. |
| Lint | `make vet` | `go vet ./...`. |
| Clean | `make clean` | Remove `bin/`, `build/`, `tmp/`, `out/`, `dist/`. |

Build flags: the git SHA is baked into the binary via
`-ldflags -X vetting/internal/version.GitSHA=<sha>`.

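For readers unfamiliar with the linker's `-X` flag, the following is a minimal sketch of how a link-time string variable behaves. Only the symbol path `vetting/internal/version.GitSHA` comes from the build flags described in the Building section; the `"dev"` default and the `Describe` helper are assumptions for illustration, and the sketch uses `package main` so that it runs standalone.

```go
package main

import "fmt"

// GitSHA stands in for vetting/internal/version.GitSHA. The linker can
// overwrite a package-level string variable via -ldflags "-X path.Name=value".
// The "dev" default is an assumption, not taken from the repository.
var GitSHA = "dev"

// Describe is a hypothetical helper that formats the build identity.
func Describe(name string) string {
	return fmt.Sprintf("%s (git %s)", name, GitSHA)
}

func main() {
	fmt.Println(Describe("vetting"))
}
```

Built with `go build -ldflags "-X main.GitSHA=abc1234"`, the same program would report `abc1234` in place of `dev`. `-X` only applies to string variables, which is why the sketch declares `GitSHA` as a plain `string`.
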
## Running locally

```bash
make run
# → builds orchestrator, launches with deploy/vetting.example.yaml
# → http://localhost:8080
```

The example config binds to `127.0.0.1:8080`, disables PXE, and uses
`./var/` relative paths for the database, artifacts, and logs. Edit
`deploy/vetting.example.yaml` to tune for your dev environment.

For a QEMU walkthrough (register a host, PXE-boot a VM, watch the
pipeline), see [operations.md § First vetting run](operations.md#first-vetting-run).

## Testing

| Command | What it does |
|---------|--------------|
| `make test` | Unit + smoke tests across all packages. Cross-platform. |
| `make test-race` | Same tests with Go's race detector (`-race -count=1`). |
| `make vet` | `go vet ./...`; catches common mistakes. |
| `make e2e` | QEMU + PXE full-stack integration test. Requires Linux root, a built live image, and a running orchestrator with a registered host and queued run. |

**Test design:**

- Tests use real SQLite (in-memory or temp file); the database is
  never mocked.
- The `agent/tests/fakes/` directory contains mock binaries
  (`dmidecode`, `stress-ng`, etc.) used by agent probe tests.
- E2E tests are build-tagged with `-tags=e2e` and live in
  `test/e2e/qemu_test.go`.

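The mock-binary approach from the test design notes can be illustrated with a small sketch: put a fake `dmidecode` first on `PATH`, then run the probe as usual. This is only an assumption about how such fakes might be wired up; the helper names (`installFake`, `productName`, `probeWithFake`) are hypothetical, the fake here is a generated shell script rather than whatever `agent/tests/fakes/` actually contains, and it assumes a POSIX host with `/bin/sh`.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// installFake writes a tiny shell script that stands in for a real tool.
// Keep output to plain text (no $, backticks, or backslashes).
func installFake(dir, name, output string) error {
	script := fmt.Sprintf("#!/bin/sh\necho %q\n", output)
	return os.WriteFile(filepath.Join(dir, name), []byte(script), 0o755)
}

// productName is a hypothetical probe: it shells out to dmidecode and
// trims the result.
func productName() (string, error) {
	out, err := exec.Command("dmidecode", "-s", "system-product-name").Output()
	if err != nil {
		return "", fmt.Errorf("probe: dmidecode: %w", err)
	}
	return strings.TrimSpace(string(out)), nil
}

// probeWithFake runs the probe with a temp directory prepended to PATH so
// the fake shadows any real dmidecode, then restores PATH.
func probeWithFake(output string) (string, error) {
	dir, err := os.MkdirTemp("", "fakes")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	if err := installFake(dir, "dmidecode", output); err != nil {
		return "", err
	}
	oldPath := os.Getenv("PATH")
	defer os.Setenv("PATH", oldPath)
	if err := os.Setenv("PATH", dir+string(os.PathListSeparator)+oldPath); err != nil {
		return "", err
	}
	return productName()
}

func main() {
	name, err := probeWithFake("FakeBoard-9000")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```

Because the probe code itself is unchanged, this style of test exercises the real command construction and output parsing without needing root or real hardware.
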
## Adding a new test stage

1. Add a `State<Name>` constant to `internal/model/model.go`.
2. Wire it into `internal/orchestrator/statemachine.go`: both the
   forward transition table and the stage-for-state lookup.
3. Add the stage name to `DefaultStages()` in
   `internal/config/profiles.go`.
4. Add a `case "<Name>":` to the `runStage` switch in
   `agent/runner.go`.
5. Drop the implementation into `agent/tests/<name>.go`.
6. If the stage is **orchestrator-owned** (like SpecValidate or
   Reporting), add a `resolve<Name>` helper to
   `internal/api/agent_handlers.go` and call it from `resultAdvance`.
7. Add the stage to `vetting.stages` in
   `deploy/vetting.example.yaml`.

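The steps above can be sketched in one self-contained file. Everything here is a stand-in: the `State` type, the `forward` and `stageFor` maps, the `FanCheck` stage, and the stage ordering are hypothetical and do not reflect the real contents of `model.go`, `statemachine.go`, or `runner.go`. The point is only to show which pieces have to agree when a stage is added.

```go
package main

import "fmt"

// State is a hypothetical stand-in for the state type in internal/model.
type State string

const (
	StateBurn      State = "Burn"
	StateFanCheck  State = "FanCheck" // step 1: the new (invented) stage
	StateReporting State = "Reporting"
)

// forward plays the role of the forward transition table (step 2).
// The ordering is invented for this sketch.
var forward = map[State]State{
	StateBurn:     StateFanCheck,
	StateFanCheck: StateReporting,
}

// stageFor plays the role of the stage-for-state lookup (step 2).
// Reporting is left out because it is orchestrator-owned (step 6).
var stageFor = map[State]string{
	StateBurn:     "Burn",
	StateFanCheck: "FanCheck",
}

// runFanCheck is the placeholder implementation (step 5).
func runFanCheck() error { return nil }

// runStage mirrors the shape of the agent's dispatch switch (step 4).
func runStage(name string) error {
	switch name {
	case "Burn":
		return nil
	case "FanCheck":
		return runFanCheck()
	default:
		return fmt.Errorf("runStage: unknown stage %q", name)
	}
}

// next returns the state that follows s, if any.
func next(s State) (State, bool) {
	n, ok := forward[s]
	return n, ok
}

func main() {
	s := StateBurn
	for {
		if name, ok := stageFor[s]; ok {
			if err := runStage(name); err != nil {
				fmt.Println("error:", err)
				return
			}
			fmt.Println("ran", name)
		}
		n, ok := next(s)
		if !ok {
			break
		}
		s = n
	}
}
```

If the constant, the transition entry, and the `runStage` case drift apart, the failure mode in this sketch is the `unknown stage` error, which is a useful thing to assert on in a unit test.
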
See [test-suite.md](test-suite.md) for what each existing stage
measures and its pass/fail criteria.

## Adding a new notifier

1. Implement the `notify.Notifier` interface (single `Send` method)
   in a new file under `internal/notify/`.
2. Register the new type in the notifier builder (the switch in
   `internal/notify/build.go` or equivalent factory).
3. Add the type-specific config fields to the `Notifier` struct in
   `internal/config/config.go`.
4. Document the new notifier type in
   [configuration.md § notifiers](configuration.md#notifiers).

## Code conventions

- **No cgo**: the SQLite driver is `modernc.org/sqlite` (pure Go).
  Builds cross-compile to Linux from Windows/macOS without a C
  toolchain.
- **Hand-written SQL**: no ORM. Queries are explicit and testable.
  Each store method is a single SQL statement or a short transaction.
- **Templ for UI**: `.templ` files compile to type-safe Go functions.
  The report module uses `html/template` instead (self-contained HTML
  with inlined CSS).
- **chi for routing**: `github.com/go-chi/chi/v5`. Standard
  middleware stack: `RealIP`, `Recoverer`, `Logger`.
- **Error handling**: fail-soft in SSE/tile paths (log and skip),
  fail-hard in store/migration paths (return error up).
- **Log convention**: `log.Printf` with a context prefix
  (e.g. `"claim: seed stages run %d: %v"`).

## CI/CD

Three Gitea Actions workflows in `.gitea/workflows/`:

| Workflow | Trigger | What it does |
|----------|---------|--------------|
| `ci.yml` | Push to main + PRs | Templ generate, tidy check, vet, build (native + linux), test with race detector + coverage. |
| `release.yml` | Push to main (skips doc/test paths) | Detects `live-image/VERSION` changes → builds + publishes live image to registry. Always builds slim bundle → publishes to `vetting/latest/`. |
| `e2e.yml` | Manual dispatch | Builds live image + orchestrator, installs QEMU + deps, runs `make e2e`. |

**Release bundle structure:**

```
vetting-bundle/
  bin/
    vetting-linux-amd64
    vetting-agent.linux-amd64
  live-image/
    VERSION              # pointer; actual vmlinuz/initrd.img fetched on install
  install.sh
  pxe-setup.sh
  vetting.service
  vetting.production.yaml
  ipxe-shas.txt
  VERSION                # git SHA
```

The ~30 MB bundle is published on every push to main. The ~300 MB live
image (`vmlinuz` + `initrd.img`) is published separately under
`live-image/<version>/` and only rebuilds when `live-image/VERSION`
changes.