# Development guide

How to build, test, and contribute to the vetting orchestrator and agent.

## Prerequisites

| Tool | Version | Notes |
|------|---------|-------|
| Go | 1.22+ | Pure Go; no cgo required. |
| templ | latest | `go install github.com/a-h/templ/cmd/templ@latest` |
| make | any | GNU Make on Linux/macOS/WSL; `make` ships with Git for Windows. |
| mkosi | 25.3+ | Only needed for `make live-image`. Linux/WSL only. |

Windows hosts can build and test everything except `live-image` and `e2e`. Those targets require a real Linux userspace; use WSL: `wsl make live-image`.

## Repository structure

```
cmd/
  vetting/          orchestrator binary: HTTP server, dispatcher, runner
  vetting-agent/    agent binary: dual-mode (live-image + host-mode)
internal/
  config/           YAML loader, ProfileRegistry (quick/deep/soak)
  db/               SQLite open + embedded migrations (pure Go via modernc.org/sqlite)
  model/            Plain structs: Host, Run, Stage, SubStep, Measurement, SpecDiff
  store/            Repository layer: hand-written SQL, no ORM
  orchestrator/     State machine, dispatcher, runner, WoL, HMAC tokens, iperf supervisor
  api/              HTTP handlers: agent_handlers.go + ui_handlers.go
  httpserver/       chi router assembly (exists to break api ↔ orchestrator import cycle)
  web/              Embedded static assets + compiled Templ templates
  pxe/              dnsmasq subprocess supervisor + per-MAC iPXE script generator
  events/           In-process SSE hub (fan-out to browser clients)
  logs/             Per-run flat-file writer + SSE fan-out
  spec/             Expected-vs-actual hardware diff engine
  notify/           Pluggable notifier registry (ntfy, Discord, SMTP)
  report/           HTML + JSON report generation
  hold/             Per-run SSH key issuance for FailedHolding
  janitor/          Retention-based cleanup (artifact + log files)
agent/
  runner.go         In-image agent: claim loop, stage dispatch, heartbeat, log forwarder
  client.go         HTTP client for orchestrator API
  sensor_mux.go     Thermal + performance metric sidecar
  bootstate/        Kernel cmdline parser (run_id, mac, orchestrator_url, token)
  hostmode/         Persistent host-mode reporter (systemd service)
  probes/           Hardware interrogation (lshw, dmidecode, smartctl, etc.)
  tests/            Per-stage test implementations
live-image/         mkosi config + scripts for Debian live image
deploy/             systemd unit, install.sh, pxe-setup.sh, example config
docs/               You are here
test/e2e/           Build-tagged QEMU + PXE full-stack integration test
```

**Key architectural insight:** `internal/httpserver` exists solely to break the `api ↔ orchestrator` import cycle. The `internal/` tree is the orchestrator binary's code; the `agent/` tree is the agent binary's code. They share only `internal/model` (plain structs) and `internal/spec` (diff engine, used by the agent's inventory probe and the orchestrator's SpecValidate resolver).

## Building

| Target | Command | Description |
|--------|---------|-------------|
| Everything | `make all` | Build orchestrator + agent for host OS. |
| Orchestrator | `make orchestrator` | Host OS binary (`bin/vetting`). |
| Orchestrator (Linux) | `make orchestrator-linux` | Cross-compile to `bin/vetting-linux-amd64`. |
| Agent | `make agent` | Host OS binary (dev/testing only). |
| Agent (Linux) | `make agent-linux` | Cross-compile to `bin/vetting-agent.linux-amd64`. |
| Templates | `make templ` | Regenerate `.templ` → `.go` files. Run before build if templates changed. |
| Live image | `make live-image` | Build Debian live image via mkosi (Linux/WSL only). |
| Release bundle | `make release` | Slim tarball: binaries + deploy scripts + VERSION pointer. |
| Tidy | `make tidy` | `go mod tidy`. |
| Format | `make fmt` | `go fmt ./...`. |
| Lint | `make vet` | `go vet ./...`. |
| Clean | `make clean` | Remove `bin/`, `build/`, `tmp/`, `out/`, `dist/`. |

Build flags: the git SHA is baked into the binary via `-ldflags -X vetting/internal/version.GitSHA=`.

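
The link-time injection mentioned under build flags can be illustrated with a minimal, standalone sketch. It is not the project's `internal/version` package: the `"dev"` fallback and the `versionString` helper are assumptions made for illustration, and the variable lives in package `main` here so the example runs on its own.

```go
package main

import "fmt"

// GitSHA is meant to be overwritten by the linker via -ldflags "-X ...".
// The "dev" default is an assumption for builds that do not pass the flag.
// The -X flag only works on package-level string variables.
var GitSHA = "dev"

// versionString is a hypothetical helper that formats the build identifier,
// for example for a startup log line.
func versionString(sha string) string {
	if sha == "" {
		return "vetting (unknown build)"
	}
	return "vetting " + sha
}

func main() {
	fmt.Println(versionString(GitSHA))
}
```

Building this sketch with `go build -ldflags "-X main.GitSHA=$(git rev-parse --short HEAD)"` replaces the default at link time. In the real tree the `-X` argument names the variable by its full import path, as shown in the build flags line above.
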
## Running locally

```bash
make run
# → builds orchestrator, launches with deploy/vetting.example.yaml
# → http://localhost:8080
```

The example config binds to `127.0.0.1:8080`, disables PXE, and uses `./var/` relative paths for the database, artifacts, and logs. Edit `deploy/vetting.example.yaml` to tune for your dev environment.

For a QEMU walkthrough (register a host, PXE-boot a VM, watch the pipeline), see [operations.md § First vetting run](operations.md#first-vetting-run).

## Testing

| Command | What it does |
|---------|--------------|
| `make test` | Unit + smoke tests across all packages. Cross-platform. |
| `make test-race` | Same tests with Go's race detector (`-race -count=1`). |
| `make vet` | `go vet ./...`; catches common mistakes. |
| `make e2e` | QEMU + PXE full-stack integration test. Requires Linux root, a built live image, and a running orchestrator with a registered host and queued run. |

**Test design:**

- Tests use real SQLite (in-memory or temp file); the database is never mocked.
- The `agent/tests/fakes/` directory contains mock binaries (`dmidecode`, `stress-ng`, etc.) used by agent probe tests.
- E2E tests are build-tagged with `-tags=e2e` and live in `test/e2e/qemu_test.go`.

## Adding a new test stage

1. Add a `State` constant to `internal/model/model.go`.
2. Wire it into `internal/orchestrator/statemachine.go`: both the forward transition table and the stage-for-state lookup.
3. Add the stage name to `DefaultStages()` in `internal/config/profiles.go`.
4. Add a `case "":` to the `runStage` switch in `agent/runner.go`.
5. Drop the implementation into `agent/tests/.go`.
6. If the stage is **orchestrator-owned** (like SpecValidate or Reporting), add a `resolve` helper to `internal/api/agent_handlers.go` and call it from `resultAdvance`.
7. Add the stage to `vetting.stages` in `deploy/vetting.example.yaml`.

See [test-suite.md](test-suite.md) for what each existing stage measures and its pass/fail criteria.

## Adding a new notifier

1. Implement the `notify.Notifier` interface (single `Send` method) in a new file under `internal/notify/`.
2. Register the new type in the notifier builder (the switch in `internal/notify/build.go` or equivalent factory).
3. Add the type-specific config fields to the `Notifier` struct in `internal/config/config.go`.
4. Document the new notifier type in [configuration.md § notifiers](configuration.md#notifiers).

## Code conventions

- **No cgo**: the SQLite driver is `modernc.org/sqlite` (pure Go). Builds cross-compile to Linux from Windows/macOS without a C toolchain.
- **Hand-written SQL**: no ORM. Queries are explicit and testable. Each store method is a single SQL statement or a short transaction.
- **Templ for UI**: `.templ` files compile to type-safe Go functions. The report module uses `html/template` instead (self-contained HTML with inlined CSS).
- **chi for routing**: `github.com/go-chi/chi/v5`. Standard middleware stack: `RealIP`, `Recoverer`, `Logger`.
- **Error handling**: fail-soft in SSE/tile paths (log and skip), fail-hard in store/migration paths (return error up).
- **Log convention**: `log.Printf` with a context prefix (e.g. `"claim: seed stages run %d: %v"`).

## CI/CD

Three Gitea Actions workflows in `.gitea/workflows/`:

| Workflow | Trigger | What it does |
|----------|---------|--------------|
| `ci.yml` | Push to main + PRs | Templ generate, tidy check, vet, build (native + linux), test with race detector + coverage. |
| `release.yml` | Push to main (skips doc/test paths) | Detects `live-image/VERSION` changes → builds + publishes live image to registry. Always builds slim bundle → publishes to `vetting/latest/`. |
| `e2e.yml` | Manual dispatch | Builds live image + orchestrator, installs QEMU + deps, runs `make e2e`. |

**Release bundle structure:**

```
vetting-bundle/
  bin/
    vetting-linux-amd64
    vetting-agent.linux-amd64
  live-image/
    VERSION                  # pointer: actual vmlinuz/initrd.img fetched on install
  install.sh
  pxe-setup.sh
  vetting.service
  vetting.production.yaml
  ipxe-shas.txt
  VERSION                    # git SHA
```

The ~30 MB bundle is published on every push to main. The ~300 MB live image (`vmlinuz` + `initrd.img`) is published separately under `live-image//` and only rebuilds when `live-image/VERSION` changes.