Vetting/docs/development.md
josh 8367ec2a9f
docs: comprehensive documentation expansion
Add 4 new doc files (configuration reference, development guide, API
reference with full request/response schemas, database schema), expand
the README with a feature list and how-it-works walkthrough, fix
missing Firmware and Burn stages in architecture.md and test-suite.md,
add threshold engine and host-mode agent sections, and add godoc
comments to 11 packages and 6 model types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-23 18:37:26 -04:00


Development guide

How to build, test, and contribute to the vetting orchestrator and agent.

Prerequisites

| Tool | Version | Notes |
| --- | --- | --- |
| Go | 1.22+ | Pure Go; no cgo required. |
| templ | latest | go install github.com/a-h/templ/cmd/templ@latest |
| make | any | GNU Make on Linux/macOS/WSL; make ships with Git for Windows. |
| mkosi | 25.3+ | Only needed for make live-image. Linux/WSL only. |

Windows hosts can build and test everything except live-image and e2e. Those targets require a real Linux userspace — use WSL: wsl make live-image.

Repository structure

cmd/
  vetting/              orchestrator binary — HTTP server, dispatcher, runner
  vetting-agent/        agent binary — dual-mode (live-image + host-mode)
internal/
  config/               YAML loader, ProfileRegistry (quick/deep/soak)
  db/                   SQLite open + embedded migrations (pure Go via modernc.org/sqlite)
  model/                Plain structs: Host, Run, Stage, SubStep, Measurement, SpecDiff
  store/                Repository layer — hand-written SQL, no ORM
  orchestrator/         State machine, dispatcher, runner, WoL, HMAC tokens, iperf supervisor
  api/                  HTTP handlers — agent_handlers.go + ui_handlers.go
  httpserver/           chi router assembly (exists to break api ↔ orchestrator import cycle)
  web/                  Embedded static assets + compiled Templ templates
  pxe/                  dnsmasq subprocess supervisor + per-MAC iPXE script generator
  events/               In-process SSE hub (fan-out to browser clients)
  logs/                 Per-run flat-file writer + SSE fan-out
  spec/                 Expected-vs-actual hardware diff engine
  notify/               Pluggable notifier registry (ntfy, Discord, SMTP)
  report/               HTML + JSON report generation
  hold/                 Per-run SSH key issuance for FailedHolding
  janitor/              Retention-based cleanup (artifact + log files)
agent/
  runner.go             In-image agent: claim loop, stage dispatch, heartbeat, log forwarder
  client.go             HTTP client for orchestrator API
  sensor_mux.go         Thermal + performance metric sidecar
  bootstate/            Kernel cmdline parser (run_id, mac, orchestrator_url, token)
  hostmode/             Persistent host-mode reporter (systemd service)
  probes/               Hardware interrogation (lshw, dmidecode, smartctl, etc.)
  tests/                Per-stage test implementations
live-image/             mkosi config + scripts for Debian live image
deploy/                 systemd unit, install.sh, pxe-setup.sh, example config
docs/                   You are here
test/e2e/               Build-tagged QEMU + PXE full-stack integration test

Key architectural insight: internal/httpserver exists solely to break the api ↔ orchestrator import cycle. The internal/ tree is the orchestrator binary's code; the agent/ tree is the agent binary's code. They share only internal/model (plain structs) and internal/spec (diff engine, used by the agent's inventory probe and the orchestrator's SpecValidate resolver).
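
To make one entry in the tree above concrete: agent/bootstate is described as a kernel cmdline parser for run_id, mac, orchestrator_url, and token. The following is a minimal sketch of such a parser, not the repository's code. The BootState struct, the parseCmdline function, and the assumption that the keys appear bare (rather than under some prefix) are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// BootState holds the four values the tree lists for agent/bootstate.
// The struct and its field names are hypothetical.
type BootState struct {
	RunID           string
	MAC             string
	OrchestratorURL string
	Token           string
}

// parseCmdline scans a kernel command line for key=value words.
// Assumption: keys are bare (run_id=..., mac=...); the real image may
// namespace them differently.
func parseCmdline(cmdline string) (BootState, error) {
	var bs BootState
	for _, word := range strings.Fields(cmdline) {
		key, val, ok := strings.Cut(word, "=")
		if !ok {
			continue // bare flags such as "quiet" carry no value
		}
		switch key {
		case "run_id":
			bs.RunID = val
		case "mac":
			bs.MAC = val
		case "orchestrator_url":
			bs.OrchestratorURL = val
		case "token":
			bs.Token = val
		}
	}
	if bs.RunID == "" || bs.OrchestratorURL == "" {
		return bs, fmt.Errorf("bootstate: cmdline missing run_id or orchestrator_url")
	}
	return bs, nil
}

func main() {
	bs, err := parseCmdline("quiet run_id=42 mac=52:54:00:12:34:56 orchestrator_url=http://10.0.0.1:8080 token=abc")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", bs)
}
```

On a booted machine the input would typically come from /proc/cmdline; the sketch takes a string so it can be exercised without booting anything.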

Building

| Target | Command | Description |
| --- | --- | --- |
| Everything | make all | Build orchestrator + agent for host OS. |
| Orchestrator | make orchestrator | Host OS binary (bin/vetting). |
| Orchestrator (Linux) | make orchestrator-linux | Cross-compile to bin/vetting-linux-amd64. |
| Agent | make agent | Host OS binary (dev/testing only). |
| Agent (Linux) | make agent-linux | Cross-compile to bin/vetting-agent.linux-amd64. |
| Templates | make templ | Regenerate the _templ.go files. Run before build if templates changed. |
| Live image | make live-image | Build Debian live image via mkosi (Linux/WSL only). |
| Release bundle | make release | Slim tarball: binaries + deploy scripts + VERSION pointer. |
| Tidy | make tidy | Runs go mod tidy |
| Format | make fmt | Runs go fmt ./... |
| Lint | make vet | Runs go vet ./... |
| Clean | make clean | Remove bin/, build/, tmp/, out/, dist/. |

Build flags: the git SHA is baked into the binary at link time via -ldflags "-X vetting/internal/version.GitSHA=<sha>".
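
For readers unfamiliar with the linker's -X flag, here is a minimal single-file sketch of how a string variable can be overridden at build time. Only the GitSHA name comes from this guide; the "dev" default and the versionString helper are assumptions, and the variable is declared in package main here purely so the example compiles on its own.

```go
package main

import "fmt"

// GitSHA mirrors the variable the build overrides with -X.
// In the repository it lives in vetting/internal/version; the "dev"
// default used here is an assumption.
var GitSHA = "dev"

// versionString is a hypothetical helper for a startup log line
// or a version flag.
func versionString() string {
	return fmt.Sprintf("vetting %s", GitSHA)
}

func main() {
	fmt.Println(versionString())
}
```

Building this sketch with go build -ldflags "-X main.GitSHA=$(git rev-parse --short HEAD)" replaces the default. The -X flag only works on package-level string variables, which is why the SHA is a var rather than a const.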

Running locally

make run
# → builds orchestrator, launches with deploy/vetting.example.yaml
# → http://localhost:8080

The example config binds to 127.0.0.1:8080, disables PXE, and uses ./var/ relative paths for the database, artifacts, and logs. Edit deploy/vetting.example.yaml to tune for your dev environment.

For a QEMU walkthrough (register a host, PXE-boot a VM, watch the pipeline), see operations.md § First vetting run.

Testing

| Command | What it does |
| --- | --- |
| make test | Unit + smoke tests across all packages. Cross-platform. |
| make test-race | Same tests with Go's race detector (-race -count=1). |
| make vet | go vet ./... (catches common mistakes). |
| make e2e | QEMU + PXE full-stack integration test. Requires Linux root, a built live image, and a running orchestrator with a registered host and queued run. |

Test design:

  • Tests use real SQLite (in-memory or temp file) — no mocking the database.
  • The agent/tests/fakes/ directory contains mock binaries (dmidecode, stress-ng, etc.) used by agent probe tests.
  • E2E tests are build-tagged with -tags=e2e and live in test/e2e/qemu_test.go.

Adding a new test stage

  1. Add a State<Name> constant to internal/model/model.go.
  2. Wire it into internal/orchestrator/statemachine.go — both the forward transition table and the stage-for-state lookup.
  3. Add the stage name to DefaultStages() in internal/config/profiles.go.
  4. Add a case "<Name>": to the runStage switch in agent/runner.go.
  5. Drop the implementation into agent/tests/<name>.go.
  6. If the stage is orchestrator-owned (like SpecValidate or Reporting), add a resolve<Name> helper to internal/api/agent_handlers.go and call it from resultAdvance.
  7. Add the stage to vetting.stages in deploy/vetting.example.yaml.
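
The Go-side steps above (1, 2, 4, and 5) can be sketched in a single file. Everything here is illustrative: the State type, the map-based tables, the FanCheck stage, and the ordering between Firmware and Burn are assumptions made for the example, and the real statemachine.go may be shaped differently.

```go
package main

import "fmt"

// State stands in for the run-state type in internal/model (name assumed).
type State string

const (
	StateFirmware State = "Firmware"
	StateFanCheck State = "FanCheck" // step 1: the new State<Name> constant (hypothetical stage)
	StateBurn     State = "Burn"
)

// forward sketches the forward transition table (step 2).
// The ordering shown is arbitrary.
var forward = map[State]State{
	StateFirmware: StateFanCheck,
	StateFanCheck: StateBurn,
}

// stageForState sketches the stage-for-state lookup (step 2).
var stageForState = map[State]string{
	StateFirmware: "Firmware",
	StateFanCheck: "FanCheck",
	StateBurn:     "Burn",
}

// runFanCheck is where the agent/tests/<name>.go implementation
// would live (step 5). It is a stub here.
func runFanCheck() error { return nil }

// runStage mirrors the agent-side switch (step 4).
func runStage(name string) error {
	switch name {
	case "FanCheck":
		return runFanCheck()
	default:
		return fmt.Errorf("runStage: unknown stage %q", name)
	}
}

func main() {
	next := forward[StateFirmware]
	fmt.Println(next, runStage(stageForState[next]))
}
```

A stage missing from either table or from the switch fails at a different point (no transition, no stage name, or an unknown-stage error), which is why the checklist touches all three.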

See test-suite.md for what each existing stage measures and its pass/fail criteria.

Adding a new notifier

  1. Implement the notify.Notifier interface (single Send method) in a new file under internal/notify/.
  2. Register the new type in the notifier builder (the switch in internal/notify/build.go or equivalent factory).
  3. Add the type-specific config fields to the Notifier struct in internal/config/config.go.
  4. Document the new notifier type in configuration.md § notifiers.

Code conventions

  • No cgo: the SQLite driver is modernc.org/sqlite (pure Go). Builds cross-compile to Linux from Windows/macOS without a C toolchain.
  • Hand-written SQL: no ORM. Queries are explicit and testable. Each store method is a single SQL statement or a short transaction.
  • Templ for UI: .templ files compile to type-safe Go functions. The report module uses html/template instead (self-contained HTML with inlined CSS).
  • chi for routing: github.com/go-chi/chi/v5. Standard middleware stack: RealIP, Recoverer, Logger.
  • Error handling: fail-soft in SSE/tile paths (log and skip), fail-hard in store/migration paths (return error up).
  • Log convention: log.Printf with a context prefix (e.g. "claim: seed stages run %d: %v").
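
The error-handling and logging conventions can be illustrated with a small sketch. The function names (renderTile, renderTiles, saveRun) are hypothetical and do not come from the codebase; only the fail-soft/fail-hard split and the context-prefix log style are taken from the list above.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// renderTile simulates a UI tile that may fail to render.
func renderTile(runID int) (string, error) {
	if runID < 0 {
		return "", errors.New("no such run")
	}
	return fmt.Sprintf("<tile run=%d>", runID), nil
}

// renderTiles is fail-soft: a bad tile is logged with a context
// prefix and skipped, so the rest of the page still renders.
func renderTiles(runIDs []int) []string {
	var tiles []string
	for _, id := range runIDs {
		t, err := renderTile(id)
		if err != nil {
			log.Printf("tiles: render run %d: %v", id, err)
			continue
		}
		tiles = append(tiles, t)
	}
	return tiles
}

// saveRun is fail-hard: the error is wrapped with context and
// returned so the caller must deal with it.
func saveRun(runID int) error {
	if runID < 0 {
		return fmt.Errorf("store: save run %d: %w", runID, errors.New("invalid id"))
	}
	return nil
}

func main() {
	fmt.Println(renderTiles([]int{1, -1, 2}))
	fmt.Println(saveRun(-1))
}
```

The split reflects what each path can afford: a missing tile is cosmetic, while a swallowed store error could leave a run in an inconsistent state.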

CI/CD

Three Gitea Actions workflows in .gitea/workflows/:

| Workflow | Trigger | What it does |
| --- | --- | --- |
| ci.yml | Push to main + PRs | Templ generate, tidy check, vet, build (native + linux), test with race detector + coverage. |
| release.yml | Push to main (skips doc/test paths) | Detects live-image/VERSION changes → builds + publishes live image to registry. Always builds slim bundle → publishes to vetting/latest/. |
| e2e.yml | Manual dispatch | Builds live image + orchestrator, installs QEMU + deps, runs make e2e. |

Release bundle structure:

vetting-bundle/
  bin/
    vetting-linux-amd64
    vetting-agent.linux-amd64
  live-image/
    VERSION                    # pointer — actual vmlinuz/initrd.img fetched on install
  install.sh
  pxe-setup.sh
  vetting.service
  vetting.production.yaml
  ipxe-shas.txt
  VERSION                      # git SHA

The ~30 MB bundle is published on every push to main. The ~300 MB live image (vmlinuz + initrd.img) is published separately under live-image/<version>/ and only rebuilds when live-image/VERSION changes.