josh 42da48864f
Remove operator auth — trust the LAN
Can't log in from a fresh LXC deploy, and the service is LAN-only by
design. Rip out the whole bcrypt-password / signed-cookie session
layer: internal/auth, login templates, gen-admin-password binary +
Makefile targets, auth config block, login/logout routes and the
RequireSession middleware wrap. Agent bearer-token auth on
/api/v1/runs/{id}/* is untouched.

Operators who want a password can front the service with a reverse
proxy — noted in README and docs/operations.md.
2026-04-17 22:31:49 -04:00

# Vetting
Post-repair hardware validation pipeline for Proxmox cluster hosts.
Register a host, click **Start Vetting**, and the orchestrator will
PXE-boot it into a custom Linux live image and run it through a
consistent battery of tests (CPU stress, RAM stress, SMART, disk I/O,
network throughput, GPU, PSU telemetry). Pass → auto-shutdown + HTML
report. Fail → pipeline halts, SSH drops in, notification fires.

Built for solo-operator home labs: one Go binary, SQLite + flat files,
HTMX + SSE UI, bundled dnsmasq, optional ntfy / Discord / SMTP
notifications.
## Documentation
- [docs/operations.md](docs/operations.md) — install + first run +
troubleshooting
- [docs/architecture.md](docs/architecture.md) — packages, state
machine, protocol
- [docs/test-suite.md](docs/test-suite.md) — what each stage measures
## Quick start (local, against QEMU)
```bash
make all
./bin/vetting --config deploy/vetting.example.yaml
# → http://localhost:8080
```
The UI has no built-in auth — bind to loopback or LAN only, or front
the service with a reverse proxy (Caddy/nginx basic-auth) if you
want a password. The agent↔orchestrator channel keeps its own
bearer-token auth and is unaffected.

For a full end-to-end QEMU walk-through (bridge setup, host registration,
PXE boot), see [docs/operations.md § First vetting run](docs/operations.md#first-vetting-run).
## Production install (Proxmox LXC)
On a fresh Debian/Ubuntu LXC, as root:
```bash
curl -fsSL https://gitea.thewrightserver.net/josh/Vetting/raw/branch/main/deploy/proxmox-install.sh | bash
```
That installs Go (if missing), clones the repo to `/opt/vetting-src`,
builds `vetting-linux-amd64`, and hands off to `deploy/install.sh`
which lays down the binary, systemd unit, example config, and
`vetting` service user. Then:
```bash
# Edit /etc/vetting/vetting.yaml (server.bind + server.public_url)
sudo systemctl enable --now vetting
journalctl -fu vetting
```
Prefer to build yourself? The manual path:
```bash
make orchestrator-linux
scp -r bin deploy lxc:/opt/vetting/
ssh lxc "cd /opt/vetting && sudo ./deploy/install.sh"
ssh lxc "sudo systemctl enable --now vetting"
```
See [docs/operations.md § Install](docs/operations.md#install-proxmox-lxc)
for the full walkthrough.
## Repository layout
```
cmd/         orchestrator + agent entrypoints
internal/    core packages (see docs/architecture.md for the map)
agent/       in-image agent logic (claim loop, stage dispatch, probes)
live-image/  mkosi config for the PXE-bootable Debian live image
deploy/      systemd unit + install.sh + example config
docs/        operator + developer docs
test/e2e/    build-tag-gated QEMU + PXE full-stack test
tools/       small CLI helpers
```
## Development
- `make test` — Go unit + smoke tests (cross-platform)
- `make vet` — `go vet` on the whole module
- `make live-image` — Linux-only; run under WSL from Windows
- `make e2e` — requires Linux root + live image + running orchestrator
- `make run` — build + launch the orchestrator with the example config

Windows hosts: everything except `live-image` and `e2e` works natively.
The live image build calls `mkosi` which needs a real Linux userspace,
so use WSL for those targets.
## Status
All six phases in the original plan are implemented. The E2E QEMU
harness is wired in `test/e2e/qemu_test.go` but requires a running
orchestrator + registered host + queued run as preconditions — it's a
developer-facing integration harness, not a unit test.