Remove operator auth — trust the LAN
CI / Lint + build + test (push) Failing after 5m15s

Can't log in from a fresh LXC deploy, and the service is LAN-only by
design. Rip out the whole bcrypt-password / signed-cookie session
layer: internal/auth, login templates, gen-admin-password binary +
Makefile targets, auth config block, login/logout routes and the
RequireSession middleware wrap. Agent bearer-token auth on
/api/v1/runs/{id}/* is untouched.

Operators who want a password can front the service with a reverse
proxy — noted in README and docs/operations.md.
2026-04-17 22:31:49 -04:00
parent 273e7593bc
commit 42da48864f
19 changed files with 52 additions and 492 deletions
-1
@@ -45,7 +45,6 @@ Operator browser (HTMX + SSE, admin login)
| `internal/api` | HTTP handlers: `agent_handlers.go` (the agent-facing API) and `ui_handlers.go` (HTMX fragments + SSE). |
| `internal/httpserver` | chi router assembly — lives here to avoid `api ↔ orchestrator` cyclic imports. |
| `internal/web` | Embedded static assets + compiled Templ templates. |
-| `internal/auth` | Single-admin bcrypt + signed-cookie sessions. |
| `internal/pxe` | dnsmasq subprocess supervisor + per-MAC iPXE script generator. |
| `internal/events` | In-process SSE hub (fan-out to live browser clients). |
| `internal/logs` | Per-run flat-file writer + SSE fan-out of live log tail. |
+19 -18
@@ -37,25 +37,18 @@ repaired nodes so DHCP and WoL work.
- disables the distro-default dnsmasq (the orchestrator supervises
its own)
-The installer does **not** enable the service, because the default
-config has a placeholder bcrypt password that the binary refuses to
-start with.
+The installer does **not** enable the service. You'll want to edit
+the config first.
-3. Generate an admin password hash and a session secret, then edit
-`/etc/vetting/vetting.yaml`:
+3. Edit `/etc/vetting/vetting.yaml`:
-```
-./bin/gen-admin-password 'your-password-here' # prints a bcrypt hash
-openssl rand -hex 32 # prints a 64-char hex string
-```
Required fields:
-- `auth.admin_password_bcrypt` — the bcrypt hash
-- `auth.session_secret_hex` — the 32-byte hex string
+- `server.bind` — defaults to `127.0.0.1:8080`. Switch to
+`0.0.0.0:8080` (or bind to a specific LAN IP) once you're ready
+to expose it. There is no built-in auth — see *Exposing outside
+the LAN* below.
- `server.public_url` — the URL your browser hits the LXC on
-(e.g. `https://vetting.lan:8443`). This is used as the
-click-through link in notifications, so it must be the *external*
-URL, not the bind address.
+(e.g. `http://vetting.lan:8080`). Used as the click-through link
+in notifications.
4. (Optional) Configure notifiers in the same file — see the
commented-out example block for ntfy / Discord / SMTP.
@@ -79,7 +72,7 @@ Against a QEMU VM first, before you point it at real hardware:
sudo ip link set br-vetting up
```
-2. In the UI at `https://<lxc>:8443`, log in and register a host:
+2. In the UI at `http://<lxc>:8080`, register a host:
- Name: `qemu-test`
- MAC: `52:54:00:12:34:56`
- WoL broadcast IP: `10.77.0.255`
@@ -145,11 +138,19 @@ Retention is governed by the `artifacts.retention_days` and
`logs.retention_days` settings. DB rows (run history) are preserved
indefinitely; only on-disk files get pruned.
+## Exposing outside the LAN
+The orchestrator UI has no built-in auth. It's designed to live on a
+trusted home LAN and trust whatever reaches it. If you want to reach
+it from outside that LAN, don't expose the bind port directly — put
+it behind a reverse proxy (Caddy, nginx, Traefik) that terminates TLS
+and adds basic-auth or OIDC. The agent↔orchestrator bearer token
+auth is independent and keeps working either way.
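
As an illustration of the reverse-proxy arrangement this section describes (not part of the commit itself), here is a minimal Go sketch of a basic-auth front for the UI. The docs name Caddy, nginx and Traefik as the proxy; Go's standard-library `httputil.ReverseProxy` is swapped in here only to keep the example self-contained. The names `newAuthProxy`, `credentialsOK` and `statusFor`, and the `operator` / `change-me` credentials, are hypothetical. An `httptest` server stands in for the orchestrator so the sketch runs without a deployment. TLS termination is not shown, and the sketch assumes agents reach the orchestrator directly on the LAN rather than through this proxy, since both basic auth and bearer tokens use the `Authorization` header.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// credentialsOK checks the supplied user and password against the expected
// pair using constant-time comparisons (hypothetical helper).
func credentialsOK(gotUser, gotPass, wantUser, wantPass string) bool {
	userOK := subtle.ConstantTimeCompare([]byte(gotUser), []byte(wantUser)) == 1
	passOK := subtle.ConstantTimeCompare([]byte(gotPass), []byte(wantPass)) == 1
	return userOK && passOK
}

// newAuthProxy returns a handler that demands HTTP basic auth and, on
// success, forwards the request to upstream (hypothetical helper).
func newAuthProxy(upstream *url.URL, user, pass string) http.Handler {
	proxy := httputil.NewSingleHostReverseProxy(upstream)
	// A negative FlushInterval flushes after every write, so streamed
	// responses such as SSE are not held back by the proxy.
	proxy.FlushInterval = -1
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		u, p, ok := r.BasicAuth()
		if !ok || !credentialsOK(u, p, user, pass) {
			w.Header().Set("WWW-Authenticate", `Basic realm="vetting"`)
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		proxy.ServeHTTP(w, r)
	})
}

// statusFor issues a GET, with basic-auth credentials when user is
// non-empty, and returns the response status code.
func statusFor(target, user, pass string) int {
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		panic(err)
	}
	if user != "" {
		req.SetBasicAuth(user, pass)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	return resp.StatusCode
}

func main() {
	// Stand-in for the orchestrator UI (assumption: the real service
	// would be listening on its server.bind address instead).
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "orchestrator UI")
	}))
	defer upstream.Close()

	target, err := url.Parse(upstream.URL)
	if err != nil {
		panic(err)
	}
	front := httptest.NewServer(newAuthProxy(target, "operator", "change-me"))
	defer front.Close()

	fmt.Println(statusFor(front.URL, "", ""))                  // 401
	fmt.Println(statusFor(front.URL, "operator", "change-me")) // 200
}
```

In a real deployment the handler from `newAuthProxy` would be served over TLS (for example with `http.ListenAndServeTLS`) and pointed at the orchestrator's bind address, which would stay on `127.0.0.1` or a LAN-only IP so the unauthenticated port is never reachable from outside.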
## Troubleshooting
| Symptom | First check |
|---|---|
-| Service refuses to start with `auth.admin_password_bcrypt is the placeholder` | You didn't replace the bcrypt hash in the config. Run `gen-admin-password`. |
| PXE client gets no DHCP offer | `journalctl -u vetting` for dnsmasq errors; confirm the LXC has `CAP_NET_ADMIN` (the shipped systemd unit does); confirm the host MAC is actually registered (`sqlite3 /var/lib/vetting/vetting.db 'SELECT name, mac FROM hosts;'`). |
| Agent `/hello` never fires | Check the live image is actually loading the agent binary — SSH into the live env (use the hold key path), `systemctl status vetting-agent`. |
| Tile stuck on `Booting` | Most likely the live image booted but the agent can't reach the orchestrator. Verify `vetting.orchestrator=` in the kernel cmdline resolves from the host's network. |