Commit Graph

7 Commits

Author SHA1 Message Date
josh 1694c20b12 Host detail v2: full pipeline + per-stage logs + WoL diagnostics
CI / Lint + build + test (push) Has been cancelled
Pipeline now always renders all 13 nodes (3 pre-stage + 9 stage +
Completed), synthesising ghosts from run state when stage rows
aren't seeded yet. Makes a WaitingWoL host show the full timeline
ahead of it instead of just 4 dots.

Agent tags each log line with its stage; logs.Hub fans out to both
log-{runID} and log-{runID}-{stage} SSE events so the detail page
can show per-stage tabs with a pure-CSS radio-sibling switch. Flat
run log prepends [stage] so grep still works.

Dispatcher writes picked/sent-WoL/heartbeat lines into the per-run
log — the operator opens the detail page, sees WaitingWoL stuck,
and reads exactly what the dispatcher did and why nothing's
progressing, instead of having to tail journalctl on the LXC.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 00:38:27 -04:00
josh bb658a8435 Host detail page + pipeline timeline
CI / Lint + build + test (push) Has been cancelled
Click a tile to open /hosts/{id} — the canonical control surface per
host. Timeline renders every pre-stage, stage, and terminal node in
order, with the current one pulsing, failed ones flagged, and
downstream ones dimmed as skipped. Detail page shows summary, hold
card (when holding), all action buttons, spec diffs, a full-height
log pane, and a collapsed expected-spec YAML.

Tile slims to name, last-seen, status, and one primary action; a
CSS-overlay <a> makes the whole card clickable while buttons stay
receptive via z-index.

Runner.publishTileUpdate now also emits pipeline-{runID} fragments,
and CompleteStage wraps Stages.CompleteByName so stage completions
advance the timeline live — without this the dots only moved on
state transitions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-17 23:59:43 -04:00
josh 9b16ed80e6 Heartbeat command channel: reboot_for_vetting skips WoL
CI / Lint + build + test (push) Failing after 5m13s
When the operator clicks Start vetting and the host is heartbeating,
the heartbeat response now carries cmd=reboot_for_vetting + run_id.
The handler drives the Queued → WaitingWoL transition via the existing
state machine, so a benign race with the 2s dispatcher poll is refused
by the state machine (not double-dispatched). WaitingWoL retries for
10 minutes to cover a crashed-mid-reboot case, then falls back to
operator action.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-17 23:37:01 -04:00
josh a0c0fb114f Add host-mode heartbeat: vetting-agent host + last-seen badge
CI / Lint + build + test (push) Has been cancelled
vetting-agent gains a `host` subcommand that runs as a systemd service
installed by the quick-register one-liner, POSTing every 30s to
/api/v1/hosts/{mac}/heartbeat so the dashboard tile shows "online" or
"Nm ago" without waiting on WoL. Ships dormant client code for the
Phase 2 reboot_for_vetting command so the server can flip it on later
without a binary redeploy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-17 23:34:15 -04:00
josh 8b3d9a312e Add quick-register one-liner for target-host registration
CI / Lint + build + test (push) Failing after 5m15s
Operator pastes `curl -fsSL $ORCH/register/quick.sh | sudo bash` on the
target host (pre-wipe). The script probes MAC + CPU/RAM/disks/NICs/GPUs,
emits an expected-spec YAML, and POSTs to a new LAN-trusted JSON
endpoint /api/v1/hosts. The register page shows the command prefilled
with the orchestrator URL; the manual form moves into a collapsible
"Register manually" disclosure.
2026-04-17 22:50:54 -04:00
josh 42da48864f Remove operator auth — trust the LAN
CI / Lint + build + test (push) Failing after 5m15s
Can't log in from a fresh LXC deploy, and the service is LAN-only by
design. Rip out the whole bcrypt-password / signed-cookie session
layer: internal/auth, login templates, gen-admin-password binary +
Makefile targets, auth config block, login/logout routes and the
RequireSession middleware wrap. Agent bearer-token auth on
/api/v1/runs/{id}/* is untouched.

Operators who want a password can front the service with a reverse
proxy — noted in README and docs/operations.md.
2026-04-17 22:31:49 -04:00
josh 9bb4b09a04 Initial commit: full Phases 1-6 implementation
CI / Lint + build + test (push) Has been cancelled
Post-repair hardware validation pipeline for Proxmox cluster hosts.
Go orchestrator + in-image agent + mkosi live image + bundled dnsmasq
PXE + SQLite + HTMX/SSE UI + notify registry + janitor + full docs.
2026-04-17 21:32:10 -04:00