Vetting

Author	SHA1	Message	Date
josh	017c3c38fe	feat(ui): 15-point UX overhaul — affordances, feedback, and navigation. Address friction points identified in a full interface audit: - Re-add status badge to dashboard tiles so run state is visible at a glance - Add active nav indicator and SSE connection health monitor (live/stale) - Show manual registration form by default instead of hiding behind <details> - Add copy-to-clipboard buttons on SSH hold command and quick-register one-liner - Replace tooltip-only profile descriptions with inline visible text - Clarify non-destructive toggle with explicit stage impact description - Replace disabled "Start vetting" button with actionable offline guidance - Swap browser confirm() dialogs for styled inline confirmations - Add colored badge to spec diffs summary visible when collapsed - Add distinct "cancelled" mood for cancelled runs (vs idle) - Add match count to log search and aria-label for accessibility - Add styled 404 page rendered inside the app shell Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-23 20:08:07 -04:00
josh	c545028903	feat(run-page): tick the run-duration timer between SSE pushes. Adds a 1s client-side ticker that rewrites .run-duration text from a data-started-at attribute, so the header timer on /runs/{id} increments every second while the run is active. When an SSE swap lands a fresh header the new server-rendered value seamlessly takes over; when the run goes terminal the template drops the attribute and the ticker silently skips the node, leaving the final elapsed in place. Other templ_*.go churn is cosmetic — regenerator versions differ between CI and local and only the filename field in templ.Error callsites changed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-19 21:53:40 -04:00
josh	23c689aa5b	deep profile + threshold gating + firmware stage + Burn super-stage. Ships all five phases of the deep-profile overhaul together. Runs now carry a profile (quick/deep/soak); every profile walks the same 11-stage order — Inventory → Firmware → SpecValidate → SMART → CPUStress → Storage → Network → Burn → GPU → PSU → Reporting — with only per-stage durations and concurrency scaled. Phase 1: profiles.ProfileRegistry loaded from vetting.yaml; runs.profile column + CreateWithProfile; threshold table + evaluator seeded per-run from the shared vetting.thresholds block; breach flips result at /sensor + /result. Phase 2: upgraded CPUStress (stress-ng --cpu-method=all --verify + EDAC/MCE poll), Storage (fio --verify=md5 + SMART start/end delta), Network (sustained iperf + /proc/net/dev deltas) with per-profile knobs from Deps. Phase 3: Burn super-stage with goroutine fan-out for CPU + memory + fio + iperf, PSU rails sampled across the Burn window, SensorMux (2 s flush, 500-sample cap) to absorb backpressure. Phase 4: Firmware stage + firmware_snapshots table; probes dmidecode (BIOS), ipmitool (BMC), ethtool -i (NIC), nvme (sysfs + id-ctrl), lspci (HBA), /proc/cpuinfo (microcode). spec.DiffFirmware folds into SpecValidate with pin-by-identifier and fan-out-across-component matching; mismatches park the run in FailedHolding. Phase 5: profile radio on the host start form, profile chip on the run header, Firmware section in the HTML report, coverage artifact uploaded from CI, agent/tests/fakes/ scaffold with Deps.LookPath seam + stress_ng and dmidecode example fakes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-18 22:50:57 -04:00
josh	f79fe0f0db	ui: GitHub-Actions-style detail page, sub-steps, mini-tile run-view. Reshapes the detail page into a run-view: hybrid horizontal pipeline + expanded active-step pane with sub-steps, a per-step log pane with line-numbered permalinks and client-side search, and a runs-history sidebar that navigates via ?run=N. Default step is server-picked (running → failed → Reporting) so the operator lands on the thing that's moving. Adds a sub_steps table + SSE topic (substep-{run}-{stage}-{ordinal}) so per-disk and per-pass work (SMART, CPUStress CPU/RAM, Storage, GPU) is visible in the UI instead of buried in stage summary JSON. Agent emits sub-step reports from existing per-iteration loops. Dashboard tiles become a mini run-view with a 9-dot step strip so the operator reads run health across the whole grid at a glance. Register page gets the same card shell + button styling. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-18 19:00:11 -04:00
josh	5c00edd7b6	ui: fix htmx-ext-sse integrity hash (was silently blocked by browser). Detail-page pipeline + log panes weren't updating without a manual refresh. Root cause: the integrity attribute on htmx-ext-sse@2.2.2 in layout.templ was wrong, so the browser refused to execute the script (SRI enforcement is silent — no user-visible error unless you open devtools). htmx core loaded, boosted nav worked, forms worked — but sse-connect/sse-swap were inert because the extension never registered, so no EventSource was ever opened. Replaced the claimed hash (Y4gc0CK6...) with the real one (fw+eTlCc...) computed via `curl -sL https://unpkg.com/htmx-ext-sse@2.2.2 | openssl dgst -sha384 -binary | openssl base64 -A`. Added sse_e2e_test.go as a regression canary that mounts the real chi router (RealIP + Recoverer + Logger middleware), opens GET /events, publishes a tile-update via Runner, and asserts the event lands on the wire. Server-side unit tests only verified rendered HTML — this one covers the full publish→wire path, which is what the next regression in this area will hit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-18 17:51:58 -04:00
josh	bb658a8435	Host detail page + pipeline timeline. Click a tile to open /hosts/{id} — the canonical control surface per host. Timeline renders every pre-stage, stage, and terminal node in order, with the current one pulsing, failed ones flagged, and downstream ones dimmed as skipped. Detail page shows summary, hold card (when holding), all action buttons, spec diffs, a full-height log pane, and a collapsed expected-spec YAML. Tile slims to name, last-seen, status, and one primary action; a CSS-overlay <a> makes the whole card clickable while buttons stay receptive via z-index. Runner.publishTileUpdate now also emits pipeline-{runID} fragments, and CompleteStage wraps Stages.CompleteByName so stage completions advance the timeline live — without this the dots only moved on state transitions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-17 23:59:43 -04:00
josh	8b3d9a312e	Add quick-register one-liner for target-host registration. Operator pastes `curl -fsSL $ORCH/register/quick.sh | sudo bash` on the target host (pre-wipe). The script probes MAC + CPU/RAM/disks/NICs/GPUs, emits an expected-spec YAML, and POSTs to a new LAN-trusted JSON endpoint /api/v1/hosts. The register page shows the command prefilled with the orchestrator URL; the manual form moves into a collapsible "Register manually" disclosure.	2026-04-17 22:50:54 -04:00
josh	42da48864f	Remove operator auth — trust the LAN. Can't log in from a fresh LXC deploy, and the service is LAN-only by design. Rip out the whole bcrypt-password / signed-cookie session layer: internal/auth, login templates, gen-admin-password binary + Makefile targets, auth config block, login/logout routes and the RequireSession middleware wrap. Agent bearer-token auth on /api/v1/runs/{id}/* is untouched. Operators who want a password can front the service with a reverse proxy — noted in README and docs/operations.md.	2026-04-17 22:31:49 -04:00
josh	9bb4b09a04	Initial commit: full Phases 1-6 implementation. Post-repair hardware validation pipeline for Proxmox cluster hosts. Go orchestrator + in-image agent + mkosi live image + bundled dnsmasq PXE + SQLite + HTMX/SSE UI + notify registry + janitor + full docs.	2026-04-17 21:32:10 -04:00
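
Commit 5c00edd7b6 recomputes the Subresource Integrity value with an openssl pipeline. The same computation can be sketched in Go, for example to check a vendored script in a test. This is a minimal sketch, not code from the repository: the function name `sriSHA384` and the sample input are hypothetical.

```go
package main

import (
	"crypto/sha512"
	"encoding/base64"
	"fmt"
)

// sriSHA384 (hypothetical helper) returns a value usable in an HTML
// integrity attribute: the SHA-384 digest of body, standard-base64
// encoded, with the "sha384-" prefix. It mirrors
// `openssl dgst -sha384 -binary | openssl base64 -A`.
func sriSHA384(body []byte) string {
	sum := sha512.Sum384(body) // 48-byte digest
	return "sha384-" + base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	// Sample input only; in practice body would be the exact bytes
	// served at the script URL.
	fmt.Println(sriSHA384([]byte("console.log('hello');")))
}
```

The digest must be taken over the exact bytes the browser receives; any difference in the file yields a different hash and the script is blocked.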

9 Commits
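
Commit f79fe0f0db describes a server-picked default step (running → failed → Reporting). That precedence can be sketched as below. This is an illustration under assumptions: the `Step` type, its status strings, and `defaultStep` are hypothetical names, not the repository's actual types.

```go
package main

import "fmt"

// Step is a hypothetical, simplified view of one pipeline step.
type Step struct {
	Name   string
	Status string // assumed values: "pending", "running", "passed", "failed", "skipped"
}

// defaultStep picks the step to show first: the first running step,
// else the first failed step, else the Reporting step.
func defaultStep(steps []Step) string {
	for _, s := range steps {
		if s.Status == "running" {
			return s.Name
		}
	}
	for _, s := range steps {
		if s.Status == "failed" {
			return s.Name
		}
	}
	return "Reporting"
}

func main() {
	steps := []Step{
		{Name: "Inventory", Status: "passed"},
		{Name: "SMART", Status: "failed"},
		{Name: "Storage", Status: "running"},
	}
	fmt.Println(defaultStep(steps)) // running wins over failed
}
```

Two passes keep the precedence explicit: a running step is preferred even when a failed step appears earlier in the pipeline.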