15 nodes (3 pre-stage + 11 stage + Completed) exceeded the 1280px main
container's usable width, producing a horizontal scrollbar under the
pipeline on the run page. Widen main to 1440px, tighten per-node min
widths, drop the scrollbar, and split camelCase labels so multi-word
stages ("WaitingReboot", "SpecValidate", "CPUStress") wrap onto two
lines instead of forcing node width.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a 1s client-side ticker that rewrites .run-duration text from a
data-started-at attribute, so the header timer on /runs/{id}
increments every second while the run is active. When an SSE swap
lands a fresh header the new server-rendered value seamlessly takes
over; when the run goes terminal the template drops the attribute
and the ticker silently skips the node, leaving the final elapsed in
place.
Other templ_*.go churn is cosmetic — regenerator versions differ
between CI and local and only the filename field in templ.Error
callsites changed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Ships all five phases of the deep-profile overhaul together. Runs now
carry a profile (quick/deep/soak); every profile walks the same
11-stage order — Inventory → Firmware → SpecValidate → SMART →
CPUStress → Storage → Network → Burn → GPU → PSU → Reporting —
with only per-stage durations and concurrency scaled.
Phase 1: profiles.ProfileRegistry loaded from vetting.yaml; runs.profile
column + CreateWithProfile; threshold table + evaluator seeded per-run
from the shared vetting.thresholds block; breach flips result at
/sensor + /result.
Phase 2: upgraded CPUStress (stress-ng --cpu-method=all --verify +
EDAC/MCE poll), Storage (fio --verify=md5 + SMART start/end delta),
Network (sustained iperf + /proc/net/dev deltas) with per-profile
knobs from Deps.
Phase 3: Burn super-stage with goroutine fan-out for CPU + memory +
fio + iperf, PSU rails sampled across the Burn window, SensorMux
(2 s flush, 500-sample cap) to absorb backpressure.
Phase 4: Firmware stage + firmware_snapshots table; probes dmidecode
(BIOS), ipmitool (BMC), ethtool -i (NIC), nvme (sysfs + id-ctrl),
lspci (HBA), /proc/cpuinfo (microcode). spec.DiffFirmware folds into
SpecValidate with pin-by-identifier and fan-out-across-component
matching; mismatches park the run in FailedHolding.
Phase 5: profile radio on the host start form, profile chip on the
run header, Firmware section in the HTML report, coverage artifact
uploaded from CI, agent/tests/fakes/ scaffold with Deps.LookPath
seam + stress_ng and dmidecode example fakes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Host page owns host metadata, full runs table with per-row stage strip,
in-flight banner, and empty-state CTA. Run page owns pipeline, active
step, logs, sub-steps, spec diffs, and hold banner with a breadcrumb
back to the host. Dashboard tile reverts to host-only.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The .log-line grid was templated with 5 columns (anchor/ln/lvl/ts/text),
but renderLogSSE inserts an optional log-stage span, making 6 children.
The 6th child wrapped to row 2 column 1 (24px wide), which forced the
message text to break one character per line. Flexbox with min-width:0
on the text span scales cleanly with or without the stage element.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reshapes the detail page into a run-view: hybrid horizontal pipeline
+ expanded active-step pane with sub-steps, a per-step log pane with
line-numbered permalinks and client-side search, and a runs-history
sidebar that navigates via ?run=N. Default step is server-picked
(running → failed → Reporting) so the operator lands on the thing
that's moving.
Adds a sub_steps table + SSE topic (substep-{run}-{stage}-{ordinal})
so per-disk and per-pass work (SMART, CPUStress CPU/RAM, Storage,
GPU) is visible in the UI instead of buried in stage summary JSON.
Agent emits sub-step reports from existing per-iteration loops.
Dashboard tiles become a mini run-view with a 9-dot step strip so
the operator reads run health across the whole grid at a glance.
Register page gets the same card shell + button styling.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pipeline now always renders all 13 nodes (3 pre-stage + 9 stage +
Completed), synthesising ghosts from run state when stage rows
aren't seeded yet. Makes a WaitingWoL host show the full timeline
ahead of it instead of just 4 dots.
Agent tags each log line with its stage; logs.Hub fans out to both
log-{runID} and log-{runID}-{stage} SSE events so the detail page
can show per-stage tabs with a pure-CSS radio-sibling switch. Flat
run log prepends [stage] so grep still works.
Dispatcher writes picked/sent-WoL/heartbeat lines into the per-run
log — the operator opens the detail page, sees WaitingWoL stuck,
and reads exactly what the dispatcher did and why nothing's
progressing, instead of having to tail journalctl on the LXC.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Click a tile to open /hosts/{id} — the canonical control surface per
host. Timeline renders every pre-stage, stage, and terminal node in
order, with the current one pulsing, failed ones flagged, and
downstream ones dimmed as skipped. Detail page shows summary, hold
card (when holding), all action buttons, spec diffs, a full-height
log pane, and a collapsed expected-spec YAML.
Tile slims to name, last-seen, status, and one primary action; a
CSS-overlay <a> makes the whole card clickable while buttons stay
receptive via z-index.
Runner.publishTileUpdate now also emits pipeline-{runID} fragments,
and CompleteStage wraps Stages.CompleteByName so stage completions
advance the timeline live — without this the dots only moved on
state transitions.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
vetting-agent gains a `host` subcommand that runs as a systemd service
installed by the quick-register one-liner, POSTing every 30s to
/api/v1/hosts/{mac}/heartbeat so the dashboard tile shows "online" or
"Nm ago" without waiting on WoL. Ships dormant client code for the
Phase 2 reboot_for_vetting command so the server can flip it on later
without a binary redeploy.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Operator pastes `curl -fsSL $ORCH/register/quick.sh | sudo bash` on the
target host (pre-wipe). The script probes MAC + CPU/RAM/disks/NICs/GPUs,
emits an expected-spec YAML, and POSTs to a new LAN-trusted JSON
endpoint /api/v1/hosts. The register page shows the command prefilled
with the orchestrator URL; the manual form moves into a collapsible
"Register manually" disclosure.