deep profile + threshold gating + firmware stage + Burn super-stage
CI / Lint + build + test (push) Failing after 1m57s
Release / release (push) Has been cancelled

Ships all five phases of the deep-profile overhaul together. Runs now
carry a profile (quick/deep/soak); every profile walks the same
11-stage order — Inventory → Firmware → SpecValidate → SMART →
CPUStress → Storage → Network → Burn → GPU → PSU → Reporting —
with only per-stage durations and concurrency scaled.

Phase 1: profiles.ProfileRegistry loaded from vetting.yaml; runs.profile
column + CreateWithProfile; threshold table + evaluator seeded per-run
from the shared vetting.thresholds block; breach flips result at
/sensor + /result.

Phase 2: upgraded CPUStress (stress-ng --cpu-method=all --verify +
EDAC/MCE poll), Storage (fio --verify=md5 + SMART start/end delta),
Network (sustained iperf + /proc/net/dev deltas) with per-profile
knobs from Deps.

Phase 3: Burn super-stage with goroutine fan-out for CPU + memory +
fio + iperf, PSU rails sampled across the Burn window, SensorMux
(2 s flush, 500-sample cap) to absorb backpressure.

Phase 4: Firmware stage + firmware_snapshots table; probes dmidecode
(BIOS), ipmitool (BMC), ethtool -i (NIC), nvme (sysfs + id-ctrl),
lspci (HBA), /proc/cpuinfo (microcode). spec.DiffFirmware folds into
SpecValidate with pin-by-identifier and fan-out-across-component
matching; mismatches park the run in FailedHolding.

Phase 5: profile radio on the host start form, profile chip on the
run header, Firmware section in the HTML report, coverage artifact
uploaded from CI, agent/tests/fakes/ scaffold with Deps.LookPath
seam + stress_ng and dmidecode example fakes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
2026-04-18 22:50:57 -04:00
parent fbb21cbafd
commit 23c689aa5b
60 changed files with 5911 additions and 527 deletions
+32 -1
View File
@@ -28,7 +28,17 @@ type Data struct {
Host model.Host
Stages []model.Stage
SpecDiffs []model.SpecDiff
Aggregates []Aggregate // flattened measurement summary; see Aggregate
Aggregates []Aggregate // flattened measurement summary; see Aggregate
Firmware []FirmwareSnapshot // captured firmware versions, empty if none
}
// FirmwareSnapshot is the report-facing view of one firmware row.
// Package-local so the HTML template stays decoupled from store types.
type FirmwareSnapshot struct {
Component string
Identifier string
Version string
Vendor string
}
// Aggregate is a per (kind, key) summary of a run's measurements. Min/
@@ -196,6 +206,27 @@ const htmlTemplate = `<!doctype html>
</table>
</section>
<section>
<h2>Firmware ({{len .Firmware}})</h2>
{{if .Firmware}}
<table>
<thead><tr><th>Component</th><th>Identifier</th><th>Version</th><th>Vendor</th></tr></thead>
<tbody>
{{range .Firmware}}
<tr>
<td>{{.Component}}</td>
<td><code>{{.Identifier}}</code></td>
<td><code>{{.Version}}</code></td>
<td>{{.Vendor}}</td>
</tr>
{{end}}
</tbody>
</table>
{{else}}
<p>No firmware snapshots captured.</p>
{{end}}
</section>
<section>
<h2>Spec diffs ({{len .SpecDiffs}})</h2>
{{if .SpecDiffs}}