deep profile + threshold gating + firmware stage + Burn super-stage
Ships all five phases of the deep-profile overhaul together. Runs now carry a profile (quick/deep/soak); every profile walks the same 11-stage order — Inventory → Firmware → SpecValidate → SMART → CPUStress → Storage → Network → Burn → GPU → PSU → Reporting — with only per-stage durations and concurrency scaled. Phase 1: profiles.ProfileRegistry loaded from vetting.yaml; runs.profile column + CreateWithProfile; threshold table + evaluator seeded per-run from the shared vetting.thresholds block; breach flips result at /sensor + /result. Phase 2: upgraded CPUStress (stress-ng --cpu-method=all --verify + EDAC/MCE poll), Storage (fio --verify=md5 + SMART start/end delta), Network (sustained iperf + /proc/net/dev deltas) with per-profile knobs from Deps. Phase 3: Burn super-stage with goroutine fan-out for CPU + memory + fio + iperf, PSU rails sampled across the Burn window, SensorMux (2 s flush, 500-sample cap) to absorb backpressure. Phase 4: Firmware stage + firmware_snapshots table; probes dmidecode (BIOS), ipmitool (BMC), ethtool -i (NIC), nvme (sysfs + id-ctrl), lspci (HBA), /proc/cpuinfo (microcode). spec.DiffFirmware folds into SpecValidate with pin-by-identifier and fan-out-across-component matching; mismatches park the run in FailedHolding. Phase 5: profile radio on the host start form, profile chip on the run header, Firmware section in the HTML report, coverage artifact uploaded from CI, agent/tests/fakes/ scaffold with Deps.LookPath seam + stress_ng and dmidecode example fakes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
@@ -30,11 +30,13 @@ const (
|
||||
// "InventoryCheck". Later stages share a name with their state.
|
||||
var stageStates = map[string]model.RunState{
|
||||
"Inventory": model.StateInventoryCheck,
|
||||
"Firmware": model.StateFirmware,
|
||||
"SpecValidate": model.StateSpecValidate,
|
||||
"SMART": model.StateSMART,
|
||||
"CPUStress": model.StateCPUStress,
|
||||
"Storage": model.StateStorage,
|
||||
"Network": model.StateNetwork,
|
||||
"Burn": model.StateBurn,
|
||||
"GPU": model.StateGPU,
|
||||
"PSU": model.StatePSU,
|
||||
"Reporting": model.StateReporting,
|
||||
@@ -44,11 +46,13 @@ var stageStates = map[string]model.RunState{
|
||||
// first stage to Completed. Kept in sync with store.DefaultStageOrder.
|
||||
var stageOrder = []model.RunState{
|
||||
model.StateInventoryCheck,
|
||||
model.StateFirmware,
|
||||
model.StateSpecValidate,
|
||||
model.StateSMART,
|
||||
model.StateCPUStress,
|
||||
model.StateStorage,
|
||||
model.StateNetwork,
|
||||
model.StateBurn,
|
||||
model.StateGPU,
|
||||
model.StatePSU,
|
||||
model.StateReporting,
|
||||
@@ -143,9 +147,9 @@ func nextStageState(current model.RunState) (model.RunState, error) {
|
||||
func allActiveStates() []model.RunState {
|
||||
return []model.RunState{
|
||||
model.StateQueued, model.StateWaitingWoL, model.StateWaitingReboot, model.StateBooting,
|
||||
model.StateInventoryCheck, model.StateSpecValidate, model.StateSMART,
|
||||
model.StateInventoryCheck, model.StateFirmware, model.StateSpecValidate, model.StateSMART,
|
||||
model.StateCPUStress, model.StateStorage, model.StateNetwork,
|
||||
model.StateGPU, model.StatePSU, model.StateReporting,
|
||||
model.StateBurn, model.StateGPU, model.StatePSU, model.StateReporting,
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
Reference in New Issue
Block a user