deep profile + threshold gating + firmware stage + Burn super-stage
Ships all five phases of the deep-profile overhaul together. Runs now carry a profile (quick/deep/soak); every profile walks the same 11-stage order (Inventory → Firmware → SpecValidate → SMART → CPUStress → Storage → Network → Burn → GPU → PSU → Reporting), with only per-stage durations and concurrency scaled.

Phase 1: profiles.ProfileRegistry loaded from vetting.yaml; runs.profile column + CreateWithProfile; threshold table + evaluator seeded per-run from the shared vetting.thresholds block; a breach flips the result at /sensor + /result.

Phase 2: upgraded CPUStress (stress-ng --cpu-method=all --verify + EDAC/MCE poll), Storage (fio --verify=md5 + SMART start/end delta), and Network (sustained iperf + /proc/net/dev deltas), with per-profile knobs from Deps.

Phase 3: Burn super-stage with goroutine fan-out for CPU + memory + fio + iperf, PSU rails sampled across the Burn window, and SensorMux (2 s flush, 500-sample cap) to absorb backpressure.

Phase 4: Firmware stage + firmware_snapshots table; probes dmidecode (BIOS), ipmitool (BMC), ethtool -i (NIC), nvme (sysfs + id-ctrl), lspci (HBA), and /proc/cpuinfo (microcode). spec.DiffFirmware folds into SpecValidate with pin-by-identifier and fan-out-across-component matching; mismatches park the run in FailedHolding.

Phase 5: profile radio on the host start form, profile chip on the run header, Firmware section in the HTML report, coverage artifact uploaded from CI, and an agent/tests/fakes/ scaffold with a Deps.LookPath seam + stress_ng and dmidecode example fakes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
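The SensorMux behaviour in Phase 3 (2 s flush, 500-sample cap) can be illustrated with a small batching buffer. This is only a sketch under assumptions: the `Sample` type, the constructor, and the method names are hypothetical, and the choice to evict the oldest sample once the cap is reached is a guess at what "absorb backpressure" means here, not something the commit confirms.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Sample is a hypothetical sensor reading; the real type is not shown in the commit.
type Sample struct {
	Name  string
	Value float64
}

// SensorMux buffers samples and hands them to sink in batches.
// Assumption: once the buffer holds maxSamples, the oldest sample is evicted
// so a slow sink cannot grow memory without bound.
type SensorMux struct {
	mu         sync.Mutex
	buf        []Sample
	maxSamples int
	interval   time.Duration
	sink       func([]Sample)
	dropped    int
}

func NewSensorMux(interval time.Duration, maxSamples int, sink func([]Sample)) *SensorMux {
	return &SensorMux{interval: interval, maxSamples: maxSamples, sink: sink}
}

// Add queues one sample, evicting the oldest when the cap is reached.
func (m *SensorMux) Add(s Sample) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if len(m.buf) >= m.maxSamples {
		m.buf = m.buf[1:]
		m.dropped++
	}
	m.buf = append(m.buf, s)
}

// Flush sends the buffered batch to the sink, if there is anything to send.
func (m *SensorMux) Flush() {
	m.mu.Lock()
	batch := m.buf
	m.buf = nil
	m.mu.Unlock()
	if len(batch) > 0 {
		m.sink(batch)
	}
}

// Run flushes on every tick until ctx is cancelled, then flushes once more.
func (m *SensorMux) Run(ctx context.Context) {
	t := time.NewTicker(m.interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			m.Flush()
		case <-ctx.Done():
			m.Flush()
			return
		}
	}
}

func main() {
	mux := NewSensorMux(2*time.Second, 500, func(b []Sample) {
		fmt.Printf("flushed %d samples\n", len(b))
	})
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	go func() { mux.Run(ctx); close(done) }()

	// Push more samples than the cap allows before the first tick fires.
	for i := 0; i < 600; i++ {
		mux.Add(Sample{Name: "cpu_temp", Value: float64(40 + i%10)})
	}
	cancel()
	<-done
}
```

Keeping `Add` and `Flush` synchronous and putting the ticker in `Run` makes the cap logic testable without waiting on real time.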
@@ -124,6 +124,56 @@ type ClaimResponse struct {
	// at the right stage instead of silently replaying Inventory and
	// letting the orchestrator advance past the crashed stage.
	CurrentState string `json:"current_state"`
	// StageConfig carries per-profile stage knobs (Phase 2): stage-level
	// timeouts and probe-level durations/modes. Empty when the agent
	// talks to a pre-Phase-2 orchestrator; the agent applies compile-
	// time defaults in that case.
	StageConfig ClaimStageConfig `json:"stage_config"`
}

// ClaimStageConfig mirrors config.StageConfig server-side, duplicated so
// the agent doesn't need to import internal/config. Durations arrive as
// strings ("2m", "2h") and are parsed by the tests package at the point
// of use. An empty field means "use the agent-side default" so a missing
// knob doesn't silently turn CPUStress / Storage into a no-op.
type ClaimStageConfig struct {
	Profile       string              `json:"profile"`
	StageTimeouts map[string]string   `json:"stage_timeouts,omitempty"`
	CPUStress     ClaimCPUStressKnobs `json:"cpustress"`
	Storage       ClaimStorageKnobs   `json:"storage"`
	Network       ClaimNetworkKnobs   `json:"network"`
	Burn          ClaimBurnKnobs      `json:"burn"`
}

type ClaimCPUStressKnobs struct {
	CPUPass  string `json:"cpu_pass,omitempty"`
	MemPass  string `json:"mem_pass,omitempty"`
	EDACPoll string `json:"edac_poll,omitempty"`
}

type ClaimStorageKnobs struct {
	Mode    string `json:"mode,omitempty"`
	FioSize string `json:"fio_size,omitempty"`
	FioTime string `json:"fio_time,omitempty"`
	FioBS   string `json:"fio_bs,omitempty"`
	FioRW   string `json:"fio_rw,omitempty"`
	Verify  string `json:"verify,omitempty"`
}

type ClaimNetworkKnobs struct {
	Duration string `json:"duration,omitempty"`
}

// ClaimBurnKnobs mirrors config.BurnKnobs. Duration/CPUWorkers arrive as
// strings so the agent can treat empty as "use compile-time default".
// MemPct is a percentage (0-100); IperfParallel is the parallel stream
// count fed to iperf3 -P. FioOnSpare gates whether fio runs inside Burn.
type ClaimBurnKnobs struct {
	Duration      string `json:"duration,omitempty"`
	CPUWorkers    string `json:"cpu_workers,omitempty"`
	MemPct        int    `json:"mem_pct,omitempty"`
	FioOnSpare    bool   `json:"fio_on_spare,omitempty"`
	IperfParallel int    `json:"iperf_parallel,omitempty"`
}

type ClaimExpectedDiskSpec struct {
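
The Burn super-stage is described as a goroutine fan-out across CPU, memory, fio, and iperf, with FioOnSpare gating whether fio takes part. The sketch below shows one plausible shape for that fan-out using only the standard library. The `Workload` type and the functions `runBurn`, `holdUntilDeadline`, `demoLoads`, and `runDemo` are hypothetical; the real stage shells out to the load tools and samples PSU rails, none of which is modelled here. It uses `errors.Join`, so it assumes Go 1.20 or later.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// Workload is a hypothetical stand-in for one Burn load generator.
type Workload struct {
	Name string
	Run  func(ctx context.Context) error
}

// runBurn starts every workload in its own goroutine, bounds them all by the
// same Burn window, and returns the joined errors of the workloads that failed.
func runBurn(ctx context.Context, window time.Duration, loads []Workload) error {
	ctx, cancel := context.WithTimeout(ctx, window)
	defer cancel()

	errs := make([]error, len(loads)) // one slot per goroutine, so no mutex is needed
	var wg sync.WaitGroup
	for i, w := range loads {
		wg.Add(1)
		go func(i int, w Workload) {
			defer wg.Done()
			if err := w.Run(ctx); err != nil {
				errs[i] = fmt.Errorf("%s: %w", w.Name, err)
			}
		}(i, w)
	}
	wg.Wait()
	return errors.Join(errs...) // nil when every slot is nil
}

// holdUntilDeadline simulates a load that runs for the whole window and
// treats the deadline as a normal, successful end of the run.
func holdUntilDeadline(ctx context.Context) error {
	<-ctx.Done()
	return nil
}

// demoLoads builds the fan-out set; fio joins only when fioOnSpare is true,
// mirroring the FioOnSpare knob in ClaimBurnKnobs.
func demoLoads(fioOnSpare bool) []Workload {
	loads := []Workload{
		{Name: "cpu", Run: holdUntilDeadline},
		{Name: "memory", Run: holdUntilDeadline},
		{Name: "iperf", Run: holdUntilDeadline},
	}
	if fioOnSpare {
		loads = append(loads, Workload{Name: "fio", Run: func(context.Context) error {
			return errors.New("verify mismatch") // simulated failure
		}})
	}
	return loads
}

// runDemo runs a 50 ms Burn window over the demo loads.
func runDemo(fioOnSpare bool) error {
	return runBurn(context.Background(), 50*time.Millisecond, demoLoads(fioOnSpare))
}

func main() {
	if err := runDemo(true); err != nil {
		fmt.Println("burn failed:", err)
	}
}
```

In this sketch a failing workload does not cancel its siblings; the others keep running until the window closes. Whether the real Burn stage aborts early on the first failure is not stated in the commit.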