deep profile + threshold gating + firmware stage + Burn super-stage
Ships all five phases of the deep-profile overhaul together. Runs now carry a profile (quick/deep/soak); every profile walks the same 11-stage order (Inventory → Firmware → SpecValidate → SMART → CPUStress → Storage → Network → Burn → GPU → PSU → Reporting), with only per-stage durations and concurrency scaled.

Phase 1: profiles.ProfileRegistry loaded from vetting.yaml; runs.profile column + CreateWithProfile; threshold table + evaluator seeded per-run from the shared vetting.thresholds block; breach flips result at /sensor + /result.

Phase 2: upgraded CPUStress (stress-ng --cpu-method=all --verify + EDAC/MCE poll), Storage (fio --verify=md5 + SMART start/end delta), Network (sustained iperf + /proc/net/dev deltas) with per-profile knobs from Deps.

Phase 3: Burn super-stage with goroutine fan-out for CPU + memory + fio + iperf, PSU rails sampled across the Burn window, SensorMux (2 s flush, 500-sample cap) to absorb backpressure.

Phase 4: Firmware stage + firmware_snapshots table; probes dmidecode (BIOS), ipmitool (BMC), ethtool -i (NIC), nvme (sysfs + id-ctrl), lspci (HBA), /proc/cpuinfo (microcode). spec.DiffFirmware folds into SpecValidate with pin-by-identifier and fan-out-across-component matching; mismatches park the run in FailedHolding.

Phase 5: profile radio on the host start form, profile chip on the run header, Firmware section in the HTML report, coverage artifact uploaded from CI, agent/tests/fakes/ scaffold with Deps.LookPath seam + stress_ng and dmidecode example fakes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
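
For readers unfamiliar with the pattern, the Phase 3 goroutine fan-out could look roughly like the sketch below. This is not the code in this commit: runBurn, the worker type, demoBurn, and errLinkDown are hypothetical names, and the sketch assumes the four workloads run concurrently and that the stage fails if any one of them fails.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// worker is a hypothetical signature for one Burn workload.
type worker func(ctx context.Context) error

// errLinkDown is a made-up failure used only by the demo below.
var errLinkDown = errors.New("link down")

// runBurn starts every worker in its own goroutine, waits for all of
// them, and returns the joined errors (nil when every worker passed).
func runBurn(ctx context.Context, workers map[string]worker) error {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs []error
	)
	for name, w := range workers {
		wg.Add(1)
		go func(name string, w worker) {
			defer wg.Done()
			if err := w(ctx); err != nil {
				mu.Lock()
				errs = append(errs, fmt.Errorf("%s: %w", name, err))
				mu.Unlock()
			}
		}(name, w)
	}
	wg.Wait()
	return errors.Join(errs...)
}

// demoBurn wires up four stand-in workloads; iperf fails when asked to.
func demoBurn(failIperf bool) error {
	ok := func(context.Context) error { return nil }
	iperf := ok
	if failIperf {
		iperf = func(context.Context) error { return errLinkDown }
	}
	return runBurn(context.Background(), map[string]worker{
		"cpu": ok, "memory": ok, "fio": ok, "iperf": iperf,
	})
}

func main() {
	fmt.Println("all pass:", demoBurn(false))
	fmt.Println("iperf fails:", demoBurn(true))
}
```

Joining the errors rather than returning the first one keeps every failing workload visible in the stage result; whether the real Burn stage cancels the remaining workers on the first failure is not stated in the message above.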
@@ -0,0 +1,24 @@
// fake_dmidecode simulates `dmidecode -t bios` for unit tests of the
// firmware probe's BIOS parser. Prints deterministic output modeled on
// a real Supermicro host; exits 0 regardless of flags.
package main

import "fmt"

func main() {
	fmt.Println(`# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 3.2.0 present.

Handle 0x0000, DMI type 0, 26 bytes
BIOS Information
	Vendor: American Megatrends Inc.
	Version: 3.2
	Release Date: 07/15/2021
	Address: 0xF0000
	Runtime Size: 64 kB
	ROM Size: 32 MB
	Characteristics:
		PCI is supported
		BIOS is upgradeable`)
}
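
As an illustration of what a BIOS parser fed by this fake might do, here is a minimal sketch. The BIOSInfo type and parseBIOS function are hypothetical and are not the firmware probe's actual parser; the sketch only assumes that the fields of interest appear as "Key: Value" lines with optional leading whitespace.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// BIOSInfo holds the fields this sketch extracts (hypothetical type).
type BIOSInfo struct {
	Vendor      string
	Version     string
	ReleaseDate string
}

// parseBIOS scans `dmidecode -t bios` style output for "Key: Value"
// lines and keeps the three fields named above. Lines without a colon
// and keys it does not know are skipped.
func parseBIOS(out string) BIOSInfo {
	var info BIOSInfo
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		key, val, ok := strings.Cut(strings.TrimSpace(sc.Text()), ":")
		if !ok {
			continue
		}
		val = strings.TrimSpace(val)
		switch key {
		case "Vendor":
			info.Vendor = val
		case "Version":
			info.Version = val
		case "Release Date":
			info.ReleaseDate = val
		}
	}
	return info
}

func main() {
	sample := "BIOS Information\n" +
		"\tVendor: American Megatrends Inc.\n" +
		"\tVersion: 3.2\n" +
		"\tRelease Date: 07/15/2021\n"
	fmt.Printf("%+v\n", parseBIOS(sample))
}
```

Trimming each line before splitting means the sketch does not depend on how the fake indents its output.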
@@ -0,0 +1,22 @@
// Package fakes is the umbrella for deterministic stand-ins for
// external probe binaries that Vetting's stage code normally shells
// out to (stress-ng, fio, iperf3, dmidecode, ethtool, nvidia-smi,
// mcelog, nvme). Each real binary gets its own subpackage under
// fakes/<name>/ with `package main` and a main() that prints golden
// output. Build with `go build -o <tmp>/<name> ./agent/tests/fakes/<name>`
// and have a test's tests.Deps.LookPath return <tmp>/<name>.
//
// The seam in tests is tests.Deps.LookPath: when non-nil the stage
// code uses it instead of os/exec.LookPath. Outside tests, nil
// LookPath means "use the real binary on $PATH", so stages continue
// to work on production hosts without the fakes package around.
//
// How to add a new fake:
//  1. Create agent/tests/fakes/<binaryname>/main.go.
//  2. Write `package main` with a main() that prints exactly the
//     bytes the real tool would produce for the input you care to
//     simulate. Determinism > completeness: tests want a known
//     sample, not a realistic one.
//  3. In the unit test, compile the fake with `go build -o` into
//     t.TempDir() before the test body runs.
package fakes
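
The LookPath seam and the build step described in the package comment could be wired together roughly as follows. Only the Deps.LookPath field name comes from this commit; the resolve method, buildFake, and fakeLookPath are hypothetical helpers, and the relative package path assumes the command runs from the repository root.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// Deps mirrors the seam: a nil LookPath means "use the real $PATH".
type Deps struct {
	LookPath func(name string) (string, error)
}

// resolve is a hypothetical helper showing the fallback to
// os/exec.LookPath when no override is installed.
func (d Deps) resolve(name string) (string, error) {
	if d.LookPath != nil {
		return d.LookPath(name)
	}
	return exec.LookPath(name)
}

// buildFake compiles ./agent/tests/fakes/<name> into dir and returns
// the binary path. In a test, dir would be t.TempDir(). The relative
// package path assumes the working directory is the repository root.
func buildFake(dir, name string) (string, error) {
	bin := filepath.Join(dir, name)
	cmd := exec.Command("go", "build", "-o", bin, "./agent/tests/fakes/"+name)
	if out, err := cmd.CombinedOutput(); err != nil {
		return "", fmt.Errorf("go build %s: %v: %s", name, err, out)
	}
	return bin, nil
}

// fakeLookPath returns a LookPath that serves prebuilt fakes by name
// and reports exec.ErrNotFound for everything else.
func fakeLookPath(bins map[string]string) func(string) (string, error) {
	return func(name string) (string, error) {
		if p, ok := bins[name]; ok {
			return p, nil
		}
		return "", &exec.Error{Name: name, Err: exec.ErrNotFound}
	}
}

func main() {
	// The path is a placeholder; a test would use buildFake's result.
	deps := Deps{LookPath: fakeLookPath(map[string]string{
		"dmidecode": "/tmp/fakes/dmidecode",
	})}
	p, err := deps.resolve("dmidecode")
	fmt.Println(p, err)
}
```

Returning an error for unknown names, instead of falling through to the real $PATH, keeps a test from silently running a host binary it did not intend to.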
@@ -0,0 +1,18 @@
// fake_stress_ng simulates stress-ng for unit tests. Accepts (and
// ignores) any flag, sleeps briefly so callers that measure wall-clock
// see a non-zero elapsed, and prints the "passed" lines CPUStress
// expects. Exits 0.
package main

import (
	"fmt"
	"os"
	"time"
)

func main() {
	fmt.Fprintln(os.Stderr, "fake_stress_ng invoked:", os.Args[1:])
	time.Sleep(50 * time.Millisecond)
	fmt.Println("stress-ng: info: [1] dispatching hogs: 1 cpu")
	fmt.Println("stress-ng: info: [1] successful run completed in 0.05s")
}
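
A caller consuming this fake's stdout might decide pass/fail along these lines. This is a sketch under an assumption: stressPassed is a hypothetical function, and the real CPUStress stage may check exit codes, the --verify results, or other lines instead.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// stressPassed reports whether the output contains a stress-ng line
// announcing a successful run. Matching on this phrase is an
// assumption about what the stage looks for.
func stressPassed(out string) bool {
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "stress-ng:") &&
			strings.Contains(line, "successful run completed") {
			return true
		}
	}
	return false
}

func main() {
	out := "stress-ng: info: [1] dispatching hogs: 1 cpu\n" +
		"stress-ng: info: [1] successful run completed in 0.05s\n"
	fmt.Println(stressPassed(out))
}
```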