Vetting/agent/probes/netdev.go
josh 23c689aa5b
deep profile + threshold gating + firmware stage + Burn super-stage
Ships all five phases of the deep-profile overhaul together. Runs now
carry a profile (quick/deep/soak); every profile walks the same
11-stage order — Inventory → Firmware → SpecValidate → SMART →
CPUStress → Storage → Network → Burn → GPU → PSU → Reporting —
with only per-stage durations and concurrency scaled.

Phase 1: profiles.ProfileRegistry loaded from vetting.yaml; runs.profile
column + CreateWithProfile; threshold table + evaluator seeded per-run
from the shared vetting.thresholds block; breach flips result at
/sensor + /result.

Phase 2: upgraded CPUStress (stress-ng --cpu-method=all --verify +
EDAC/MCE poll), Storage (fio --verify=md5 + SMART start/end delta),
Network (sustained iperf + /proc/net/dev deltas) with per-profile
knobs from Deps.

Phase 3: Burn super-stage with goroutine fan-out for CPU + memory +
fio + iperf, PSU rails sampled across the Burn window, SensorMux
(2 s flush, 500-sample cap) to absorb backpressure.

Phase 4: Firmware stage + firmware_snapshots table; probes dmidecode
(BIOS), ipmitool (BMC), ethtool -i (NIC), nvme (sysfs + id-ctrl),
lspci (HBA), /proc/cpuinfo (microcode). spec.DiffFirmware folds into
SpecValidate with pin-by-identifier and fan-out-across-component
matching; mismatches park the run in FailedHolding.

Phase 5: profile radio on the host start form, profile chip on the
run header, Firmware section in the HTML report, coverage artifact
uploaded from CI, agent/tests/fakes/ scaffold with Deps.LookPath
seam + stress_ng and dmidecode example fakes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 22:50:57 -04:00


package probes

import (
	"bufio"
	"io"
	"os"
	"strconv"
	"strings"
)

// NetDevSnapshot is the per-interface counter row from /proc/net/dev at
// a single instant. Used by the Network stage to compute deltas across
// an iperf window: a rising rx_errors or tx_dropped during a loaded
// link is a real NIC problem, not general noise.
type NetDevSnapshot struct {
	Iface   string
	RxBytes uint64
	RxErrs  uint64
	RxDrop  uint64
	TxBytes uint64
	TxErrs  uint64
	TxDrop  uint64
}

// NetDev reads /proc/net/dev and returns one snapshot per non-loopback
// interface. Returns nil if the file cannot be opened or yields no usable
// rows (best-effort: a missing /proc is survivable; the caller skips
// delta reporting that tick).
func NetDev() []NetDevSnapshot {
	f, err := os.Open("/proc/net/dev")
	if err != nil {
		return nil
	}
	defer func() { _ = f.Close() }()
	return parseNetDev(f)
}

// parseNetDev is split from NetDev so tests can feed a fixture without
// touching the real /proc. The /proc/net/dev format is two header lines
// followed by rows of
// "iface: rx_bytes rx_packets rx_errs rx_drop ... tx_bytes tx_packets tx_errs tx_drop ...":
// 16 whitespace-separated counters, of which we pull a curated six.
func parseNetDev(r io.Reader) []NetDevSnapshot {
	var out []NetDevSnapshot
	sc := bufio.NewScanner(r)
	// Skip the two header lines (iface || bytes ... || bytes ...).
	for i := 0; i < 2 && sc.Scan(); i++ {
	}
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		colon := strings.IndexByte(line, ':')
		if colon < 0 {
			continue
		}
		iface := strings.TrimSpace(line[:colon])
		if iface == "" || iface == "lo" {
			continue
		}
		fields := strings.Fields(line[colon+1:])
		if len(fields) < 16 {
			continue
		}
		// /proc/net/dev columns:
		// 0 rx_bytes 1 rx_packets 2 rx_errs 3 rx_drop 4 fifo 5 frame 6 compressed 7 multicast
		// 8 tx_bytes 9 tx_packets 10 tx_errs 11 tx_drop 12 fifo 13 colls 14 carrier 15 compressed
		snap := NetDevSnapshot{Iface: iface}
		snap.RxBytes = parseU64(fields[0])
		snap.RxErrs = parseU64(fields[2])
		snap.RxDrop = parseU64(fields[3])
		snap.TxBytes = parseU64(fields[8])
		snap.TxErrs = parseU64(fields[10])
		snap.TxDrop = parseU64(fields[11])
		out = append(out, snap)
	}
	return out
}

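// parseU64 parses a base-10 counter. A malformed field yields 0 rather
// than an error, in keeping with the best-effort contract of NetDev.
func parseU64(s string) uint64 {
	n, err := strconv.ParseUint(s, 10, 64)
	if err != nil {
		return 0
	}
	return n
}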
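
The doc comment on NetDevSnapshot says the Network stage computes deltas across an iperf window, but that code is not in this file. The following is a minimal sketch of how such a delta could be computed from two snapshots. It is not the Network stage's actual implementation: `NetDevDelta`, `counterDelta`, and `diffNetDev` are hypothetical names, and the struct is copied here only to keep the example self-contained. The sketch assumes that a counter moving backwards (for example after a driver reload) should be reported as 0 rather than as a huge wrapped value.

```go
package main

import "fmt"

// NetDevSnapshot is a copy of the struct from the probes package, repeated
// here only so the example compiles on its own.
type NetDevSnapshot struct {
	Iface   string
	RxBytes uint64
	RxErrs  uint64
	RxDrop  uint64
	TxBytes uint64
	TxErrs  uint64
	TxDrop  uint64
}

// NetDevDelta (hypothetical) holds end-minus-start counters for one interface.
type NetDevDelta struct {
	Iface   string
	RxBytes uint64
	RxErrs  uint64
	RxDrop  uint64
	TxBytes uint64
	TxErrs  uint64
	TxDrop  uint64
}

// counterDelta (hypothetical) returns end-start, or 0 if the counter went
// backwards. Assumption: a reset should not show up as a wrapped uint64.
func counterDelta(start, end uint64) uint64 {
	if end < start {
		return 0
	}
	return end - start
}

// diffNetDev (hypothetical) pairs snapshots by interface name. Interfaces
// present in only one of the two slices are skipped.
func diffNetDev(start, end []NetDevSnapshot) []NetDevDelta {
	byIface := make(map[string]NetDevSnapshot, len(start))
	for _, s := range start {
		byIface[s.Iface] = s
	}
	var out []NetDevDelta
	for _, e := range end {
		s, ok := byIface[e.Iface]
		if !ok {
			continue
		}
		out = append(out, NetDevDelta{
			Iface:   e.Iface,
			RxBytes: counterDelta(s.RxBytes, e.RxBytes),
			RxErrs:  counterDelta(s.RxErrs, e.RxErrs),
			RxDrop:  counterDelta(s.RxDrop, e.RxDrop),
			TxBytes: counterDelta(s.TxBytes, e.TxBytes),
			TxErrs:  counterDelta(s.TxErrs, e.TxErrs),
			TxDrop:  counterDelta(s.TxDrop, e.TxDrop),
		})
	}
	return out
}

func main() {
	// Made-up values standing in for NetDev() calls before and after a load window.
	start := []NetDevSnapshot{{Iface: "eth0", RxBytes: 1000, TxBytes: 2000}}
	end := []NetDevSnapshot{{Iface: "eth0", RxBytes: 5000, RxErrs: 3, TxBytes: 2500, TxDrop: 1}}
	for _, d := range diffNetDev(start, end) {
		fmt.Printf("%s rx_bytes=+%d rx_errs=+%d tx_drop=+%d\n", d.Iface, d.RxBytes, d.RxErrs, d.TxDrop)
	}
}
```

In a real caller the two slices would come from NetDev() before and after the load window, and a nil result from either call would mean skipping the delta for that tick, as the NetDev doc comment describes.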