Vetting/deploy/vetting.production.yaml
josh 506c856046
pxe: switch dnsmasq to proxy-DHCP mode on the LAN
Previously the orchestrator ran a full DHCP server on a dedicated
br-vetting bridge (10.77.0.0/24), which required a hypervisor-level
bridge + physical cabling onto that bridge for every repaired host.
Real-world bite: the LXC's br-vetting had no L2 path to the target
host's PXE NIC, so DHCPDISCOVERs never reached eth1 and PXE silently
timed out.

dnsmasq's proxy-DHCP mode is the idiomatic answer: it coexists with
the LAN's existing DHCP server (UniFi, etc.), never assigns an IP
itself, and only supplements the PXE options. No dedicated bridge,
no VLAN, no cabling changes: dnsmasq binds to the LAN interface
and layers option 66/67 + the PXE BINL on top of the real DHCP
exchange. The MAC allowlist still gates replies, so random LAN
clients booting from network get nothing.
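
The commit does not show how the MAC allowlist is enforced, so the following is only a sketch of the gating idea in Go: normalize each allowed MAC once, then answer a client only if its normalized MAC is on the list. The `Allowlist` type and both function names are hypothetical and do not come from the repository.

```go
package main

import (
	"fmt"
	"net"
)

// Allowlist is a hypothetical set of permitted PXE client MACs, keyed by
// the canonical lowercase, colon-separated form from net.HardwareAddr.String.
type Allowlist map[string]struct{}

// NewAllowlist parses every entry so that "AA:BB:..." and "aa-bb-..."
// spellings of the same address collapse to one key.
func NewAllowlist(macs []string) (Allowlist, error) {
	al := make(Allowlist, len(macs))
	for _, m := range macs {
		hw, err := net.ParseMAC(m)
		if err != nil {
			return nil, fmt.Errorf("invalid MAC %q: %w", m, err)
		}
		al[hw.String()] = struct{}{}
	}
	return al, nil
}

// Allowed reports whether mac is on the list. Unparseable input is rejected.
func (al Allowlist) Allowed(mac string) bool {
	hw, err := net.ParseMAC(mac)
	if err != nil {
		return false
	}
	_, ok := al[hw.String()]
	return ok
}

func main() {
	al, err := NewAllowlist([]string{"AA:BB:CC:DD:EE:01", "aa-bb-cc-dd-ee-02"})
	if err != nil {
		panic(err)
	}
	fmt.Println(al.Allowed("aa:bb:cc:dd:ee:01")) // true
	fmt.Println(al.Allowed("aa:bb:cc:dd:ee:ff")) // false
}
```

Normalizing through net.ParseMAC avoids false negatives from case or separator differences between the config and what the client sends.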

Template switches dhcp-range=<start,end,lease> to
dhcp-range=<cidr>,proxy and replaces dhcp-boot= for first-boot ROM
clients with pxe-service= directives (the correct proxy-mode
chainload form). Validation drops the dhcp_range regex for a
net.ParseCIDR check on pxe.subnet. Config, production/example yaml,
and pxe-setup.sh swap --dhcp-range for --subnet.
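
A minimal sketch of what a net.ParseCIDR-based check on pxe.subnet could look like. The function name `validateSubnet` and the error wording are assumptions for illustration; the repository's actual validation code is not shown here.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// validateSubnet is a hypothetical helper: it accepts a pxe.subnet value
// only if net.ParseCIDR can parse it, and returns the parsed network.
func validateSubnet(subnet string) (*net.IPNet, error) {
	if subnet == "" {
		return nil, errors.New("pxe.subnet is empty")
	}
	// ParseCIDR returns the host IP and the network; only the network
	// matters here, so a host address such as 192.168.1.135/24 is
	// reduced to 192.168.1.0/24.
	_, ipnet, err := net.ParseCIDR(subnet)
	if err != nil {
		return nil, fmt.Errorf("pxe.subnet: %w", err)
	}
	return ipnet, nil
}

func main() {
	inputs := []string{
		"192.168.1.0/24",
		"192.168.1.135/24",
		"192.168.1.100,192.168.1.200,12h", // old start,end,lease form
	}
	for _, s := range inputs {
		n, err := validateSubnet(s)
		if err != nil {
			fmt.Printf("%q rejected: %v\n", s, err)
			continue
		}
		fmt.Printf("%q accepted as %s\n", s, n)
	}
}
```

An old-style start,end,lease value fails the parse, so a stale config left over from the full-DHCP setup would be caught at validation time in this sketch.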

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 12:02:49 -04:00


server:
  # Loopback-only by default; change to "0.0.0.0:8080" (or similar) once
  # you've wired up TLS or fronted the service with a reverse proxy.
  bind: "127.0.0.1:8080"
  # Base URL the orchestrator is reachable at from the operator's
  # browser. Used as the click-through link in notifications.
  public_url: "http://127.0.0.1:8080"
  tls:
    enabled: false
    cert_file: ""
    key_file: ""

database:
  path: "/var/lib/vetting/vetting.db"

artifacts:
  dir: "/var/lib/vetting/artifacts"
  # Days to keep per-run artifact files (report.html, report.json, fio,
  # iperf, inventory.json, hold keys). DB rows are preserved. 0 = forever.
  retention_days: 30

logs:
  dir: "/var/log/vetting"
  # Days to keep per-run log files. 0 = forever.
  retention_days: 30

janitor:
  # Interval between cleanup sweeps. 0 defaults to 60.
  interval_minutes: 60

dispatcher:
  max_concurrent_runs: 3

pxe:
  enabled: false
  interface: ""  # LAN NIC, e.g. "eth0"
  subnet: ""  # LAN CIDR, e.g. "192.168.1.0/24"; dnsmasq runs in proxy-DHCP mode scoped to this subnet, coexisting with the LAN's existing DHCP server
  orchestrator_url: ""  # e.g. "http://192.168.1.135:8080"
  tftp_root: "/var/lib/vetting/tftp"  # holds ipxe.efi + undionly.kpxe
  live_dir: "/var/lib/vetting/live"  # holds vmlinuz + initrd.img; served at /live/*

agent:
  # Directory holding vetting-agent-linux-amd64, served at
  # /assets/vetting-agent-linux-amd64. install.sh drops the binary here.
  asset_dir: "/var/lib/vetting/assets"

# Notifications fire on StageFailed, SpecMismatch, HoldingOpened,
# RunCompleted. Declare one or more notifiers and route each event
# kind (and optionally severity) to a notifier by name. Delivery is
# fire-and-forget (one attempt per event, logged on failure).
#
# Example (uncomment and fill in):
#
# notifiers:
#   - name: ops-ntfy
#     type: ntfy
#     server: https://ntfy.sh
#     topic: vetting-YOUR-TOPIC
#   - name: ops-discord
#     type: discord
#     webhook_url: https://discord.com/api/webhooks/XXX/YYY
#   - name: ops-email
#     type: smtp
#     smtp:
#       host: mail.lan
#       port: 25
#       from: vetting@lan.local
#       to: [ops@lan.local]
#
# routes:
#   - match_severity: [critical]
#     notifier: ops-ntfy
#   - match_kind: [RunCompleted]
#     notifier: ops-ntfy
notifiers: []
routes: []
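
The routing comments above describe matching events to notifiers by kind and, optionally, severity. The Go sketch below shows one way such matching could work. Everything in it is an assumption for illustration: the `Event` and `Route` types, the rule that an empty match list acts as a wildcard, the choice to notify every matching route once, and the severity value "info". Only the kind names, "critical", and the notifier name come from the example config.

```go
package main

import "fmt"

// Event and Route are hypothetical types mirroring the YAML keys
// match_kind, match_severity, and notifier.
type Event struct {
	Kind     string
	Severity string
}

type Route struct {
	MatchKind     []string
	MatchSeverity []string
	Notifier      string
}

func contains(list []string, v string) bool {
	for _, item := range list {
		if item == v {
			return true
		}
	}
	return false
}

// matches treats an empty match list as "any" (an assumption) and
// requires every non-empty list to contain the event's value.
func (r Route) matches(e Event) bool {
	if len(r.MatchKind) > 0 && !contains(r.MatchKind, e.Kind) {
		return false
	}
	if len(r.MatchSeverity) > 0 && !contains(r.MatchSeverity, e.Severity) {
		return false
	}
	return true
}

// notifiersFor returns the distinct notifier names whose routes match e,
// in route order, so one event reaches each notifier at most once.
func notifiersFor(routes []Route, e Event) []string {
	seen := make(map[string]bool)
	var out []string
	for _, r := range routes {
		if r.matches(e) && !seen[r.Notifier] {
			seen[r.Notifier] = true
			out = append(out, r.Notifier)
		}
	}
	return out
}

func main() {
	// The two routes from the commented example in the config.
	routes := []Route{
		{MatchSeverity: []string{"critical"}, Notifier: "ops-ntfy"},
		{MatchKind: []string{"RunCompleted"}, Notifier: "ops-ntfy"},
	}
	fmt.Println(notifiersFor(routes, Event{Kind: "StageFailed", Severity: "critical"}))  // [ops-ntfy]
	fmt.Println(notifiersFor(routes, Event{Kind: "RunCompleted", Severity: "critical"})) // [ops-ntfy]
	fmt.Println(notifiersFor(routes, Event{Kind: "HoldingOpened", Severity: "info"}))    // []
}
```

Deduplicating by notifier name keeps an event that matches both example routes from being delivered twice to ops-ntfy; whether the real dispatcher does this is not stated in the config.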