feat(end-of-run): reboot to local disk instead of powering off

Completed runs now reboot the host and fall through iPXE to the next
boot device (local disk) instead of powering off. Three coordinated
changes:

- pxe/ipxe: NoActiveRunScript exits iPXE (drops to next boot entry)
  instead of `sleep 10; poweroff`. Without this, a Completed reboot
  just loops through PXE and gets told to power off.
- api/agent_handlers: heartbeat returns cmd=reboot (was cmd=shutdown)
  when the run reaches Completed.
- agent/runner: runs `systemctl reboot` (with `shutdown -r now`
  fallback) in response to cmd=reboot.

Operator cancel still powers off — powerOffAndReturn is unchanged
because a cancel means the operator wants the host idle so they can
walk up to it, not back in rotation.
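
The split between Completed and cancel could look something like the sketch below. Only the `reboot` value and the Completed state come from this commit. The `runState` type, the other state names, and the assumption that cancel still maps to `shutdown` are illustrative guesses, not the actual code in api/agent_handlers.

```go
package main

// runState is a hypothetical stand-in for however the API models run status.
type runState string

const (
	stateRunning   runState = "Running"   // assumed name
	stateCompleted runState = "Completed" // named in the commit message
	stateCancelled runState = "Cancelled" // assumed name
)

// heartbeatCommand returns the command sent back to the agent on heartbeat.
// Completed runs reboot so the host falls through iPXE to local disk.
// Cancelled runs are assumed to keep the old power-off behaviour.
// An empty string means no command for the agent.
func heartbeatCommand(s runState) string {
	switch s {
	case stateCompleted:
		return "reboot"
	case stateCancelled:
		return "shutdown" // assumption: cancel path is unchanged
	default:
		return ""
	}
}
```

Keeping the mapping in one small function makes the asymmetry explicit: a finished host goes back into rotation, a cancelled one stays off until someone walks up to it.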

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-19 22:45:11 -04:00
parent 8acef92a60
commit 3656af9823
4 changed files with 23 additions and 17 deletions
+5 -3
@@ -82,10 +82,12 @@ func NotRegisteredScript(mac string) string {
 }
 
 // NoActiveRunScript is served when a registered MAC PXE-boots but has
-// no currently active run. The host is told to shut down rather than
-// loop forever.
+// no currently active run. `exit` drops back to the firmware so the
+// next configured boot entry (local disk) fires — this is what makes a
+// post-Completed reboot come back up on the installed OS instead of
+// looping through PXE and powering off.
 func NoActiveRunScript(mac string) string {
-	return fmt.Sprintf("#!ipxe\necho MAC %s has no active run — powering off in 10s.\nsleep 10\npoweroff\n", mac)
+	return fmt.Sprintf("#!ipxe\necho MAC %s has no active run — exiting to next boot device.\nsleep 2\nexit\n", mac)
 }
 
 // Used by handlers to compose URLs; exposed for tests.