feat(end-of-run): reboot to local disk instead of powering off

Completed runs now reboot the host and fall through iPXE to the next
boot device (local disk) instead of powering off. Three coordinated
changes:

- pxe/ipxe: NoActiveRunScript exits iPXE (drops to next boot entry)
  instead of `sleep 10; poweroff`. Without this, a host rebooting
  after a Completed run would re-enter PXE and be told to power off.
- api/agent_handlers: heartbeat returns cmd=reboot (was cmd=shutdown)
  when the run reaches Completed.
- agent/runner: runs `systemctl reboot` (with `shutdown -r now`
  fallback) in response to cmd=reboot.

Operator cancel still powers off: powerOffAndReturn is unchanged
because a cancel means the operator wants the host idle so they can
walk up to it, not back in rotation.
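
The resulting split between the two terminal states can be sketched as a small
mapping. This is an illustration under assumptions, not the handler itself:
the State type, its string values, commandFor, and the "cancel" command name
are hypothetical; only cmd=reboot for Completed comes from this commit.

```go
package main

// State mirrors the run states named in this commit.
// The string values are assumptions for illustration.
type State string

const (
	StateRunning   State = "running"
	StateCompleted State = "completed"
	StateCancelled State = "cancelled"
)

// commandFor returns the command a heartbeat would hand to the agent.
// An empty string means "no command, keep going".
func commandFor(s State) string {
	switch s {
	case StateCompleted:
		// Completed runs reboot so the host falls through iPXE to local disk.
		return "reboot"
	case StateCancelled:
		// Hypothetical command name: the agent cancels the stage and powers off.
		return "cancel"
	default:
		return ""
	}
}
```

Keeping the two cases separate is what lets a cancelled host stay powered off
while a completed host returns to rotation.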

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-19 22:45:11 -04:00
parent 8acef92a60
commit 3656af9823
4 changed files with 23 additions and 17 deletions
+5 -2
@@ -266,8 +266,11 @@ func (a *Agent) Heartbeat(w http.ResponseWriter, r *http.Request) {
 	resp := map[string]any{"state": run.State}
 	switch {
 	case run.State == model.StateCompleted:
-		// Pipeline succeeded — agent should power the host down.
-		cmd = "shutdown"
+		// Pipeline succeeded — agent reboots so the host falls through
+		// iPXE's no-active-run script to the next boot device (local
+		// disk), landing back on the installed OS without operator
+		// intervention.
+		cmd = "reboot"
 	case run.State == model.StateCancelled:
 		// Operator clicked Cancel — agent cancels the active stage ctx,
 		// posts a cancelled outcome, and powers off.