runs: add non-destructive flag + operator Cancel button

Non-destructive pre-declares "don't touch the disks" on Start: the
Storage stage skips wipe-probe, badblocks -w, and write-mode fio,
and reports a read-only summary. Runs gain a new non_destructive
column, threaded through Claim → agent tests.Deps → Storage stage.
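
A minimal sketch of how the Storage stage might honor the flag. The
storageStep type, planStorage function, and the "smart-read" step are
hypothetical names made up for illustration; only the three skipped
steps and the read-only summary come from the description above.

```go
package main

import "fmt"

// storageStep is a hypothetical description of one Storage-stage step.
type storageStep struct {
	name        string
	destructive bool // true if the step writes to the disks
}

// planStorage returns the names of the steps that would run.
// With nonDestructive set, every step that writes to disk is dropped
// and a read-only summary is appended instead.
func planStorage(nonDestructive bool) []string {
	steps := []storageStep{
		{"smart-read", false}, // hypothetical read-only step
		{"wipe-probe", true},
		{"badblocks -w", true},
		{"fio write", true},
	}
	var plan []string
	for _, s := range steps {
		if nonDestructive && s.destructive {
			continue // the operator asked us not to touch the disks
		}
		plan = append(plan, s.name)
	}
	if nonDestructive {
		plan = append(plan, "read-only summary")
	}
	return plan
}

func main() {
	fmt.Println(planStorage(true))
	fmt.Println(planStorage(false))
}
```

Marking each step as destructive or not keeps the skip decision in one
place, so a new write-mode test cannot slip past the flag by accident.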

Cancel halts an in-flight run. The orchestrator transitions to a
new StateCancelled via TriggerOperatorCancelled (valid from any
active state); the agent's next heartbeat returns cmd=cancel_stage,
which fires a stored CancelFunc on the per-stage context. Stage
subprocesses spawned with exec.CommandContext die with the context,
the agent posts a cancelled outcome, then powers the host off.

Destructive stages cancelled mid-run may leave the host in an
intermediate state; the UI confirm dialog warns the operator, and
recovery is manual for now.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 13:01:42 -04:00
parent 2c440fce8a
commit 4524ab8dc0
22 changed files with 434 additions and 230 deletions
+7 -5
@@ -18,8 +18,9 @@ const (
TriggerStageFailed Trigger = "StageFailed" // a stage reported failure
TriggerStageCompleted Trigger = "StageCompleted" // a stage reported success → advance
TriggerAllStagesPassed Trigger = "AllStagesPassed" // final stage passed
-TriggerOperatorReleased Trigger = "OperatorReleased" // user clicked Release on a held run
-TriggerOperatorOverride Trigger = "OperatorOverride" // user overrode a held stage; re-enter it
+TriggerOperatorReleased Trigger = "OperatorReleased" // user clicked Release on a held run
+TriggerOperatorOverride Trigger = "OperatorOverride" // user overrode a held stage; re-enter it
+TriggerOperatorCancelled Trigger = "OperatorCancelled" // user clicked Cancel on an active run
)
// stageStates maps the canonical stage name (from DefaultStageOrder)
@@ -63,9 +64,10 @@ var table = map[Trigger]transition{
TriggerRebootCommanded: {from: []model.RunState{model.StateQueued}, to: model.StateWaitingReboot},
TriggerPXEObserved: {from: []model.RunState{model.StateWaitingReboot, model.StateWaitingWoL, model.StateBooting}, to: model.StateBooting},
TriggerAgentClaimed: {from: []model.RunState{model.StateBooting, model.StateWaitingReboot, model.StateWaitingWoL}, to: model.StateInventoryCheck},
-TriggerStageFailed: {from: allActiveStates(), to: model.StateFailedHolding},
-TriggerAllStagesPassed: {from: []model.RunState{model.StateReporting}, to: model.StateCompleted},
-TriggerOperatorReleased: {from: []model.RunState{model.StateFailedHolding}, to: model.StateReleased},
+TriggerStageFailed: {from: allActiveStates(), to: model.StateFailedHolding},
+TriggerAllStagesPassed: {from: []model.RunState{model.StateReporting}, to: model.StateCompleted},
+TriggerOperatorReleased: {from: []model.RunState{model.StateFailedHolding}, to: model.StateReleased},
+TriggerOperatorCancelled: {from: allActiveStates(), to: model.StateCancelled},
}
// Next computes the target state for a trigger against the current state.