Fix Fill Capacity exceeding DC slot limit due to double-counted failed racks
CI / build-and-push (push) Successful in 38s
computeRacksFailed was incremented on production failure and never decremented when repaired racks came back online, while repair cohorts also tracked the same racks. This caused usedSlots to inflate past the DC capacity over time.

Fix: derive computeRacksFailed from repair cohorts each tick instead of maintaining it as a running counter. Include repair cohorts in pipeline slot accounting so all racks are counted exactly once.

Also fixes power limit in fillDCToCapacity to only count online racks (pipeline racks don't draw power).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
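The accounting described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this commit: the helper names `sumCohorts`, `deriveRacksFailed` and `usedSlots`, the trimmed-down `DataCenter` shape, and the `'install'` stage name are all hypothetical. Only `computeRacksOnline`, `deploymentCohorts`, `stage`, `count` and the `'repair'` stage appear in the diff.

```typescript
// Hypothetical minimal shapes: only the field names come from the diff.
type Cohort = { stage: string; count: number };
type DataCenter = {
  computeRacksOnline: number;
  deploymentCohorts: Cohort[];
};

// Total racks in the cohorts that satisfy `pred`.
function sumCohorts(dc: DataCenter, pred: (c: Cohort) => boolean): number {
  return dc.deploymentCohorts.filter(pred).reduce((sum, c) => sum + c.count, 0);
}

// Failed racks are derived from repair cohorts on demand,
// rather than kept as a separately maintained running counter.
function deriveRacksFailed(dc: DataCenter): number {
  return sumCohorts(dc, (c) => c.stage === 'repair');
}

// Assumed slot accounting: online racks plus every pipeline cohort,
// repair included, so each rack is counted exactly once.
function usedSlots(dc: DataCenter): number {
  return dc.computeRacksOnline + sumCohorts(dc, () => true);
}

const example: DataCenter = {
  computeRacksOnline: 90,
  deploymentCohorts: [
    { stage: 'install', count: 5 }, // 'install' is a made-up stage name
    { stage: 'repair', count: 5 },
  ],
};

console.log(deriveRacksFailed(example)); // 5
console.log(usedSlots(example));         // 100
```

Because the failed count is recomputed from the cohorts, it cannot drift away from them, and no separate failed-rack term is added to the slot total.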
@@ -71,11 +71,15 @@ function resetRackFailures() {
       ...cluster,
       campuses: cluster.campuses.map((campus) => ({
         ...campus,
-        dataCenters: campus.dataCenters.map((dc) => ({
-          ...dc,
-          computeRacksOnline: dc.computeRacksOnline + dc.computeRacksFailed,
-          computeRacksFailed: 0,
-        })),
+        dataCenters: campus.dataCenters.map((dc) => {
+          const repairCount = dc.deploymentCohorts.filter(c => c.stage === 'repair').reduce((sum, c) => sum + c.count, 0);
+          return {
+            ...dc,
+            computeRacksOnline: dc.computeRacksOnline + repairCount,
+            computeRacksFailed: 0,
+            deploymentCohorts: dc.deploymentCohorts.filter(c => c.stage !== 'repair'),
+          };
+        }),
       })),
     }));
     return { infrastructure: { ...s.infrastructure, clusters: newClusters } };
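To see what the new per-data-center logic in resetRackFailures does, here is a standalone sketch of that callback applied to one sample record. The wrapper `resetDataCenter`, the helper `totalRacks`, the type definitions, the `'install'` stage name and the sample numbers are assumptions made for illustration. The body of `resetDataCenter` mirrors the added lines of the hunk.

```typescript
// Hypothetical shapes: field names are taken from the diff, the types are assumed.
type Cohort = { stage: string; count: number };
type DataCenter = {
  computeRacksOnline: number;
  computeRacksFailed: number;
  deploymentCohorts: Cohort[];
};

// Hypothetical wrapper around the per-DC body of the new map callback.
function resetDataCenter(dc: DataCenter): DataCenter {
  const repairCount = dc.deploymentCohorts
    .filter((c) => c.stage === 'repair')
    .reduce((sum, c) => sum + c.count, 0);
  return {
    ...dc,
    // Racks under repair go straight back online...
    computeRacksOnline: dc.computeRacksOnline + repairCount,
    computeRacksFailed: 0,
    // ...and their cohorts are dropped so they cannot return a second time.
    deploymentCohorts: dc.deploymentCohorts.filter((c) => c.stage !== 'repair'),
  };
}

// Online racks plus all racks still held in cohorts.
function totalRacks(dc: DataCenter): number {
  return dc.computeRacksOnline + dc.deploymentCohorts.reduce((sum, c) => sum + c.count, 0);
}

const before: DataCenter = {
  computeRacksOnline: 90,
  computeRacksFailed: 12, // stale counter, larger than the 5 racks actually in repair
  deploymentCohorts: [
    { stage: 'install', count: 5 }, // 'install' is a made-up stage name
    { stage: 'repair', count: 5 },
  ],
};

const after = resetDataCenter(before);
console.log(after.computeRacksOnline);                 // 95
console.log(totalRacks(before) === totalRacks(after)); // true
```

With the removed lines, the same sample would have gained 12 online racks from the stale counter while the repair cohort stayed in place, which is the kind of double counting the commit message describes.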