AIHostingTycoon

josh/AIHostingTycoon

Fork 0

Commit Graph

Author	SHA1	Message	Date
josh	57a81be769	Cache serving pipeline fleet to eliminate per-tick rebuilds and reduce GC pressure Fleet template is now rebuilt only when deploymentVersion changes (~68 times per 28,800-tick run instead of every tick). Reuses module-level Maps, arrays, and utilization objects instead of allocating new ones each tick. Replaces 4x Object.values().reduce() with single-pass aggregation and sorts fleet in-place. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-26 19:51:13 -04:00
josh	102e05c8ba	Add game-simulation package with multi-run balance testing, fix stalled-pipeline trap Balance Check / balance-simulation (push) Failing after 11m32s Details Balance Check / multi-run-balance (push) Failing after 23m46s Details CI / build-and-push (push) Successful in 1m20s Details Adds a full simulation harness (game-simulation package) with greedy/random strategies, 36-metric diagnostics, multi-run orchestration via child processes, and a statistical interpreter. Includes 2.3x engine performance optimizations (research bonus caching, per-DC dirty tracking, reduced allocations in tick pipeline, single-pass loops). Fixes a critical balance bug where training pipelines stalled on insufficient VRAM would permanently block training slots — the engine never re-checked stalled pipelines, and the greedy strategy didn't pre-check VRAM requirements. This caused 20-25% of seeds to get stuck in Scale-up era. All three fixes (engine un-stalling, strategy VRAM pre-check, stalled pipeline cancellation) bring pass rate from 75% to 100% across 20 random seeds. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-26 06:11:26 -04:00
josh	901db02a6b	Replace decorative overload policy with real serving pipeline and dedicated Serving page CI / build-and-push (push) Successful in 28s Details The old overload policy had dead controls (maxQueueDepth, rateLimitPerCustomer never read) and trivial flat penalties. This replaces it with a full serving pipeline where deployed models form a fleet, requests route through priority/degradation logic, and policy choices create meaningful strategic tradeoffs. New serving pipeline: fleet building from deployed models (size/quant/MoE multipliers), demand categorization by 5 priority tiers, enterprise capacity reservation, priority-ordered serving with overflow behaviors (queue/reject/degrade), auto-degradation to faster models under load, and Batch API to fill idle capacity at discounted rates. 4 new research nodes gate features progressively: Intelligent Request Routing, Priority Queue System, Request Batching, and Auto-Scaling. New dedicated Serving page with pipeline metrics, model fleet utilization, and research-gated policy controls. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-25 12:42:09 -04:00

Author

SHA1

Message

Date

josh

57a81be769

Cache serving pipeline fleet to eliminate per-tick rebuilds and reduce GC pressure

Fleet template is now rebuilt only when deploymentVersion changes (~68 times per
28,800-tick run instead of every tick). Reuses module-level Maps, arrays, and
utilization objects instead of allocating new ones each tick. Replaces 4x
Object.values().reduce() with single-pass aggregation and sorts fleet in-place.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-26 19:51:13 -04:00

josh

102e05c8ba

Add game-simulation package with multi-run balance testing, fix stalled-pipeline trap

Balance Check / balance-simulation (push) Failing after 11m32s

Details

Balance Check / multi-run-balance (push) Failing after 23m46s

Details

CI / build-and-push (push) Successful in 1m20s

Details

Adds a full simulation harness (game-simulation package) with greedy/random strategies,
36-metric diagnostics, multi-run orchestration via child processes, and a statistical
interpreter. Includes 2.3x engine performance optimizations (research bonus caching,
per-DC dirty tracking, reduced allocations in tick pipeline, single-pass loops).

Fixes a critical balance bug where training pipelines stalled on insufficient VRAM would
permanently block training slots — the engine never re-checked stalled pipelines, and the
greedy strategy didn't pre-check VRAM requirements. This caused 20-25% of seeds to get
stuck in Scale-up era. All three fixes (engine un-stalling, strategy VRAM pre-check,
stalled pipeline cancellation) bring pass rate from 75% to 100% across 20 random seeds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-26 06:11:26 -04:00

josh

901db02a6b

Replace decorative overload policy with real serving pipeline and dedicated Serving page

CI / build-and-push (push) Successful in 28s

Details

The old overload policy had dead controls (maxQueueDepth, rateLimitPerCustomer never read)
and trivial flat penalties. This replaces it with a full serving pipeline where deployed
models form a fleet, requests route through priority/degradation logic, and policy choices
create meaningful strategic tradeoffs.

New serving pipeline: fleet building from deployed models (size/quant/MoE multipliers),
demand categorization by 5 priority tiers, enterprise capacity reservation, priority-ordered
serving with overflow behaviors (queue/reject/degrade), auto-degradation to faster models
under load, and Batch API to fill idle capacity at discounted rates.

4 new research nodes gate features progressively: Intelligent Request Routing, Priority
Queue System, Request Batching, and Auto-Scaling. New dedicated Serving page with pipeline
metrics, model fleet utilization, and research-gated policy controls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-25 12:42:09 -04:00

3 Commits