Commit Graph

70 Commits

Author SHA1 Message Date
josh be93e57853 Fix empty VITE_API_URL falling back to localhost in production
Use ?? instead of || so empty string (same-origin) is preserved
while undefined still falls back to localhost:3001 for dev.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 20:50:34 -04:00
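The difference between the two operators in be93e57853 can be sketched as follows. The helper names are hypothetical, and the fallback URL is assumed from the "localhost:3001" mentioned in the commit text.

```typescript
// Hypothetical constant: dev fallback assumed from the commit message.
const DEV_FALLBACK = "http://localhost:3001";

// Old behaviour: `||` treats "" as falsy, so a same-origin base is replaced.
function apiBaseWithOr(envValue: string | undefined): string {
  return envValue || DEV_FALLBACK;
}

// New behaviour: `??` only falls back when the value is null or undefined.
function apiBaseWithNullish(envValue: string | undefined): string {
  return envValue ?? DEV_FALLBACK;
}
```

With an empty string as input, only the `??` variant keeps the same-origin base.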
josh fbedcec4f2 Fix double /api prefix causing invite gate bypass and broken API calls
VITE_API_URL was /api but all API paths already included /api/,
resulting in /api/api/config etc. Config call failed silently and
defaulted to requireInvite:false. Set VITE_API_URL to empty string
so paths like /api/auth/anonymous go through nginx as-is.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 20:45:13 -04:00
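A minimal sketch of how the doubled prefix in fbedcec4f2 arises, assuming the frontend builds URLs by plain concatenation. `apiUrl` is a hypothetical helper, not the project's actual function.

```typescript
// Hypothetical helper: joins the configured base with a path that
// already starts with /api/.
function apiUrl(base: string, path: string): string {
  return `${base}${path}`;
}

// Before the fix (base "/api"): apiUrl("/api", "/api/config") yields "/api/api/config".
// After the fix (base ""): apiUrl("", "/api/config") yields "/api/config".
```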
josh 65afa886af Fix health check 404: add /api/health route, remove nginx /health proxy
Frontend API_BASE is /api in production, so health check was hitting
/api/health which didn't exist. Added /api/health on the server and
removed the now-unnecessary separate nginx /health proxy rule.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 20:22:00 -04:00
josh 7348b35475 Improve API error messages: show HTTP status, catch network errors
"Unknown error" was hiding the actual HTTP status (likely 502 from
nginx). Now shows "HTTP 502 Bad Gateway" etc. Network TypeErrors
(connection refused) also get a clear message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 20:03:34 -04:00
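The message formatting described in 7348b35475 might look roughly like this. Both function names and the exact wording of the network message are assumptions. The sketch relies only on the fact that `fetch` rejects with a `TypeError` when the request cannot be made.

```typescript
// Hypothetical formatter for non-2xx responses.
function describeHttpFailure(status: number, statusText: string): string {
  const text = statusText.trim();
  return text ? `HTTP ${status} ${text}` : `HTTP ${status}`;
}

// Hypothetical formatter for thrown errors: a TypeError from fetch
// indicates a network-level failure such as connection refused.
function describeThrown(err: unknown): string {
  if (err instanceof TypeError) {
    return "Network error: could not reach the server";
  }
  return err instanceof Error ? err.message : "Unknown error";
}
```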
josh 6cf5bf76b3 Add nginx reverse proxy for /api and /health to backend server
The frontend builds with VITE_API_URL=/api so all API calls target
the same origin. Without a proxy rule, nginx was serving index.html
for API paths, causing JSON parse errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 19:59:32 -04:00
josh 2ab097ec8a Add backend health check, fetch timeouts, stale token cleanup, and error screen
Frontend now checks /health before starting auth flow. Shows a clear
"Cannot Connect to Server" screen with retry button when backend is
unreachable. Stale non-JWT tokens in localStorage are detected and
cleared automatically. All API calls have a 10s timeout via AbortController.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 19:55:50 -04:00
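One way to detect the stale non-JWT tokens mentioned in 2ab097ec8a is a shape check, sketched below. `looksLikeJwt` is a hypothetical name, and the check assumes signed tokens in compact form (three non-empty base64url segments). It does not verify the signature.

```typescript
// Hypothetical shape check: three dot-separated, non-empty base64url segments.
function looksLikeJwt(token: string): boolean {
  const parts = token.split(".");
  return parts.length === 3 && parts.every((p) => /^[A-Za-z0-9_-]+$/.test(p));
}
```

A token that fails this check could be removed from localStorage before the auth flow starts.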
josh 066c3310ff Add auto-migration on server startup
Run Drizzle migrations before seeding admin user so tables exist
on fresh database. Migration files generated from current schema.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 19:41:10 -04:00
josh a061337d6f Fix Docker production build: move tsx to dependencies, reinstall in production stage
The production Docker stage was copying pnpm symlinks between stages
which broke module resolution. Now does a fresh pnpm install --prod
in the production stage and runs from the server working directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 19:33:16 -04:00
josh 4881907c28 Add auth system with invite-only registration and admin roles
JWT-based auth (hono/jwt + bcrypt), anonymous-first flow preserved.
Registration requires invite code when REQUIRE_INVITE=true. Admin
user seeded on startup (admin/admin, forced password reset). Login
accepts email or username. Admin invitations management page in
sidebar. Regular users get invite-a-friend button when USER_INVITATIONS > 0.
Frontend gate screen blocks game access for unregistered users with
invite code entry, registration, login, and password reset flows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 19:25:16 -04:00
josh df01ac8e35 Move revenue after churn and raise price churn cap to prevent exploit
Balance Check / balance-simulation (push) Successful in 2m1s
CI / build-and-push (push) Successful in 2m6s
Balance Check / multi-run-balance (push) Successful in 13m10s
Churned subscribers no longer generate revenue the tick they leave,
and the price churn multiplier cap is raised from 10 to 1000 so
astronomical prices empty the subscriber pool in a single tick.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 22:17:30 -04:00
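The ordering and cap change in df01ac8e35 can be illustrated with a simplified tick. The formula, the parameter names, and the base churn rate used in the note below are assumptions. Only the "churn first, then revenue" order and the 10 vs 1000 caps come from the commit text.

```typescript
interface TierTickResult {
  subscribers: number;
  revenue: number;
}

// Hypothetical per-tier tick: churn is applied first, so departed
// subscribers generate no revenue on the tick they leave.
function tickTier(
  subscribers: number,
  price: number,
  baseChurnRate: number,
  priceChurnMultiplier: number,
  multiplierCap: number,
): TierTickResult {
  const multiplier = Math.min(priceChurnMultiplier, multiplierCap);
  const churned = Math.min(
    subscribers,
    Math.round(subscribers * baseChurnRate * multiplier),
  );
  const remaining = subscribers - churned;
  return { subscribers: remaining, revenue: remaining * price };
}
```

Under an assumed 1% base churn, a cap of 10 limits loss to 10% per tick, while a cap of 1000 lets an extreme price multiplier remove every subscriber at once.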
josh 63e56dc229 Fix consumer subscription pricing exploit with perceived-value-based elasticity
Balance Check / balance-simulation (push) Successful in 51s
Balance Check / multi-run-balance (push) Successful in 13m19s
CI / build-and-push (push) Successful in 45s
Players could set astronomical prices and still retain subscribers because
price elasticity floored at 10% for any price above $100, satisfaction
ignored pricing entirely, and churn had no price component.

Introduces perceived value per tier (model quality × reputation), replaces
the broken linear formula with sigmoid decay, adds price-aware satisfaction
blending, and applies per-tier price-based churn multipliers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 21:51:03 -04:00
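A sigmoid decay of the kind named in 63e56dc229 could take the following form. This is an illustrative curve only: the function name, the `steepness` parameter, and the choice to centre the curve at price equal to perceived value are assumptions, not the commit's actual formula.

```typescript
// Hypothetical elasticity curve: the willing fraction falls smoothly
// toward 0 as price exceeds perceived value, with no fixed floor.
function priceRetention(
  price: number,
  perceivedValue: number,
  steepness: number = 4,
): number {
  const ratio = price / perceivedValue;
  return 1 / (1 + Math.exp(steepness * (ratio - 1)));
}
```

Unlike a linear formula floored at 10%, this curve approaches zero for astronomical prices.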
josh 5aa9436368 Expand multirun reporting: health summary, era durations, serving diagnostics, cash-flow detail
Balance Check / balance-simulation (push) Successful in 46s
Balance Check / multi-run-balance (push) Successful in 14m6s
CI / build-and-push (push) Successful in 46s
Propagate per-era duration/bottleneck, serving utilization, cash-flow nadir/peak,
and late-game revenue growth through the worker→CSV→interpret pipeline. Add
simulation health archetype classification, per-era bottleneck frequency,
unused-feature frequency table, failed-run AGI gate analysis, and log-scale
variance for exponential metrics. All new CSV columns parse defensively for
backward compatibility with older summary files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 20:55:49 -04:00
josh 62998d6cb2 Remove duplicate per-run completion line from worker stderr
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 20:21:38 -04:00
josh a240ba2e44 Scale CI multi-simulation to 100 runs, remove per-run progress ticker
Balance Check / balance-simulation (push) Successful in 42s
Balance Check / multi-run-balance (push) Successful in 13m37s
CI / build-and-push (push) Successful in 43s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 20:11:49 -04:00
josh 19f652b43a Replace per-switch network simulation with aggregate per-DC statistical model
Balance Check / balance-simulation (push) Successful in 48s
Balance Check / multi-run-balance (push) Successful in 1m24s
CI / build-and-push (push) Successful in 43s
Eliminates the 22K-object switchRegistry that caused O(n×m) scans 4x per tick.
Network health is now tracked as aggregate counts per tier (totalByTier/healthyByTier)
with RepairBatch timers, cutting late-game tick cost from ~50ms to ~0.3ms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 20:06:40 -04:00
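The aggregate model in 19f652b43a can be sketched per tier as below. `RepairBatch` is named in the commit. The field names, the per-tier object (standing in for totalByTier/healthyByTier), and both functions are assumptions. Failure sampling is left to the caller.

```typescript
interface RepairBatch {
  count: number;          // switches being repaired together
  ticksRemaining: number; // ticks until the batch returns to service
}

interface TierHealth {
  total: number;
  healthy: number;
  repairs: RepairBatch[];
}

// Move `failures` switches from healthy into a new repair batch.
function failSwitches(tier: TierHealth, failures: number, repairTicks: number): void {
  const count = Math.min(failures, tier.healthy);
  if (count === 0) return;
  tier.healthy -= count;
  tier.repairs.push({ count, ticksRemaining: repairTicks });
}

// Advance all repair timers by one tick and restore finished batches.
function tickRepairs(tier: TierHealth): void {
  const pending: RepairBatch[] = [];
  for (const batch of tier.repairs) {
    batch.ticksRemaining -= 1;
    if (batch.ticksRemaining <= 0) tier.healthy += batch.count;
    else pending.push(batch);
  }
  tier.repairs = pending;
}
```

Per-tick cost here scales with the number of outstanding batches rather than the number of switches.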
josh 57a81be769 Cache serving pipeline fleet to eliminate per-tick rebuilds and reduce GC pressure
Fleet template is now rebuilt only when deploymentVersion changes (~68 times per
28,800-tick run instead of every tick). Reuses module-level Maps, arrays, and
utilization objects instead of allocating new ones each tick. Replaces 4x
Object.values().reduce() with single-pass aggregation and sorts fleet in-place.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 19:51:13 -04:00
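The version-keyed caching in 57a81be769 follows a common pattern, sketched here. `deploymentVersion` comes from the commit. The `Fleet` shape, `getFleet`, and the rebuild counter are hypothetical.

```typescript
interface Fleet {
  version: number;
  modelIds: string[];
}

// Module-level cache, reused across ticks.
let cachedFleet: Fleet | null = null;
let fleetRebuilds = 0; // illustrative counter only

// Rebuild the fleet only when the deployment version has changed.
function getFleet(deploymentVersion: number, deployedModelIds: string[]): Fleet {
  let fleet = cachedFleet;
  if (fleet === null || fleet.version !== deploymentVersion) {
    fleet = { version: deploymentVersion, modelIds: deployedModelIds.slice().sort() };
    cachedFleet = fleet;
    fleetRebuilds += 1;
  }
  return fleet;
}
```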
josh bbb69a315c Remove benchmark evaluation system, use training capabilities directly
Model quality for market segments and product lines now derives from deployed
model capabilities (coding, reasoning, agents, etc.) instead of requiring a
separate manual benchmark evaluation step. This eliminates an unbounded
benchmarkResults[] array that was scanned 5x per tick and removes ~480 lines
of dead-weight UI, types, and engine code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 19:28:59 -04:00
josh db034687d6 Add real-time progress feedback to multi-run simulations
Balance Check / balance-simulation (push) Successful in 11m24s
Balance Check / multi-run-balance (push) Successful in 26m35s
CI / build-and-push (push) Successful in 34s
Switch from exec() to spawn() for streaming stderr, add onProgress
callback to runner, and emit per-run progress lines from workers.
CI now shows live percentage, tick count, and era during long runs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 17:42:45 -04:00
josh 04d8a4e883 Update .gitea/workflows/balance-check.yml
CI / build-and-push (push) Successful in 17s
2026-04-26 17:08:45 -04:00
josh 416b6bfe8d Add research money costs, longer research times, era-scaled talent costs, and persona strategy
Balance Check / balance-simulation (push) Successful in 11m19s
Balance Check / multi-run-balance (push) Has been cancelled
CI / build-and-push (push) Successful in 40s
Research now costs money (drained per-tick) with ~2.5-3.5x longer durations by category.
Early-game talent budget costs reduced via era multiplier (startup 0.2x → bigtech 1.0x).
New seed-driven PersonaStrategy with 8 axes of variation for meaningful multi-run testing.
CI multi-run switched from greedy to persona strategy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 16:14:27 -04:00
josh b906592af4 Redesign system interconnections display: group by health, inline diagnoses
Balance Check / balance-simulation (push) Successful in 6m53s
Balance Check / multi-run-balance (push) Successful in 20m40s
CI / build-and-push (push) Successful in 27s
Remove misleading Reputation -> Era Gates connection (score 0 meant
"already sufficient," not broken). Add diagnosis and eventLabel fields
to each connection. Group output: broken links first with [!!] and
plain-language explanation, then healthy links as compact one-liners.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 13:00:41 -04:00
josh d47afd8542 Fix CI interpret step using wrong relative path for summary CSV
CI / build-and-push (push) Successful in 14s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 12:25:10 -04:00
josh 6105c28887 Fix research complete notification using raw ID instead of display name
Balance Check / balance-simulation (push) Successful in 6m37s
Balance Check / multi-run-balance (push) Failing after 20m20s
CI / build-and-push (push) Successful in 33s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 12:22:17 -04:00
josh a8746246f8 Add Vitest test suite with 184 tests covering all game engine systems
Balance Check / balance-simulation (push) Successful in 7m0s
Balance Check / multi-run-balance (push) Failing after 20m5s
CI / build-and-push (push) Successful in 1m18s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 09:41:56 -04:00
josh 1f50f6c86c Fix crash on existing saves missing researchQueue by merging persisted state with defaults
CI / build-and-push (push) Successful in 27s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 08:41:54 -04:00
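Merging persisted state with defaults, as in 1f50f6c86c, can be sketched as a shallow merge. The `SavedState` shape is a hypothetical stand-in for the real save format. Only `researchQueue` is taken from the commit.

```typescript
interface SavedState {
  cash: number;
  researchQueue: string[];
}

// Fresh defaults on every call so saves never share a mutable array.
function defaultState(): SavedState {
  return { cash: 0, researchQueue: [] };
}

// Keys missing from an older save fall back to the defaults.
function hydrate(persisted: Partial<SavedState>): SavedState {
  return { ...defaultState(), ...persisted };
}
```

The merge is shallow, so nested objects added in later versions would need their own defaults.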
josh cc606ae523 Fix balance-check CI: remove node cache, random sim, and unsupported artifact uploads
CI / build-and-push (push) Successful in 12s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 08:19:40 -04:00
josh 5885e33531 Add research queue: queue multiple projects, auto-promote on completion, RP refund on dequeue
Balance Check / multi-run-balance (push) Has been cancelled
Balance Check / balance-simulation (push) Has been cancelled
CI / build-and-push (push) Successful in 28s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 08:16:16 -04:00
josh 626ca51041 Fix community size ballooning to infinity with logistic growth damping
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 08:16:02 -04:00
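Logistic damping of the kind named in 626ca51041 is usually written as a growth term scaled by the remaining headroom. The function below is a sketch under that assumption. The parameter names and the notion of a fixed `capacity` are hypothetical.

```typescript
// Discrete logistic step: growth slows as size approaches capacity.
function growCommunity(size: number, growthRate: number, capacity: number): number {
  const damping = 1 - size / capacity;
  return size + growthRate * size * damping;
}
```

For growth rates below 1 per tick, a size that starts under the capacity converges to it instead of growing without bound.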
josh 102e05c8ba Add game-simulation package with multi-run balance testing, fix stalled-pipeline trap
Balance Check / balance-simulation (push) Failing after 11m32s
Balance Check / multi-run-balance (push) Failing after 23m46s
CI / build-and-push (push) Successful in 1m20s
Adds a full simulation harness (game-simulation package) with greedy/random strategies,
36-metric diagnostics, multi-run orchestration via child processes, and a statistical
interpreter. Includes 2.3x engine performance optimizations (research bonus caching,
per-DC dirty tracking, reduced allocations in tick pipeline, single-pass loops).

Fixes a critical balance bug where training pipelines stalled on insufficient VRAM would
permanently block training slots — the engine never re-checked stalled pipelines, and the
greedy strategy didn't pre-check VRAM requirements. This caused 20-25% of seeds to get
stuck in Scale-up era. All three fixes (engine un-stalling, strategy VRAM pre-check,
stalled pipeline cancellation) bring pass rate from 75% to 100% across 20 random seeds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 06:11:26 -04:00
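The engine-side un-stalling described in 102e05c8ba amounts to re-checking stalled pipelines every tick. A minimal sketch, assuming a hypothetical `TrainingPipeline` shape and a single available-VRAM figure:

```typescript
interface TrainingPipeline {
  id: string;
  requiredVram: number;
  stalled: boolean;
}

// Re-check every stalled pipeline so a slot is not blocked permanently
// once enough VRAM becomes available.
function recheckStalled(pipelines: TrainingPipeline[], availableVram: number): void {
  for (const p of pipelines) {
    if (p.stalled && p.requiredVram <= availableVram) {
      p.stalled = false;
    }
  }
}
```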
josh 283c7c7932 Overhaul dashboard into command center with compute tracking, era-gated sections
CI / build-and-push (push) Successful in 37s
Add compute history time-series (capacity vs demand chart), revenue vs expenses
dual-line chart, enhanced system status (training allocation, network uptime,
model freshness), active operations panel, market position bars, and competitor
snapshot. Stat cards expand from 3 to 6 as player progresses through eras.
Graceful v9→v10 save migration preserves existing games.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 13:45:16 -04:00
josh 901db02a6b Replace decorative overload policy with real serving pipeline and dedicated Serving page
CI / build-and-push (push) Successful in 28s
The old overload policy had dead controls (maxQueueDepth, rateLimitPerCustomer never read)
and trivial flat penalties. This replaces it with a full serving pipeline where deployed
models form a fleet, requests route through priority/degradation logic, and policy choices
create meaningful strategic tradeoffs.

New serving pipeline: fleet building from deployed models (size/quant/MoE multipliers),
demand categorization by 5 priority tiers, enterprise capacity reservation, priority-ordered
serving with overflow behaviors (queue/reject/degrade), auto-degradation to faster models
under load, and Batch API to fill idle capacity at discounted rates.

4 new research nodes gate features progressively: Intelligent Request Routing, Priority
Queue System, Request Batching, and Auto-Scaling. New dedicated Serving page with pipeline
metrics, model fleet utilization, and research-gated policy controls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 12:42:09 -04:00
josh d7d77238b9 Redesign model lifecycle: upfront SFT/alignment, multi-size families, point releases, quantization-only variants
CI / build-and-push (push) Successful in 45s
Training pipeline now requires SFT specializations and alignment method configured at start — no more
mid-training configuration step. Model families support multiple size tiers (Nano/Small/Medium/Large/Flagship)
trained independently, mimicking real AI company model families. Point releases iterate on deployed models
with 40% training time and 8% capability gain. Distillation and fine-tuning variants removed — players
train smaller size tiers or configure SFT during initial training instead. Only quantization remains as
a variant type.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 11:00:38 -04:00
josh 775c6a4fa5 Overhaul Models tab UX: action feedback, post-training flow, guided navigation
CI / build-and-push (push) Successful in 29s
- Lift modelsTab state into Zustand store so actions can navigate tabs
- Add toast notifications + auto-tab-switch to all 10 model actions
  (train, configure SFT/alignment, distill, fine-tune, quantize, eval, deploy, open-source)
- Add actionable toast buttons with navigation (e.g., "Go to Families" on training complete)
- Fix post-training config: remove 50% deadline, show until pretraining completes,
  always-visible warning prompt outside card expand, engine reminder at 75%
- PostTrainingConfig now hides already-configured sections independently
- Add tab badges: pulsing dot for active jobs, count for undeployed models, warning for no deployment
- Replace empty states with actionable buttons guiding next steps
- Stage bars show "(skip)" in warning color for unconfigured SFT/Alignment stages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 10:20:00 -04:00
josh fdedd6f4d0 Fix model freshness stuck at 0% by calling onModelDeployed on deploy
CI / build-and-push (push) Successful in 27s
onModelDeployed was defined but never invoked — lastModelReleaseTick
stayed at 0 so the freshness guard always returned 0. Now deployModel
updates obsolescence state, setting freshness to 1.0 with proper decay.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 09:51:53 -04:00
josh 00e790591e Game balance audit: wire research effects, rework capability formula, fix dead systems
CI / build-and-push (push) Successful in 32s
- Create researchBonuses utility to aggregate tech tree effects into all game systems
  (infrastructure energy costs, compute efficiency, training speed, model capability, reputation)
- Rework model capability from sqrt(compute) to 4-pillar formula (params + compute + data + research)
- Make context window affect benchmarks and inference speed
- Add MoE tradeoffs: 1.5x VRAM, 0.8x training speed
- Enforce research point costs as a gate for unlocking research
- Add real consequences to data contamination events (reputation hit, legal costs)
- Scale talent costs from $0.03 to $5/tick per headcount
- Scale compliance costs 100x to be meaningful
- Rework competitor acquisition: cheaper but grants headcount, RP, and reputation
- Remove dead code: sfxVolume, autoSaveInterval, notificationsEnabled,
  FAST_FORWARD_BATCH_SIZE, CHINCHILLA_OPTIMAL_RATIO

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 09:36:31 -04:00
josh 8d650fefae Comprehensive UX audit fixes: navigation, feedback, affordances, and accessibility
CI / build-and-push (push) Successful in 28s
Address 18 issues across high/medium/low impact tiers identified in a full
interface review. Key changes: Models page decomposed into tabs, confirmation
dialogs for irreversible actions (deploy/open-source/acquire), chart Y-axes
made visible, hash router extended for Market tab persistence, collapsible
sidebar, keyboard navigation shortcuts (g+key chords), notification bulk
actions, achievement progress bars, and ARIA label improvements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 09:05:26 -04:00
josh 09a5cb69a7 Overhaul market system with shared TAM competition, multi-tier pricing, enterprise pipeline, and developer ecosystem
CI / build-and-push (push) Successful in 42s
Replaces the simplified single-subscriber market with a full competitive simulation:
shared TAM with softmax market shares across 4 segments, multi-tier consumer
subscriptions (Free/Plus/Pro/Team) and API tiers (Free/PAYG/Scale/Enterprise),
enterprise sales pipeline (Lead→Qualification→POC→Negotiation→Active→Renewal)
with SLA tracking, developer ecosystem flywheel, technology obsolescence pressure,
seasonal demand cycles, and two new product lines (Code Assistant, AI Agents Platform).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 08:30:24 -04:00
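Softmax market shares, as mentioned in 09a5cb69a7, can be computed as below. What feeds into each competitor's score and the `temperature` parameter are assumptions. The softmax itself is the standard numerically stable form.

```typescript
// Convert per-competitor attractiveness scores into shares that sum to 1.
function softmaxShares(scores: number[], temperature: number = 1): number[] {
  const max = Math.max(...scores); // subtract the max for numerical stability
  const exps = scores.map((s) => Math.exp((s - max) / temperature));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}
```

Because every share is strictly positive, a weaker competitor keeps some of the shared TAM instead of dropping to zero.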
josh 4c1c0e9ff2 Overhaul model system with multi-stage training, variants, benchmarks, and eval
CI / build-and-push (push) Successful in 32s
Replace the single-stage training + flat capability score with a realistic AI
development pipeline: pre-training with Chinchilla scaling laws, SFT with
specializations, alignment with safety/capability tradeoffs (RLHF/DPO/Constitutional),
model families with distillation/fine-tuning/quantization variants, named benchmark
suite with compute-costing eval jobs, and segment-specific market quality.

Phases 1-6 of the model rework plan: new types, engine rewrite, save migration,
training events/risk system, concurrent training, variant creation, benchmark
evaluation with leaderboard, and market integration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 07:36:34 -04:00
josh fc1f371c8c Overhaul rack system with split FLOPS, VRAM, cooling, interconnect, and multi-vendor SKUs
CI / build-and-push (push) Successful in 29s
Expand from 10 to 18 rack SKUs across NVIDIA, AMD, and custom ASIC vendors, each with
distinct training vs inference FLOPS, VRAM capacity, cooling requirements, and interconnect
technology. Adds cooling hierarchy (air/liquid/immersion) that gates rack deployment, VRAM
requirements that gate model training by generation, interconnect multipliers for distributed
training scaling, and PUE-based energy cost reduction for advanced cooling. Includes save
migration from v4 to v5, 6 new research nodes, and UI updates showing split compute stats.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 02:27:03 -04:00
josh 54220fca70 Rework network to 6-tier Clos topology with individual switch entities
CI / build-and-push (push) Successful in 31s
Replace aggregate network health stats with a full 6-tier Clos topology
(ToR → T1 → T2 → T3 → T4 → T5) where every switch is an individually
tracked entity with uplinks, repair pipelines, and failure cascades.

Key mechanics:
- Bottleneck bandwidth model (min along path) affects FLOPS and satisfaction
- Rackdown on full disconnect → racks re-enter testing pipeline on recovery
- Binomial failure sampling per tier, dirty-flag cascade optimization
- Flat switch registry for performance at scale
- Three new research nodes: network-redundancy, fast-repair, hot-standby

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 01:33:59 -04:00
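The bottleneck bandwidth model in 54220fca70 ("min along path") reduces to taking the minimum over the tiers a path crosses. A sketch, assuming each tier reports its available bandwidth as a fraction between 0 and 1:

```typescript
// Effective bandwidth of a path is limited by its most degraded tier.
function bottleneckBandwidth(tierFractions: number[]): number {
  return tierFractions.reduce((min, f) => Math.min(min, f), 1);
}
```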
josh f8d7a25c6e Remove per-DC deployment complete and rack QA failure notifications
CI / build-and-push (push) Successful in 36s
These fire constantly at scale with thousands of racks, flooding the
notification panel with noise.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 00:39:35 -04:00
josh 4a318c36ad Add bulk fill and staggered retrofit at campus/cluster level
CI / build-and-push (push) Successful in 40s
Campus level: "Fill All DCs" instantly fills all operational DCs with
selected SKU in one click. "Retrofit Campus" queues a staggered retrofit
with configurable concurrency (1/10%/25%/custom) so only a fraction of
DCs go offline at a time, preserving capacity during the upgrade.

Cluster level: "Fill All DCs" fills across all campuses in one action.

The game engine automatically advances the retrofit queue each tick,
promoting pending DCs as active ones complete.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 00:26:14 -04:00
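The per-tick queue advancement described in 4a318c36ad can be sketched as follows. The job shape, status names, and fixed `retrofitTicks` duration are hypothetical. The concurrency limit mirrors the configurable setting in the commit.

```typescript
type RetrofitStatus = "pending" | "active" | "done";

interface RetrofitJob {
  dcId: string;
  status: RetrofitStatus;
  ticksRemaining: number;
}

// One engine tick: progress active retrofits, then promote pending DCs
// until the concurrency limit is reached.
function advanceRetrofitQueue(
  queue: RetrofitJob[],
  maxConcurrent: number,
  retrofitTicks: number,
): void {
  let active = 0;
  for (const job of queue) {
    if (job.status !== "active") continue;
    job.ticksRemaining -= 1;
    if (job.ticksRemaining <= 0) job.status = "done";
    else active += 1;
  }
  for (const job of queue) {
    if (active >= maxConcurrent) break;
    if (job.status === "pending") {
      job.status = "active";
      job.ticksRemaining = retrofitTicks;
      active += 1;
    }
  }
}
```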
josh 02791f9500 Remove per-rack production failure notifications
CI / build-and-push (push) Successful in 42s
These spam the notification feed at scale with thousands of racks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 00:02:04 -04:00
josh b7d23c8872 Fix Fill Capacity exceeding DC slot limit due to double-counted failed racks
CI / build-and-push (push) Successful in 38s
computeRacksFailed was incremented on production failure and never decremented
when repaired racks came back online, while repair cohorts also tracked the
same racks. This caused usedSlots to inflate past the DC capacity over time.

Fix: derive computeRacksFailed from repair cohorts each tick instead of
maintaining it as a running counter. Include repair cohorts in pipeline slot
accounting so all racks are counted exactly once. Also fixes power limit in
fillDCToCapacity to only count online racks (pipeline racks don't draw power).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 00:00:46 -04:00
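Deriving the failed count instead of maintaining a running counter, as in b7d23c8872, might look like this. The `DcSlots` shape and function names are assumptions. The point is that repair cohorts are the single source of truth for failed racks.

```typescript
interface RepairCohort {
  count: number;
  ticksRemaining: number;
}

interface DcSlots {
  capacity: number;
  onlineRacks: number;
  pipelineRacks: number;
  repairCohorts: RepairCohort[];
}

// Failed racks are derived from the repair cohorts, never stored separately.
function racksInRepair(dc: DcSlots): number {
  return dc.repairCohorts.reduce((sum, c) => sum + c.count, 0);
}

// Each rack is counted exactly once: online, in the pipeline, or in repair.
function usedSlots(dc: DcSlots): number {
  return dc.onlineRacks + dc.pipelineRacks + racksInRepair(dc);
}

function freeSlots(dc: DcSlots): number {
  return Math.max(0, dc.capacity - usedSlots(dc));
}
```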
josh 9c49a10b31 Add floating dev/debug menu for QA testing (Ctrl+D)
CI / build-and-push (push) Successful in 30s
Four-tab panel with resource manipulation, time controls, state inspection,
and event triggers to accelerate testing across all game systems.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-24 23:43:41 -04:00
josh c799f2e359 Redesign infrastructure to hypercluster scale with 4-level hierarchy
CI / build-and-push (push) Successful in 43s
Replace flat DataCenter/Rack model with Cluster > Campus > Data Center > Racks
hierarchy. Individual rack entities eliminated in favor of statistical batch
simulation using deployment cohorts. Adds tiered network topology (ToR/agg/core)
with proportional outage model, DC retrofitting, bulk operations, and drill-down
UI navigation with breadcrumbs. First cluster and campus are free to preserve
early game flow. Rebalances starting economy ($600K), funding rounds, and
cohort scaling for hypercluster-scale gameplay.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-24 23:15:41 -04:00
josh d36d9d61a8 Fix DC uptime always showing 100% despite rack failures
CI / build-and-push (push) Successful in 34s
Failed racks were removed from dc.racks in Phase 3 before uptime was
calculated in Phase 4, so healthyCount always equaled totalInDc. Now
counts racks in the repair pipeline as down capacity.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-24 22:22:42 -04:00
josh f702a82539 Redesign Infrastructure page with AWS-style provisioning UX
CI / build-and-push (push) Successful in 37s
Replace basic admin panel with cloud console-style interactions:
- Fleet summary bar with aggregate stats (DCs, racks, FLOPS, uptime, cost)
- Launch Racks provisioning panel with SKU table, quantity stepper, live
  cost sidebar, and review-before-launch flow
- Rack inventory table with sortable columns, checkbox multi-select,
  and bulk decommission with confirmation modal
- Pipeline kanban grouping (same SKU/stage collapsed) with ETA display
- Tabbed DC cards (Inventory | Launch | Upgrades) to reduce scroll
- Build DC panel with cost breakdown and capacity preview

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-24 22:17:48 -04:00
josh f9f6233b69 Comprehensive UX polish: fix 19 friction points across all pages
CI / build-and-push (push) Successful in 33s
Addresses broken interactions (notification bell, browser dialogs),
missing feedback states (disabled buttons, pricing changes, paused
indicator), unclear affordances (research queue, model tuning, funding
requirements), and navigation gaps (hash routing, keyboard shortcuts,
clickable dashboard cards, sidebar grouping, tutorial hints).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-24 21:44:18 -04:00
josh d25dfe0435 Fix max update depth crash on Market page
CI / build-and-push (push) Successful in 32s
The deployedModels selector used .filter() which created a new array
reference on every store change, triggering cascading re-renders.
Changed to compute bestQuality as a primitive directly in the selector.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-24 20:55:27 -04:00
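The selector problem in d25dfe0435 can be shown without the store itself. The state shape and selector names below are hypothetical. Zustand's default equality check compares selector results by reference, so a fresh array from `.filter()` always looks changed while a primitive does not.

```typescript
interface ModelEntry {
  id: string;
  deployed: boolean;
  quality: number;
}

interface StoreState {
  models: ModelEntry[];
}

// Problematic: returns a new array reference on every call.
const selectDeployedModels = (s: StoreState): ModelEntry[] =>
  s.models.filter((m) => m.deployed);

// Fixed: computes a primitive, which compares equal across calls.
const selectBestQuality = (s: StoreState): number =>
  s.models.reduce(
    (best, m) => (m.deployed && m.quality > best ? m.quality : best),
    0,
  );
```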