Replace decorative overload policy with real serving pipeline and dedicated Serving page
CI / build-and-push (push) Successful in 28s
The old overload policy had dead controls (maxQueueDepth, rateLimitPerCustomer never read) and trivial flat penalties. This replaces it with a full serving pipeline where deployed models form a fleet, requests route through priority/degradation logic, and policy choices create meaningful strategic tradeoffs.

New serving pipeline:
- fleet building from deployed models (size/quant/MoE multipliers)
- demand categorization by 5 priority tiers
- enterprise capacity reservation
- priority-ordered serving with overflow behaviors (queue/reject/degrade)
- auto-degradation to faster models under load
- Batch API to fill idle capacity at discounted rates

4 new research nodes gate features progressively: Intelligent Request Routing, Priority Queue System, Request Batching, and Auto-Scaling.

New dedicated Serving page with pipeline metrics, model fleet utilization, and research-gated policy controls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
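For readers who want a concrete picture of "priority-ordered serving with overflow behaviors (queue/reject/degrade)", the following is a minimal sketch and not the code from this commit. The type names, the `serveByPriority` function, the single `degradedCapacity` pool standing in for "faster models", and the rule that overflow the fallback pool cannot absorb is rejected are all assumptions made for illustration.

```typescript
// Hypothetical types: the commit message names the behaviors, not these shapes.
type Overflow = 'queue' | 'reject' | 'degrade';

interface TierDemand {
  tier: string;       // tier label (names are assumed, not from the commit)
  priority: number;   // lower number is served first
  requests: number;   // demand for this tick
  overflow: Overflow; // what happens to requests that do not fit
}

interface ServeResult {
  served: Record<string, number>; // requests served at full quality, per tier
  queued: number;                 // requests held for a later tick
  rejected: number;               // requests dropped
  degraded: number;               // requests moved to the fallback pool
}

// Serve tiers in priority order against a shared capacity pool.
// `degradedCapacity` is an assumed stand-in for capacity on faster models.
function serveByPriority(
  demand: TierDemand[],
  capacity: number,
  degradedCapacity: number
): ServeResult {
  const result: ServeResult = { served: {}, queued: 0, rejected: 0, degraded: 0 };
  let remaining = capacity;
  let degradedRemaining = degradedCapacity;

  // Copy before sorting so the caller's array is left untouched.
  const ordered = [...demand].sort((a, b) => a.priority - b.priority);

  for (const d of ordered) {
    const served = Math.min(d.requests, remaining);
    remaining -= served;
    result.served[d.tier] = served;

    let overflow = d.requests - served;
    if (overflow === 0) continue;

    if (d.overflow === 'degrade') {
      const moved = Math.min(overflow, degradedRemaining);
      degradedRemaining -= moved;
      result.degraded += moved;
      overflow -= moved;
      // Assumption: what the fallback pool cannot absorb is rejected.
      result.rejected += overflow;
    } else if (d.overflow === 'queue') {
      result.queued += overflow;
    } else {
      result.rejected += overflow;
    }
  }
  return result;
}
```

Because higher-priority tiers drain the shared pool first, the overflow setting mostly matters for the lower tiers, which is where a policy choice between queueing, rejecting, and degrading becomes a real tradeoff.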
@@ -1,7 +1,7 @@
 import { useState, useEffect, useRef } from 'react';
 import {
   LayoutDashboard, Server, FlaskConical, Brain,
-  TrendingUp, Users, Database, Swords, DollarSign, Settings, Trophy, Medal,
+  TrendingUp, Activity, Users, Database, Swords, DollarSign, Settings, Trophy, Medal,
   PanelLeftClose, PanelLeftOpen,
 } from 'lucide-react';
 import { useGameStore, type ActivePage } from '@/store';
@@ -12,6 +12,7 @@ const NAV_ITEMS: { page: ActivePage; label: string; icon: typeof LayoutDashboard
   { page: 'research', label: 'Research', icon: FlaskConical },
   { page: 'models', label: 'Models', icon: Brain },
   { page: 'market', label: 'Market', icon: TrendingUp },
+  { page: 'serving', label: 'Serving', icon: Activity },
   { page: 'finance', label: 'Finance', icon: DollarSign },
   { page: 'talent', label: 'Talent', icon: Users, era: 'scaleup' },
   { page: 'data', label: 'Data', icon: Database, era: 'scaleup' },