Replace decorative overload policy with real serving pipeline and dedicated Serving page
CI / build-and-push (push) Successful in 28s
The old overload policy had dead controls (maxQueueDepth, rateLimitPerCustomer never read) and trivial flat penalties. This replaces it with a full serving pipeline where deployed models form a fleet, requests route through priority/degradation logic, and policy choices create meaningful strategic tradeoffs.

New serving pipeline: fleet building from deployed models (size/quant/MoE multipliers), demand categorization by 5 priority tiers, enterprise capacity reservation, priority-ordered serving with overflow behaviors (queue/reject/degrade), auto-degradation to faster models under load, and Batch API to fill idle capacity at discounted rates.

4 new research nodes gate features progressively: Intelligent Request Routing, Priority Queue System, Request Batching, and Auto-Scaling.

New dedicated Serving page with pipeline metrics, model fleet utilization, and research-gated policy controls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
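The priority-ordered serving step with per-tier overflow behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the type names, the `serveByPriority` function, and the idea of modeling degradation as a separate pool of "faster model" capacity are all assumptions made for the example.

```typescript
// Hypothetical types: none of these names are taken from the commit itself.
type OverflowBehavior = 'queue' | 'reject' | 'degrade';

interface TierDemand {
  tier: string;
  priority: number; // lower number = served first
  demandTokens: number;
  overflow: OverflowBehavior;
}

interface TierResult {
  tier: string;
  servedTokens: number;
  queuedTokens: number;
  degradedTokens: number;
  rejectedTokens: number;
}

// Serve tiers in priority order from a shared capacity pool. Demand that does
// not fit is queued, rejected, or degraded according to the tier's policy.
// Assumption: degraded traffic draws on a separate pool of faster-model capacity.
function serveByPriority(
  demands: TierDemand[],
  capacityTokens: number,
  degradedCapacityTokens: number,
): TierResult[] {
  let remaining = capacityTokens;
  let degradedRemaining = degradedCapacityTokens;
  const ordered = [...demands].sort((a, b) => a.priority - b.priority);

  return ordered.map((d) => {
    const served = Math.min(d.demandTokens, remaining);
    remaining -= served;
    let overflow = d.demandTokens - served;

    const result: TierResult = {
      tier: d.tier,
      servedTokens: served,
      queuedTokens: 0,
      degradedTokens: 0,
      rejectedTokens: 0,
    };

    if (overflow > 0) {
      if (d.overflow === 'degrade') {
        const degraded = Math.min(overflow, degradedRemaining);
        degradedRemaining -= degraded;
        overflow -= degraded;
        result.degradedTokens = degraded;
        // Whatever the faster models cannot absorb is rejected.
        result.rejectedTokens = overflow;
      } else if (d.overflow === 'queue') {
        result.queuedTokens = overflow;
      } else {
        result.rejectedTokens = overflow;
      }
    }
    return result;
  });
}

// Example: three tiers competing for 1000 tokens of capacity per tick.
const results = serveByPriority(
  [
    { tier: 'free', priority: 2, demandTokens: 300, overflow: 'degrade' },
    { tier: 'enterprise', priority: 0, demandTokens: 600, overflow: 'reject' },
    { tier: 'api', priority: 1, demandTokens: 500, overflow: 'queue' },
  ],
  1000,
  100,
);
```

Serving in strict priority order means lower tiers absorb all of the shortfall, which is what makes the overflow policy per tier a meaningful choice.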
@@ -5,6 +5,7 @@ import type {
   EnterpriseSegment,
   EnterprisePipelineStage,
   DeveloperEcosystem,
+  TierServingMetrics,
 } from '@ai-tycoon/shared';
 import {
   BASE_LEAD_RATE,
@@ -17,6 +18,7 @@ import {
   ENTERPRISE_SLA_REQUIREMENTS,
   ENTERPRISE_CAPABILITY_REQUIREMENTS,
   ENTERPRISE_TOKENS_PER_TICK,
+  ENTERPRISE_REJECTION_SLA_MULTIPLIER,
 } from '@ai-tycoon/shared';
 import { ENTERPRISE_NAMES } from '../../data/enterpriseNames';
 
@@ -62,7 +64,7 @@ export function processEnterprisePipeline(
   devEcosystem: DeveloperEcosystem,
   seasonalEntMultiplier: number,
   currentTick: number,
-  demandCapacityRatio: number,
+  enterpriseServingMetrics: TierServingMetrics,
 ): EnterprisePipelineResult {
   const pipeline = [...ent.pipeline];
   const activeContracts = [...ent.activeContracts];
@@ -129,7 +131,10 @@ export function processEnterprisePipeline(
     if (lead.stage === 'qualification') {
       transitionProb *= modelCapability >= lead.requiredCapability ? 1 : 0.1;
     } else if (lead.stage === 'poc') {
-      transitionProb *= Math.max(0.2, 1 - Math.max(0, demandCapacityRatio - 0.9) * 5);
+      const entDemand = enterpriseServingMetrics.demandTokens;
+      const entRejected = enterpriseServingMetrics.rejectedTokens;
+      const rejectRate = entDemand > 0 ? entRejected / entDemand : 0;
+      transitionProb *= Math.max(0.2, 1 - rejectRate * 5);
     } else if (lead.stage === 'negotiation') {
       transitionProb *= Math.max(0.3, 1 - (lead.dealValue / 10_000_000) * 0.5);
     }
@@ -181,14 +186,22 @@ export function processEnterprisePipeline(
     const updated = { ...contract };
     updated.totalTicks++;
 
-    if (demandCapacityRatio <= (1 / updated.slaUptime)) {
+    const entDemand = enterpriseServingMetrics.demandTokens;
+    const entServed = enterpriseServingMetrics.servedTokens;
+    const entRejected = enterpriseServingMetrics.rejectedTokens;
+    const servedFraction = entDemand > 0 ? entServed / entDemand : 1;
+    const wasRejected = entRejected > 0;
+    const qualityMet = enterpriseServingMetrics.avgQualityDelivered >= 0.85;
+
+    if (servedFraction >= updated.slaUptime && qualityMet && !wasRejected) {
       updated.uptimeTicks++;
     } else {
       updated.slaViolations++;
-      const penalty = updated.pricePerMToken * (updated.tokensPerTick / 1_000_000) * SLA_PENALTY_FRACTION;
+      const severityMultiplier = wasRejected ? ENTERPRISE_REJECTION_SLA_MULTIPLIER : 1.0;
+      const penalty = updated.pricePerMToken * (updated.tokensPerTick / 1_000_000) * SLA_PENALTY_FRACTION * severityMultiplier;
       slaPenalties += penalty;
       updated.slaPenaltiesPaid += penalty;
-      updated.satisfaction = Math.max(0, updated.satisfaction - 0.005);
+      updated.satisfaction = Math.max(0, updated.satisfaction - (wasRejected ? 0.01 : 0.005));
     }
 
     if (updated.totalTicks > 0 && updated.slaViolations === 0) {
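The two rejection-based calculations introduced in this diff can be pulled out into small pure functions to make their behavior easier to check. The sketch below is illustrative only: `pocMultiplier`, `evaluateSla`, and the `ServingMetricsSketch` interface are hypothetical names, the interface lists only the four fields the diff reads from `TierServingMetrics`, and `basePenalty` and `rejectionMultiplier` are passed in as parameters because the values of `SLA_PENALTY_FRACTION` and `ENTERPRISE_REJECTION_SLA_MULTIPLIER` are not shown here.

```typescript
// Hypothetical stand-in for TierServingMetrics: only the fields the diff reads.
interface ServingMetricsSketch {
  demandTokens: number;
  servedTokens: number;
  rejectedTokens: number;
  avgQualityDelivered: number;
}

// PoC-stage multiplier: falls linearly with the enterprise rejection rate,
// with a floor of 0.2 (reached once 16% of demanded tokens are rejected).
function pocMultiplier(m: ServingMetricsSketch): number {
  const rejectRate = m.demandTokens > 0 ? m.rejectedTokens / m.demandTokens : 0;
  return Math.max(0.2, 1 - rejectRate * 5);
}

interface SlaOutcome {
  met: boolean;
  penalty: number;
  satisfactionDelta: number;
}

// SLA check for one contract tick. A tick counts as "up" only if enough demand
// was served, quality was at least 0.85, and nothing was rejected outright.
// basePenalty stands for price * tokens * SLA_PENALTY_FRACTION (value assumed).
function evaluateSla(
  m: ServingMetricsSketch,
  slaUptime: number,
  basePenalty: number,
  rejectionMultiplier: number,
): SlaOutcome {
  const servedFraction = m.demandTokens > 0 ? m.servedTokens / m.demandTokens : 1;
  const wasRejected = m.rejectedTokens > 0;
  const qualityMet = m.avgQualityDelivered >= 0.85;

  if (servedFraction >= slaUptime && qualityMet && !wasRejected) {
    return { met: true, penalty: 0, satisfactionDelta: 0 };
  }
  return {
    met: false,
    // Rejections are penalized more heavily than degraded or short service.
    penalty: basePenalty * (wasRejected ? rejectionMultiplier : 1),
    satisfactionDelta: wasRejected ? -0.01 : -0.005,
  };
}
```

Note that any rejected enterprise tokens in a tick count as an SLA violation, even when the served fraction alone would still satisfy `slaUptime`.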