diff --git a/README.md b/README.md index e50075e..0fe3582 100644 --- a/README.md +++ b/README.md @@ -52,6 +52,20 @@ The park detail page shows ride open/closed status using a two-tier approach: 2. **Schedule fallback (Six Flags API)** — when Queue-Times data is unavailable, the app falls back to the nearest upcoming date from the Six Flags schedule API as an approximation. +### Fast Lane wait times + +A second wait number is fetched from Six Flags' `/wait-times/park/{apiId}` endpoint and joined onto each ride by name. The park page has a **Fast Lane** toggle (persisted in `localStorage.fastLaneMode`) that swaps the displayed wait between regular and Fast Lane. On the today chart, Fast Lane appears as a second line. + +### Per-ride history + +Click any ride name on a park page to open `/park/[id]/ride/[slug]` — a detail page with three tabs: + +- **Today** — 5-minute wait-time samples (regular + Fast Lane) with outage markers +- **7 days** — daily average / max wait and uptime percentage +- **30 days** — same aggregates over a longer window + +Samples are stored in the `ride_wait_samples` table by a Tier-5 cron job that runs every 5 minutes for parks currently within their operating window. Contiguous "ride closed during park hours" runs are shaded on the today chart with a `#N — Hh Mm` label. + ### Roller Coaster Filter When live data is shown, a **Coasters only** toggle filters to roller coasters. Coaster lists are hardcoded in `lib/coaster-data.ts`. @@ -66,8 +80,9 @@ The backend runs a tiered scraping schedule via node-cron: | 2 | Every 6 hours | Current month for all parks | | 3 | Twice daily (3 AM, 3 PM) | Current + next month | | 4 | Daily at 3 AM | Full year (respects 72h staleness window) | +| 5 | Every 5 minutes | Wait-time samples for all currently-open parks (writes `ride_wait_samples`) | -Past dates are never overwritten. The hourly tier compares live data against the database before writing — unchanged data is skipped. +Past dates are never overwritten. 
The hourly tier compares live data against the database before writing — unchanged data is skipped. Each tier has its own concurrency latch — if a tick is still running when the next would fire, the new tick is skipped and logged rather than stacked. A manual trigger is available via the backend API: @@ -152,7 +167,7 @@ See [`.env.example`](.env.example) for the full list and defaults. | `PORT` | `3001` | Port the Hono server listens on. | | `TZ` | `UTC` | Timezone for cron schedules (e.g. `America/New_York`). | | `PARK_HOURS_STALENESS_HOURS` | `72` | Hours before park schedule data is re-fetched. | -| `RATE_LIMIT_PER_MIN` | `60` | Per-IP request limit for the public API, per minute. | +| `RATE_LIMIT_PER_MIN` | `60` | Per-IP request limit for the public API, per minute. Enforced by `backend/src/middleware/rate-limit.ts`; over-limit requests get a `429` with a `Retry-After` header. | ### Updating @@ -167,6 +182,7 @@ docker compose pull && docker compose up -d | `GET /api/calendar/week?start=YYYY-MM-DD` | Week calendar for all parks | | `GET /api/calendar/:parkId/month?month=YYYY-MM` | Month calendar for one park | | `GET /api/parks/:id/rides` | Live rides or schedule fallback | +| `GET /api/parks/:id/rides/:slug` | Per-ride detail + today/7d/30d wait-time history | | `GET /api/parks` | Park list with metadata | | `GET /api/status` | Health check, scrape timestamps, DB stats | | `POST /api/scrape/trigger?scope=...` | Manual scrape trigger | diff --git a/docs/API.md b/docs/API.md index 19206c9..662ada9 100644 --- a/docs/API.md +++ b/docs/API.md @@ -14,6 +14,20 @@ None. All endpoints are public and unauthenticated. The scrape trigger endpoint is also unprotected -- restrict access at the network/proxy level if needed. +## Rate limiting + +Every endpoint is gated by a fixed-window per-IP counter (`backend/src/middleware/rate-limit.ts`). 
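For orientation, a fixed-window per-IP counter of the kind named above can be sketched as follows. This is a minimal illustration of the general technique, not the contents of `backend/src/middleware/rate-limit.ts`; the class `FixedWindowLimiter`, its `check()` method, and the `Decision` shape are hypothetical names, and the sketch assumes a limit of at least 1.

```typescript
// Hypothetical sketch of a fixed-window per-IP limiter (not the project's code).
interface WindowState {
  windowStart: number; // epoch ms when this client's current window began
  count: number;       // requests seen in the current window
}

interface Decision {
  allowed: boolean;
  retryAfterSeconds: number; // 0 when allowed
}

class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  check(clientIp: string, nowMs: number = Date.now()): Decision {
    const state = this.windows.get(clientIp);
    // No window yet, or the previous window has expired: start a fresh one.
    if (state === undefined || nowMs - state.windowStart >= this.windowMs) {
      this.windows.set(clientIp, { windowStart: nowMs, count: 1 });
      return { allowed: true, retryAfterSeconds: 0 };
    }
    if (state.count < this.limit) {
      state.count += 1;
      return { allowed: true, retryAfterSeconds: 0 };
    }
    // Over the limit: report how long until this window resets.
    const remainingMs = state.windowStart + this.windowMs - nowMs;
    return { allowed: false, retryAfterSeconds: Math.ceil(remainingMs / 1000) };
  }
}
```

A middleware built on this would answer a denied request with `429` and put `retryAfterSeconds` in the `Retry-After` header. The sketch never evicts old entries from the map, so a real implementation would also need periodic cleanup.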
+ +| Header / Body | Value | +|---------------|-------| +| Limit | `RATE_LIMIT_PER_MIN` env var, default `60` requests/minute | +| Window | 60 seconds, per client IP (resolved via `x-forwarded-for` → `x-real-ip` → socket address) | +| Over-limit response | `429 Too Many Requests` | +| Body | `{ "error": "Too many requests" }` | +| Response header | `Retry-After: ` — how long until the window resets | + +Behind a reverse proxy, make sure `x-forwarded-for` is set or every request will appear to come from the proxy's own IP. + --- ## Endpoints @@ -226,15 +240,21 @@ Returns live ride status or schedule fallback for a park. "rides": [ { "name": "Steel Vengeance", + "slug": "steel-vengeance", "isOpen": true, "waitMinutes": 45, + "fastLaneMinutes": 10, + "hasFastLane": true, "lastUpdated": "2026-04-23T18:30:00.000Z", "isCoaster": true }, { "name": "Millennium Force", + "slug": "millennium-force", "isOpen": false, "waitMinutes": 0, + "fastLaneMinutes": null, + "hasFastLane": true, "lastUpdated": "2026-04-23T18:30:00.000Z", "isCoaster": true } @@ -245,6 +265,8 @@ Returns live ride status or schedule fallback for a park. } ``` +Each ride is enriched from two sources: Queue-Times.com supplies `isOpen` and the base `waitMinutes`, then Six Flags' wait-times feed is joined by name to fill in `fastLaneMinutes` and `hasFastLane`. When both sources have a regular wait for the same ride, the Six Flags value wins (Queue-Times lags around park open). `fastLaneMinutes` is `null` when the ride is closed or has no Fast Lane line. `slug` is the URL-safe identifier used by `/api/parks/:id/rides/:slug`. + **Response fields:** | Field | Type | Description | @@ -270,6 +292,101 @@ Returns live ride status or schedule fallback for a park. --- +### GET /api/parks/:parkId/rides/:slug + +Returns metadata + history for a single ride: today's 5-minute wait samples and daily aggregates over the last 7 and 30 calendar days. 
Everything ships in one round-trip — the frontend renders the Today / 7d / 30d tabs from this single payload. + +**Path Parameters:** + +| Param | Description | +|-------|-------------| +| `parkId` | Park identifier (e.g. `cedarpoint`) | +| `slug` | Ride slug, as returned in `LiveRide.slug` or stored in the `rides.slug` column | + +**Cache:** `Cache-Control: public, max-age=60, stale-while-revalidate=120` + +**Response:** + +```json +{ + "park": { + "id": "cedarpoint", + "name": "Cedar Point", + "shortName": "Cedar Point", + "timezone": "America/New_York" + }, + "ride": { + "qtRideId": 257, + "slug": "steel-vengeance", + "name": "Steel Vengeance", + "isCoaster": true, + "hasFastLane": true, + "firstSeen": "2026-03-15T14:05:00.000Z", + "lastSeen": "2026-04-23T18:35:00.000Z" + }, + "live": { + "isOpen": true, + "waitMinutes": 45, + "hasFastLane": true, + "fastLaneMinutes": 10, + "lastUpdated": "2026-04-23T18:30:00.000Z" + }, + "todayLocal": "2026-04-23", + "today": [ + { + "recordedAt": "2026-04-23T14:05:12.000Z", + "localTime": "10:05", + "isOpen": true, + "waitMinutes": 15, + "fastLaneMinutes": 5 + } + ], + "last7d": [ + { + "localDate": "2026-04-17", + "avgWait": 38.4, + "maxWait": 90, + "avgFastLane": 9.1, + "maxFastLane": 25, + "uptimePct": 0.94, + "sampleCount": 132 + } + ], + "last30d": [], + "coverage": { + "daysWith7d": 6, + "daysWith30d": 23, + "todaySampleCount": 1 + } +} +``` + +**Response fields:** + +| Field | Type | Description | +|-------|------|-------------| +| `park` | `{ id, name, shortName, timezone }` | Park identity (timezone is the IANA tz used for sample bucketing) | +| `ride` | `RideRecord` | Canonical row from the `rides` table | +| `live` | `LiveRideSummary \| null` | Best-effort current state pulled from the shared in-memory cache. No upstream fetch — populated by the rides route and Tier-5 sampler. `null` if no recent observation exists. 
| +| `todayLocal` | `string` | Today's date in the park's timezone | +| `today` | `DailySample[]` | Per-sample series for `todayLocal`, ordered by `recordedAt` | +| `last7d` | `DailyAggregate[]` | One row per `local_date` over the last 7 calendar days (inclusive of today) | +| `last30d` | `DailyAggregate[]` | Same aggregates over 30 days | +| `coverage.daysWith7d` | `number` | Distinct dates with samples in the 7-day window — use to gate the 7d tab | +| `coverage.daysWith30d` | `number` | Distinct dates with samples in the 30-day window | +| `coverage.todaySampleCount` | `number` | Number of samples already collected today | + +`DailySample` and `DailyAggregate` shapes are listed under [Data Types](#data-types). + +**Errors:** + +| Status | Body | Condition | +|--------|------|-----------| +| 404 | `{ "error": "Park not found" }` | Unknown park ID | +| 404 | `{ "error": "Ride not found or no history yet" }` | Slug doesn't match any row in `rides` for this park (Tier-5 hasn't seen the ride yet, or the slug is wrong) | + +--- + ### GET /api/status Health check endpoint with database statistics. @@ -394,11 +511,14 @@ A single ride from the Queue-Times.com API. 
```typescript interface LiveRide { - name: string; // Ride display name - isOpen: boolean; // Currently operating - waitMinutes: number; // Current wait time (0 if closed) - lastUpdated: string; // ISO 8601 timestamp from Queue-Times - isCoaster: boolean; // Classified as a roller coaster via RCDB data + name: string; // Ride display name + slug: string; // URL-safe slug for /api/parks/:id/rides/:slug + isOpen: boolean; // Currently operating + waitMinutes: number; // Current regular wait (0 if closed) + fastLaneMinutes?: number | null; // Fast Lane wait (null when closed or no Fast Lane line) + hasFastLane?: boolean; // Ride has a Fast Lane offering per Six Flags + lastUpdated: string; // ISO 8601 timestamp from Queue-Times + isCoaster: boolean; // Classified as a roller coaster via RCDB data } ``` @@ -432,6 +552,36 @@ interface RideStatus { } ``` +### DailySample + +A single wait-time observation recorded by the Tier-5 sampler. + +```typescript +interface DailySample { + recordedAt: string; // ISO 8601 UTC timestamp + localTime: string; // HH:MM in the park's timezone + isOpen: boolean; // Ride open at this sample + waitMinutes: number | null; // Regular wait, null when unobserved + fastLaneMinutes: number | null; // Fast Lane wait, null when no Fast Lane or unobserved +} +``` + +### DailyAggregate + +Per-day statistics computed in SQL from `ride_wait_samples`. Only open samples contribute to wait averages. 
+ +```typescript +interface DailyAggregate { + localDate: string; // YYYY-MM-DD in the park's timezone + avgWait: number | null; // Mean wait_minutes across open samples + maxWait: number | null; // Highest wait_minutes across open samples + avgFastLane: number | null; // Mean fast_lane_minutes across open samples + maxFastLane: number | null; // Highest fast_lane_minutes across open samples + uptimePct: number; // Fraction of samples with is_open=1 (0..1) + sampleCount: number; // Total samples for the day +} +``` + ### ScrapeResult Result of a scraping operation. diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index 1e9f73d..49cda3d 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -69,8 +69,12 @@ The web container reaches the backend via Docker internal networking (`http://ba ├── app/ # Next.js App Router │ ├── page.tsx # Home page (week calendar, server component) │ ├── park/[id]/page.tsx # Park detail page (month calendar + rides) +│ ├── park/[id]/error.tsx # Per-route error boundary +│ ├── park/[id]/ride/[slug]/page.tsx # Ride detail + history page │ ├── layout.tsx # Root layout with metadata │ ├── loading.tsx # Skeleton UI for streaming/suspense +│ ├── error.tsx # Top-level error boundary (client) +│ ├── not-found.tsx # 404 page │ └── globals.css # Tailwind v4 theme + custom CSS variables │ ├── components/ # React components @@ -79,11 +83,15 @@ The web container reaches the backend via Docker internal networking (`http://ba │ ├── MobileCardList.tsx # Mobile card layout │ ├── ParkCard.tsx # Individual park card │ ├── ParkMonthCalendar.tsx # Month grid for park detail page -│ ├── LiveRidePanel.tsx # Live ride status with wait times (client) +│ ├── LiveRidePanel.tsx # Live ride status with wait times + Fast Lane toggle (client) │ ├── WeekNav.tsx # Week navigation arrows (client) │ ├── Legend.tsx # Status color legend │ ├── EmptyState.tsx # Shown when no data is scraped -│ └── BackToCalendarLink.tsx # Navigation helper (client) +│ ├── 
BackToCalendarLink.tsx # Navigation helper (client) +│ └── charts/ # Recharts-based charts (client components) +│ ├── WaitTimeTodayChart.tsx # Today's 5-min samples with outage shading +│ ├── WeeklyStatsChart.tsx # 7d / 30d daily aggregates +│ └── UptimePill.tsx # Compact uptime % indicator │ ├── lib/ # Shared code (imported by both frontend and backend) │ ├── types.ts # Core DayData interface @@ -92,32 +100,46 @@ The web container reaches the backend via Docker internal networking (`http://ba │ ├── coaster-data.ts # Static RCDB coaster name sets per park │ ├── coaster-match.ts # Fuzzy name matching (normalize, prefix, compact) │ ├── queue-times-map.ts # Park ID -> Queue-Times.com park ID mapping +│ ├── api.ts # apiFetch() helper (revalidate vs. no-store option) +│ ├── outage.ts # computeOutages() — contiguous-closed-run detection +│ ├── ride-slug.ts # slugifyRideName() — URL slug for ride pages +│ ├── timezone.ts # formatLocalDate / formatLocalTime in a park's tz │ └── scrapers/ -│ ├── sixflags.ts # Six Flags CloudFront API client -│ ├── queuetimes.ts # Queue-Times.com API client -│ └── types.ts # Park, DayStatus, MonthCalendar, ScraperAdapter interfaces +│ ├── sixflags.ts # Six Flags CloudFront operating-hours client +│ ├── sixflags-waittimes.ts # Six Flags Fast Lane wait-times client +│ ├── queuetimes.ts # Queue-Times.com API client +│ ├── log.ts # Shared scraper logger +│ └── types.ts # Park, DayStatus, MonthCalendar, ScraperAdapter interfaces │ ├── backend/ # Hono API server (separate package.json) │ ├── src/ -│ │ ├── index.ts # Entry point: middleware, routes, DB init, scheduler start +│ │ ├── index.ts # Entry point: middleware, routes, DB init, scheduler start, graceful shutdown +│ │ ├── config.ts # Env-validated config object (fails fast on bad input) +│ │ ├── log.ts # Structured logger (`[ISO] [LEVEL] [tag] msg key=value`) │ │ ├── db/ -│ │ │ ├── index.ts # SQLite connection, schema creation, WAL mode -│ │ │ └── queries.ts # All SQL queries (upsert, date 
range, staleness) +│ │ │ ├── index.ts # SQLite connection, schema for park_days / rides / ride_wait_samples, WAL mode +│ │ │ └── queries.ts # All SQL queries (upsert, date range, staleness, samples, aggregates) +│ │ ├── middleware/ +│ │ │ └── rate-limit.ts # Fixed-window per-IP limiter (honours x-forwarded-for) │ │ ├── routes/ │ │ │ ├── calendar.ts # /api/calendar/* -- week and month data with live merging │ │ │ ├── parks.ts # /api/parks/* -- park metadata -│ │ │ ├── rides.ts # /api/parks/:id/rides -- live rides + schedule fallback +│ │ │ ├── rides.ts # /api/parks/:id/rides -- live rides + Fast Lane + schedule fallback +│ │ │ ├── ride-history.ts # /api/parks/:id/rides/:slug -- ride detail + today/7d/30d history │ │ │ ├── status.ts # /api/status -- health check │ │ │ └── scrape.ts # /api/scrape/trigger -- manual scrape │ │ └── services/ -│ │ ├── scheduler.ts # Four-tier cron job registration +│ │ ├── scheduler.ts # Five-tier cron jobs with per-tier concurrency latches │ │ ├── scraper.ts # Scraping orchestration (today, month, full year) +│ │ ├── wait-sampler.ts # Tier-5: 5-min wait-time sampling into ride_wait_samples +│ │ ├── live-cache.ts # Shared TtlCaches (liveRidesCache, fastLaneCache, todayCache) │ │ └── cache.ts # Generic TtlCache class +│ ├── tests/ # Backend Node test runner suite │ ├── data/ # SQLite database (parks.db, auto-created) │ ├── package.json # Backend dependencies │ └── tsconfig.json # Backend TypeScript config (CommonJS, rootDir: ..) │ -├── tests/ # Unit tests (Node built-in test runner) +├── tests/ # Frontend unit tests (Node built-in test runner) ├── scripts/ # Debug utility ├── public/ # Static assets ├── Dockerfile # Multi-stage build (web + backend targets) @@ -212,9 +234,9 @@ When the requested week includes today, the `/api/calendar/week` route enhances 2. 
**Live ride counts** -- For each park that is currently within its operating window (determined by `isWithinOperatingWindow()`), fetches live ride data from Queue-Times.com via `fetchLiveRides()`. Counts open rides and open coasters. Results cached in `ridesCache` (5-min TTL). 3. **Status detection:** - - **Weather delay**: Park is within its scheduled operating window, but _all_ rides report `isOpen: false`. Indicated with a blue badge. - - **Closing**: Current time is past the scheduled close but within a 1-hour wind-down buffer. Determined by `getOperatingStatus()` returning `"closing"`. - - **Open**: Within the scheduled open-to-close window. + - **Open**: Within the scheduled open-to-close window. `getOperatingStatus()` returns `"open"`. + - **Closing**: Current time is past the scheduled close but within a 1-hour wind-down buffer. `getOperatingStatus()` returns `"closing"`. + - **Weather delay**: `getOperatingStatus()` is `"open"` _and_ every reported ride has `isOpen: false`. Indicated with a blue badge. The badge is intentionally suppressed during the `"closing"` wind-down — all-rides-closed near close is normal end-of-day behavior, not weather. Logic lives at [backend/src/routes/rides.ts:96-100](../backend/src/routes/rides.ts) and [backend/src/routes/calendar.ts](../backend/src/routes/calendar.ts). The 3 AM switchover in `getTodayLocal()` prevents the calendar from flipping to the next day at midnight -- before 3 AM local time, the system still considers it "yesterday", since park visitors may still be out. 
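The 3 AM switchover described above can be sketched as follows. This is an assumption-laden illustration, not the project's `getTodayLocal()`: the function name `todayLocalWithSwitchover`, its parameters, and the use of `Intl.DateTimeFormat` are all hypothetical choices made for the example.

```typescript
// Hypothetical sketch of a 3 AM day switchover (not the project's getTodayLocal()).
const SWITCHOVER_HOUR = 3; // before 03:00 local, still count as the previous day

function todayLocalWithSwitchover(now: Date, timeZone: string): string {
  // Shift the instant back by the switchover offset, then read the calendar
  // date in the park's timezone. On nights with a DST transition the boundary
  // can be off by an hour; the sketch ignores that.
  const shifted = new Date(now.getTime() - SWITCHOVER_HOUR * 60 * 60 * 1000);
  const parts = new Intl.DateTimeFormat("en-CA", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(shifted);
  const get = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";
  return `${get("year")}-${get("month")}-${get("day")}`;
}
```

For a park in `America/New_York`, 02:30 local time on 2026-04-24 still maps to `2026-04-23`, while 03:00 local time maps to `2026-04-24`.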
@@ -228,16 +250,19 @@ The system uses three layers of caching, each serving a different purpose: Layer 1: Next.js ISR Layer 2: Backend In-Memory Layer 3: Database Staleness (serves stale while revalidating) (prevents redundant API calls) (controls scrape frequency) ┌───────────────────────────────┐ ┌───────────────────────────────┐ ┌───────────────────────────────┐ - │ Cache-Control response headers│ │ TtlCache (5 min default) │ │ isMonthScraped() query │ + │ Cache-Control response headers│ │ TtlCache (5 min TTL) │ │ isMonthScraped() query │ │ + Next.js fetch revalidate │ │ │ │ MAX(scraped_at) vs staleness │ - │ │ │ todayCache: live park hours │ │ threshold (default 72h) │ - │ week: 120s / 300s SWR │ │ ridesCache: ride/coaster │ │ │ - │ month: 300s / 600s SWR │ │ open counts │ │ Past months auto-skipped │ - │ rides: 60s / 120s SWR │ │ liveRidesCache: full ride │ │ "force" scope bypasses check │ - │ parks: 3600s │ │ data per park │ │ │ + │ │ │ todayCache: routes/calendar│ │ threshold (default 72h) │ + │ week: 120s / 300s SWR │ │ (live park hours per park) │ │ │ + │ month: 300s / 600s SWR │ │ liveRidesCache, fastLaneCache:│ │ Past months auto-skipped │ + │ rides: 60s / 120s SWR │ │ services/live-cache.ts — │ │ "force" scope bypasses check │ + │ parks: 3600s │ │ shared by rides routes + │ │ │ + │ │ │ the Tier-5 sampler │ │ │ └───────────────────────────────┘ └───────────────────────────────┘ └───────────────────────────────┘ ``` +**Live-page Data Cache bypass.** The park detail page (`app/park/[id]/page.tsx`) and ride detail page (`app/park/[id]/ride/[slug]/page.tsx`) fetch their live ride data with `cache: "no-store"` via [`apiFetch`](../lib/api.ts) (`{ noStore: true }`). Earlier revisions used Next.js ISR for these too, but the Data Cache served stale ride state after idle periods — navigation back to a park would show ride statuses from hours ago. 
Backend HTTP cache headers still allow the upstream Hono server to return cached responses for 60s, so this is a "skip the Next.js Data Cache" change, not a "skip all caching" change. The home calendar page keeps its ISR revalidation since its data is intrinsically slower-moving. + **Per-route HTTP cache headers:** | Endpoint | `max-age` | `stale-while-revalidate` | @@ -247,6 +272,7 @@ The system uses three layers of caching, each serving a different purpose: | `/api/parks` | 3600s | -- | | `/api/parks/:id` | 3600s | -- | | `/api/parks/:id/rides` | 60s | 120s | +| `/api/parks/:id/rides/:slug` | 60s | 120s | --- @@ -254,7 +280,10 @@ The system uses three layers of caching, each serving a different purpose: ### Schema +The database has three tables: `park_days` (calendar hours), `rides` (per-ride metadata), and `ride_wait_samples` (time-series wait data). + ```sql +-- Park operating hours, keyed by park and date. CREATE TABLE IF NOT EXISTS park_days ( park_id TEXT NOT NULL, -- matches Park.id from lib/parks.ts (e.g. "cedarpoint") date TEXT NOT NULL, -- ISO date: YYYY-MM-DD @@ -264,11 +293,42 @@ CREATE TABLE IF NOT EXISTS park_days ( scraped_at TEXT NOT NULL, -- ISO timestamp of when this row was written PRIMARY KEY (park_id, date) ); + +-- Per-ride canonical record. PK is (park_id, qt_ride_id) so ride renames +-- don't fragment history — the slug just provides pretty URLs. +CREATE TABLE IF NOT EXISTS rides ( + park_id TEXT NOT NULL, + qt_ride_id INTEGER NOT NULL, -- Queue-Times ride ID (stable upstream) + slug TEXT NOT NULL, -- URL slug (rebuilt if name changes) + name TEXT NOT NULL, -- Display name as last seen + is_coaster INTEGER NOT NULL DEFAULT 0, + has_fast_lane INTEGER NOT NULL DEFAULT 0, + first_seen TEXT NOT NULL, + last_seen TEXT NOT NULL, + PRIMARY KEY (park_id, qt_ride_id) +); +CREATE UNIQUE INDEX IF NOT EXISTS idx_rides_slug ON rides (park_id, slug); + +-- Time-series wait samples written by Tier-5 every 5 minutes for currently +-- open parks. 
`recorded_at` is UTC; `local_date` / `local_time` are bucketed +-- in the park's IANA timezone at insert time so reads are pure SQL and DST-safe. +CREATE TABLE IF NOT EXISTS ride_wait_samples ( + park_id TEXT NOT NULL, + qt_ride_id INTEGER NOT NULL, + recorded_at TEXT NOT NULL, -- ISO UTC + local_date TEXT NOT NULL, -- YYYY-MM-DD in park tz + local_time TEXT NOT NULL, -- HH:MM in park tz + is_open INTEGER NOT NULL, + wait_minutes INTEGER, -- Regular line wait + fast_lane_minutes INTEGER, -- Six Flags Fast Lane wait, if known + PRIMARY KEY (park_id, qt_ride_id, recorded_at) +); ``` -- **Composite primary key** `(park_id, date)` ensures one row per park per day and supports efficient queries without secondary indexes. +- **Composite primary keys** ensure one row per logical unit (per park-day, per ride, per sample) and support efficient primary-key lookups. The one secondary index, `idx_rides_slug`, lets the ride detail route resolve a `slug` to a `qt_ride_id` in one lookup. - **WAL mode** (`PRAGMA journal_mode = WAL`) enables concurrent reads while the scraper writes. - **Migration strategy**: New columns are added via `ALTER TABLE ... ADD COLUMN` wrapped in try/catch. If the column already exists, the error is silently caught. This allows the schema to evolve without a migration framework. +- **Sample volume**: Tier-5 writes one row per ride every 5 minutes while the park is within its operating window (closed rides are recorded with `is_open = 0`). A park with 50 rides operating for 10 hours generates ~6,000 sample rows/day (50 rides × 12 samples/hour × 10 hours). `INSERT OR IGNORE` on the PK makes the sampler idempotent across retries. ### Key Queries @@ -278,14 +338,24 @@ CREATE TABLE IF NOT EXISTS park_days ( | `getDateRange(start, end)` | Returns all parks' data for a date range. Powers the week calendar. | | `getParkMonthData(parkId, year, month)` | Returns one park's data for a month. Uses `LIKE` prefix matching on date. | | `getDayData(parkId, date)` | Returns a single day for comparison during `scrapeToday()`. | +| `getParkDayCount()` | Total rows in `park_days`. 
Drives the startup-scrape-when-empty check. | | `isMonthScraped(parkId, year, month, staleAfterMs)` | Checks if `MAX(scraped_at)` for a park-month is within the staleness threshold. Past months always return `true` (never re-scraped). | +| `upsertRide()` | Insert or update a row in `rides`; bumps `last_seen` on every observation. | +| `getRideBySlug(parkId, slug)` | Resolves a URL slug back to a canonical ride record via `idx_rides_slug`. | +| `insertSample()` | `INSERT OR IGNORE` a sample into `ride_wait_samples` — idempotent on retries. | +| `getRideSamplesForDay()` | Returns all samples for one ride on one local date (powers the Today chart). | +| `getRideDailyAggregates()` | Per-day avg/max wait, avg/max Fast Lane, uptime %, and sample count over a window (powers the 7d / 30d charts). | +| `countRideDays()` | Number of distinct `local_date` values for a ride in a window — used to decide whether 7d/30d tabs have enough data to render. | | `transact(fn)` | Wraps a function in a SQLite transaction for atomicity. | ### Storage - **Location**: `backend/data/parks.db` (or `/app/backend/data/parks.db` in Docker) - **WAL journal files**: `parks.db-wal` and `parks.db-shm` accompany the main database -- **Size**: Approximately 8,000-9,000 rows for a full year of 24 parks +- **Size**: + - `park_days`: ~8,000-9,000 rows for a full year of 24 parks + - `rides`: ~1,000-1,500 rows total (a few dozen per park) + - `ride_wait_samples`: grows daily during operating season; expect tens of thousands of rows per active day. Historical samples are retained — no automatic pruning is configured. 
- **Not committed to git**: Listed in `.gitignore` - **Auto-created**: The database and `data/` directory are created on first backend startup @@ -335,6 +405,21 @@ interface ApiDay { - Handles buyouts: if `isBuyout` is true and it's not a passholder preview, the park is considered closed - Returns `{ date, isOpen, hoursLabel, specialType }` +### Six Flags Wait-Times API + +Powers the Fast Lane wait number shown alongside the regular wait. Used by `lib/scrapers/sixflags-waittimes.ts` (`fetchFastLaneWaits`, `lookupFastLane`) and joined onto the Queue-Times rides by fuzzy name match. + +| Property | Value | +|----------|-------| +| URL | `https://d18car1k0ff81h.cloudfront.net/wait-times/park/{apiId}` (sibling of the operating-hours endpoint) | +| Auth | None (spoofed browser headers, same as the operating-hours client) | +| Timeout | 10 seconds | +| Per-ride fields | `regularMinutes`, `fastLaneMinutes`, `hasFastLane` (`lookupFastLane()` return shape) | +| Error handling | Returns `null` on any failure; the route falls back to Queue-Times' regular wait | +| Backend cache | `fastLaneCache` (5-min TTL, in `services/live-cache.ts`) | + +**Why two sources?** Queue-Times wait values lag at park open by ~10-15 minutes (parks haven't reported yet). The Six Flags wait-times feed updates earlier. When both sources have a wait for the same ride, the route prefers the Six Flags regular wait; Queue-Times remains the source of truth for `isOpen`. The Fast Lane number has no Queue-Times equivalent. + ### Queue-Times.com API Provides live ride open/closed status and wait times during park operating hours. 
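The precedence rules given for the two feeds (Six Flags wins for the regular wait, Queue-Times decides `isOpen`, Fast Lane is `null` when the ride is closed or has no Fast Lane line) can be sketched as a small merge function. The names `QueueTimesRide`, `FastLaneEntry`, `MergedRide`, and `mergeRide` are hypothetical, and the nullability of `regularMinutes` and the `hasFastLane: false` default when the Six Flags feed is missing are assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the real types live in the repository.
interface QueueTimesRide {
  name: string;
  isOpen: boolean;
  waitMinutes: number;
}

interface FastLaneEntry {
  regularMinutes: number | null; // assumed nullable when Six Flags has no value
  fastLaneMinutes: number | null;
  hasFastLane: boolean;
}

interface MergedRide {
  name: string;
  isOpen: boolean;
  waitMinutes: number;
  fastLaneMinutes: number | null;
  hasFastLane: boolean;
}

function mergeRide(qt: QueueTimesRide, sf: FastLaneEntry | null): MergedRide {
  if (sf === null) {
    // Six Flags feed failed or had no match: fall back to Queue-Times only.
    return {
      name: qt.name,
      isOpen: qt.isOpen,
      waitMinutes: qt.isOpen ? qt.waitMinutes : 0,
      fastLaneMinutes: null,
      hasFastLane: false, // assumption: unknown is treated as no Fast Lane
    };
  }
  return {
    name: qt.name,
    isOpen: qt.isOpen, // Queue-Times stays the source of truth for open/closed
    // Prefer the Six Flags regular wait when present; 0 when closed.
    waitMinutes: qt.isOpen ? (sf.regularMinutes ?? qt.waitMinutes) : 0,
    // Fast Lane wait is null when the ride is closed or has no Fast Lane line.
    fastLaneMinutes: qt.isOpen && sf.hasFastLane ? sf.fastLaneMinutes : null,
    hasFastLane: sf.hasFastLane,
  };
}
```

Under these assumptions, a closed ride with a Fast Lane offering comes out with `waitMinutes: 0`, `fastLaneMinutes: null`, and `hasFastLane: true`, which matches the Millennium Force entry in the API example.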
@@ -384,16 +469,20 @@ Rides are classified as roller coasters using static data from the Roller Coaste |-----------|------|------| | `app/page.tsx` | Server | Fetches week data from backend, passes to HomePageClient | | `app/park/[id]/page.tsx` | Server | Fetches month + rides data in parallel | +| `app/park/[id]/ride/[slug]/page.tsx` | Server | Fetches ride detail + today/7d/30d history in one call | | `HomePageClient` | **Client** | State management, auto-refresh, keyboard nav, localStorage | | `WeekCalendar` | Server | Desktop 7-column table layout | | `MobileCardList` | Server | Mobile card layout | | `ParkCard` | Server | Individual park card for mobile | | `ParkMonthCalendar` | Server | Month calendar grid | -| `LiveRidePanel` | **Client** | Live ride list with coaster filter toggle | +| `LiveRidePanel` | **Client** | Live ride list with coaster filter + Fast Lane toggle | | `WeekNav` | **Client** | Week navigation with arrow buttons | | `Legend` | Server | Status color legend | | `EmptyState` | Server | Empty database message | | `BackToCalendarLink` | **Client** | "Back" link using localStorage for last week | +| `charts/WaitTimeTodayChart` | **Client** | Today's 5-min wait samples + outage shading (Recharts) | +| `charts/WeeklyStatsChart` | **Client** | 7d / 30d daily aggregates chart (Recharts) | +| `charts/UptimePill` | **Client** | Compact uptime % badge | ### Component Hierarchy @@ -410,6 +499,12 @@ park/[id]/page.tsx (Server) ├── BackToCalendarLink (Client) ├── ParkMonthCalendar (Server) └── LiveRidePanel (Client) ........... or RideList (Server, inline) + +park/[id]/ride/[slug]/page.tsx (Server) + ├── BackToCalendarLink (Client) + ├── UptimePill (Client) + ├── WaitTimeTodayChart (Client) ...... Today tab + └── WeeklyStatsChart (Client) ........ 
7d / 30d tabs ``` ### Client-Side Refresh @@ -496,4 +591,7 @@ interface Park { | Non-root containers | Both Docker images run as `nextjs` user (UID 1001) | | Backend-owned data | Frontend never contacts external APIs or the database directly | | CORS | Backend enables CORS middleware (currently unrestricted) | +| Per-IP rate limit | `RATE_LIMIT_PER_MIN` (default 60) — fixed-window per-IP counter in `backend/src/middleware/rate-limit.ts`. Honours `x-forwarded-for`/`x-real-ip` so a reverse proxy doesn't collapse every client to one bucket. Over-limit requests return `429` with a `Retry-After` header. | +| Env validation | `backend/src/config.ts` parses + validates env vars at startup; misconfiguration fails fast rather than surfacing in a request handler. | +| Graceful shutdown | Backend listens for `SIGTERM`/`SIGINT`, closes the HTTP server and SQLite handle before exiting (force-exit timeout as a safety net). | | No secrets in frontend | `BACKEND_URL` is an internal Docker network address, not a secret | diff --git a/docs/DEVELOPMENT.md b/docs/DEVELOPMENT.md index 39c819e..063a1b6 100644 --- a/docs/DEVELOPMENT.md +++ b/docs/DEVELOPMENT.md @@ -64,14 +64,15 @@ This starts the Next.js dev server on port 3000 with hot reload. Open [http://lo ### `app/` -- Next.js Pages -Two routes: +Three routes: - `/` (`app/page.tsx`) -- Home page. Server component that fetches week data from the backend and passes everything to `HomePageClient`. -- `/park/[id]` (`app/park/[id]/page.tsx`) -- Park detail page. Fetches month calendar and live rides in parallel via `Promise.all`. +- `/park/[id]` (`app/park/[id]/page.tsx`) -- Park detail page. Fetches month calendar and live rides in parallel via `Promise.all`. Live rides use `apiFetch({ noStore: true })` to bypass the Next.js Data Cache. +- `/park/[id]/ride/[slug]` (`app/park/[id]/ride/[slug]/page.tsx`) -- Per-ride detail page with Today / 7d / 30d wait-time history. 
All three tabs render from a single backend response (no client-side range fetches). + +Top-level boundaries: `app/error.tsx` (root error UI), `app/not-found.tsx`, `app/park/[id]/error.tsx`, and `app/loading.tsx` (streaming skeleton). ### `components/` -- React Components -10 components, split between server and client: - | Component | Type | Purpose | |-----------|------|---------| | `HomePageClient` | Client | Top-level state: coaster filter, auto-refresh, keyboard nav | @@ -79,11 +80,14 @@ Two routes: | `MobileCardList` | Server | Mobile card layout (below `lg` breakpoint) | | `ParkCard` | Server | Individual park card for mobile | | `ParkMonthCalendar` | Server | Month grid for park detail page | -| `LiveRidePanel` | Client | Live ride list with coaster toggle and wait times | +| `LiveRidePanel` | Client | Live ride list with coaster toggle, Fast Lane toggle, wait times | | `WeekNav` | Client | Week navigation arrows | | `Legend` | Server | Color legend for status indicators | | `EmptyState` | Server | Empty database message | | `BackToCalendarLink` | Client | Back link using localStorage for last week | +| `charts/WaitTimeTodayChart` | Client | Today's 5-min wait samples with outage shading (Recharts) | +| `charts/WeeklyStatsChart` | Client | 7d / 30d daily aggregate chart (Recharts) | +| `charts/UptimePill` | Client | Compact uptime % badge | ### `lib/` -- Shared Code @@ -97,24 +101,36 @@ Imported by both frontend and backend: | `coaster-data.ts` | Static RCDB coaster name sets per park, `getCoasterSet()` | | `coaster-match.ts` | `normalizeForMatch()`, `isCoasterMatch()` -- fuzzy name matching | | `queue-times-map.ts` | `QUEUE_TIMES_IDS` -- park ID to Queue-Times park ID mapping | -| `scrapers/sixflags.ts` | Six Flags CloudFront API client -- `scrapeMonth()`, `fetchToday()`, `scrapeRidesForDay()`, rate limiting | +| `api.ts` | `apiFetch()` -- typed fetch helper with `revalidate` or `noStore` option | +| `outage.ts` | `computeOutages()` -- detects contiguous 
closed-during-hours runs for the today chart | +| `ride-slug.ts` | `slugifyRideName()` -- URL slug used by `/park/[id]/ride/[slug]` and the `rides` table | +| `timezone.ts` | `formatLocalDate()`, `formatLocalTime()` for bucketing samples in a park's IANA tz | +| `scrapers/sixflags.ts` | Six Flags CloudFront operating-hours client -- `scrapeMonth()`, `fetchToday()`, `scrapeRidesForDay()`, rate limiting | +| `scrapers/sixflags-waittimes.ts` | Six Flags Fast Lane wait-times client -- `fetchFastLaneWaits()`, `lookupFastLane()` | | `scrapers/queuetimes.ts` | Queue-Times.com API client -- `fetchLiveRides()` | +| `scrapers/log.ts` | Shared scraper logger (used by both `sixflags.ts` and `sixflags-waittimes.ts`) | | `scrapers/types.ts` | `Park`, `DayStatus`, `MonthCalendar`, `ScraperAdapter` interfaces | ### `backend/src/` -- Hono API Server | File | Purpose | |------|---------| -| `index.ts` | Entry point -- middleware (CORS, logger), route registration, DB init, scheduler start | -| `db/index.ts` | SQLite connection singleton, schema creation, WAL mode | -| `db/queries.ts` | All SQL queries -- `upsertDay`, `getDateRange`, `getParkMonthData`, `isMonthScraped`, etc. | +| `index.ts` | Entry point -- middleware (request log, CORS, rate limit), route registration, DB init, scheduler start, graceful shutdown | +| `config.ts` | Env-validated config object (`PORT`, `RATE_LIMIT_PER_MIN`, `PARK_HOURS_STALENESS_HOURS`, `NODE_ENV`). Fails fast on bad input. | +| `log.ts` | Structured logger -- emits `[ISO] [LEVEL] [tag] msg key=value` lines. No external dep. | +| `db/index.ts` | SQLite connection singleton, schema for `park_days` / `rides` / `ride_wait_samples`, WAL mode | +| `db/queries.ts` | All SQL queries -- `upsertDay`, `getDateRange`, `isMonthScraped`, `upsertRide`, `getRideBySlug`, `insertSample`, `getRideSamplesForDay`, `getRideDailyAggregates`, `countRideDays`, `getParkDayCount`, `transact` | +| `middleware/rate-limit.ts` | Fixed-window per-IP limiter. 
Honours `x-forwarded-for` / `x-real-ip`. Returns 429 with `Retry-After`. | | `routes/calendar.ts` | `/api/calendar/*` -- week and month data with live today merging | | `routes/parks.ts` | `/api/parks/*` -- park metadata | -| `routes/rides.ts` | `/api/parks/:id/rides` -- live ride status with schedule fallback | +| `routes/rides.ts` | `/api/parks/:id/rides` -- live ride status + Fast Lane join + schedule fallback | +| `routes/ride-history.ts` | `/api/parks/:id/rides/:slug` -- ride detail + today/7d/30d history in one payload | | `routes/status.ts` | `/api/status` -- health check | | `routes/scrape.ts` | `/api/scrape/trigger` -- manual scrape | -| `services/scheduler.ts` | Four-tier cron job registration | +| `services/scheduler.ts` | Five-tier cron registration with per-tier `withLatch` concurrency guards; startup-scrape-when-empty check | | `services/scraper.ts` | Scraping orchestration -- `scrapeToday()`, `scrapeMonths()`, `scrapeFullYear()` | +| `services/wait-sampler.ts` | Tier-5 5-minute sampler -- joins Queue-Times + Fast Lane, writes `ride_wait_samples`, skips weather-delayed parks | +| `services/live-cache.ts` | Shared `TtlCache` instances (`liveRidesCache`, `fastLaneCache`) so the rides route, the ride-history route, and the Tier-5 sampler share warmed upstream data | | `services/cache.ts` | Generic `TtlCache` class with configurable TTL | --- @@ -204,19 +220,35 @@ This fetches the raw Six Flags API response for the park and date, displays the ## Testing +Frontend and backend each have their own test suite, both using the Node built-in test runner. + +### Frontend tests + ```bash npm test ``` -Uses the **Node.js built-in test runner** (`node --test`). Test files live in `tests/`. 
+Test files live in `tests/`: -**Current test coverage:** +| File | Coverage | +|------|----------| +| `tests/coaster-matching.test.ts` | `isCoasterMatch()` — exact, prefix, compact, conjunction rejection | +| `tests/fast-lane-matching.test.ts` | `lookupFastLane()` — name normalization and Fast Lane join logic | +| `tests/outage-detection.test.ts` | `computeOutages()` — contiguous-closed-run detection for the today chart | +| `tests/ride-slug.test.ts` | `slugifyRideName()` — URL slug generation and stability | +| `tests/timezone-bucketing.test.ts` | `formatLocalDate()` / `formatLocalTime()` — DST-safe park-tz bucketing | -| File | Tests | Coverage | -|------|-------|---------| -| `tests/coaster-matching.test.ts` | 13 cases | Coaster name matching: exact, prefix, compact, conjunction rejection | +### Backend tests -Tests verify the `isCoasterMatch()` function handles edge cases like trademark symbols, possessives, subtitles, space-split brand words, and conjunction-joined compound ride names. +```bash +cd backend && npm test +``` + +Test files live in `backend/tests/`: + +| File | Coverage | +|------|----------| +| `backend/tests/wait-aggregation.test.ts` | SQL aggregation in `getRideDailyAggregates()` — averages, max, uptime, sample count | --- diff --git a/docs/OPERATIONS.md b/docs/OPERATIONS.md index ac55552..b5cfd67 100644 --- a/docs/OPERATIONS.md +++ b/docs/OPERATIONS.md @@ -116,9 +116,12 @@ volumes: |----------|---------|-------------| | `TZ` | `UTC` | Process timezone. Controls when cron jobs fire. Set to `America/New_York` in production so schedules align with US Eastern parks. | | `PARK_HOURS_STALENESS_HOURS` | `72` | Hours before park schedule data is considered stale and re-fetched. Lower values increase API load; higher values increase data lag. | +| `RATE_LIMIT_PER_MIN` | `60` | Per-IP request limit for the public API. Over-limit requests return `429 Too Many Requests` with a `Retry-After` header. Enforced by `backend/src/middleware/rate-limit.ts`. 
Behind a proxy, ensure `x-forwarded-for` is set or every client looks like the proxy IP. | | `NODE_ENV` | -- | Set to `production` in Docker. | | `PORT` | `3001` | Server listen port. | +`backend/src/config.ts` parses and validates these at startup. A bad value (e.g. `PORT=foo`) fails fast with a thrown `Error` rather than surfacing in a request handler later. + --- ## CI/CD Pipeline @@ -167,9 +170,10 @@ These are configured in the Gitea repository settings under **Settings > Actions 3. **Verify the backend started:** ```bash docker compose logs backend - # Look for: [backend] database initialized - # [scheduler] cron jobs registered - # [backend] listening on http://localhost:3001 + # Look for (structured log lines, see the Log Reference section): + # [INFO] [startup] database initialized + # [INFO] [scheduler] cron jobs registered ... + # [INFO] [startup] listening url=http://localhost:3001 ``` 4. **Check database status (will be empty on first run):** @@ -251,7 +255,7 @@ Backups are recommended for continuity (avoiding the 5-10 minute re-scrape windo ### Tiered Cron Schedule -The backend runs four scraping tiers via `node-cron`: +The backend runs five scraping tiers via `node-cron`: | Tier | Cron Expression | Schedule | Scope | Delay | |------|-----------------|----------|-------|-------| @@ -259,10 +263,24 @@ The backend runs four scraping tiers via `node-cron`: | 2 | `0 */6 * * *` | Every 6 hours | Current month for all parks | 1000ms | | 3 | `0 3,15 * * *` | 3 AM and 3 PM | Current + next month | 1000ms | | 4 | `0 3 * * *` | Daily at 3 AM | Full year (all 12 months) | 1000ms | +| 5 | `*/5 * * * *` | Every 5 minutes | Wait-time samples for currently-open parks into `ride_wait_samples` | parallel chunks of 6 | -**Staleness:** Tiers 2-4 skip any park-month that was scraped within `PARK_HOURS_STALENESS_HOURS` (default 72h). Tier 1 always fetches (uses diff-before-write instead). 
+**Staleness:** Tiers 2-4 skip any park-month that was scraped within `PARK_HOURS_STALENESS_HOURS` (default 72h). Tier 1 always fetches (uses diff-before-write instead). Tier 5 only samples parks whose `park_days` row marks them open today *and* whose current local time is inside the operating window (with a 1-hour closing buffer). -**Off-season:** Tier 1 only runs from March through December. The month constraint `3-12` in the cron expression skips January and February when most parks are closed. +**Off-season:** Tier 1 only runs from March through December. The month constraint `3-12` in the cron expression skips January and February when most parks are closed. Tier 5 runs year-round but is effectively a no-op when no parks are open. + +**Concurrency latches:** Every tier is wrapped in `withLatch()` (see `backend/src/services/scheduler.ts`). If a tick is still running when the next would fire, the new tick is *skipped* and logged with a `previous run still in progress` warning rather than stacking. Each tier has its own latch so a slow Tier-4 doesn't block Tier-5's 5-minute cadence. + +**Weather-delayed parks skipped from sampling:** Tier 5 detects the "rides exist but all closed during scheduled hours" case and skips writes for that park, so a storm doesn't poison the uptime statistics with hours of `is_open=0` samples. + +### Startup Behavior + +On boot, the scheduler checks `getParkDayCount()` against a threshold of 50 rows: + +- **Empty / nearly-empty database** (< 50 rows): runs `scrapeToday()` followed by `scrapeFullYear()` in sequence. Logs `[scheduler.startup]` lines for each phase. +- **Populated database** (≥ 50 rows): skips the startup scrape and relies on cron tiers. Logs `skipping startup scrape — relying on cron`. + +This replaces the earlier behavior of full-scraping on every container start, which doubled outbound API load and delayed readiness on every deploy. 
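The concurrency latch described under Tiered Cron Schedule can be sketched roughly as follows. This is a minimal illustration, not the code in `backend/src/services/scheduler.ts`: the signature of `withLatch()` shown here, the `warn` callback, and the boolean return value are all assumptions made for the example.

```typescript
// Hypothetical sketch of a per-tier concurrency latch.
// The real withLatch() in scheduler.ts may have a different signature.
type Task = () => Promise<void>;

function withLatch(
  tag: string,
  task: Task,
  // Assumed logging hook; the backend's structured logger would go here.
  warn: (msg: string) => void = (msg) => console.warn(msg),
): () => Promise<boolean> {
  // One flag per wrapped task, so each tier gets its own latch.
  let running = false;

  return async () => {
    if (running) {
      // The previous tick has not finished: skip instead of stacking runs.
      warn(`[${tag}] previous run still in progress`);
      return false;
    }
    running = true;
    try {
      await task();
      return true;
    } finally {
      // Release the latch even if the task throws.
      running = false;
    }
  };
}
```

Because each call to `withLatch()` closes over its own `running` flag, wrapping every tier separately gives each tier an independent latch, which is what allows a slow Tier 4 run to overlap with Tier 5 ticks.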
### Timezone Sensitivity @@ -374,7 +392,7 @@ curl http://localhost:3001/api/status ```bash docker compose logs backend --tail 50 ``` - Look for `[backend] listening on http://localhost:3001`. + Look for an `[INFO] [startup] listening url=http://localhost:3001` line. 2. **Check if the database has data:** ```bash @@ -452,28 +470,39 @@ If the database becomes corrupted (unlikely with SQLite WAL mode, but possible a ## Log Reference -| Prefix | Source | Meaning | -|--------|--------|---------| -| `[backend]` | `index.ts` | Startup messages: DB initialized, server listening | -| `[scheduler]` | `scheduler.ts` | Cron job triggers with tier number | -| `[today]` | `scraper.ts` | Per-park results for the today tier (updated/skipped/error) | -| `[month]` | `scraper.ts` | Per-park-month results (open days count, rate limited, errors) | -| `[rate-limited]` | `sixflags.ts` | HTTP 429/503 with backoff timing and retry attempt count | +The backend uses a small structured logger (`backend/src/log.ts`). Every line has the format: + +``` +<ISO> [<LEVEL>] [<tag>] <msg> key1=value1 key2=value2 … +``` + +Levels are `INFO`, `WARN`, `ERROR`. `ERROR` writes to stderr; the others write to stdout. Grep-friendly: filter by tag (`grep '\[scheduler.tier1\]'`) or by key (`grep 'park=cedarpoint'`). 
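To make the line format concrete, here is a small sketch of how such a logger could assemble a line: ISO timestamp, level, tag, message, then `key=value` pairs. The names `formatLogLine`, `emit`, and `Fields` are hypothetical and are not taken from `backend/src/log.ts`. Quoting values that contain whitespace is an assumption inferred from the `tiers="…"` field in the example output.

```typescript
// Hypothetical sketch of the structured log format; not the actual log.ts.
type Level = "INFO" | "WARN" | "ERROR";
type Fields = { [key: string]: string | number | boolean };

function formatLogLine(
  level: Level,
  tag: string,
  msg: string,
  fields: Fields = {},
  now: Date = new Date(),
): string {
  const pairs = Object.keys(fields).map((key) => {
    const value = String(fields[key]);
    // Assumption: values containing whitespace are quoted so that
    // each key=value pair stays a single grep-able token.
    return /\s/.test(value)
      ? `${key}=${JSON.stringify(value)}`
      : `${key}=${value}`;
  });
  return [now.toISOString(), `[${level}]`, `[${tag}]`, msg]
    .concat(pairs)
    .join(" ");
}

function emit(level: Level, tag: string, msg: string, fields: Fields = {}): void {
  const line = formatLogLine(level, tag, msg, fields);
  // ERROR goes to stderr; INFO and WARN go to stdout.
  if (level === "ERROR") {
    console.error(line);
  } else {
    console.log(line);
  }
}
```

Keeping the formatting in a pure function that accepts the timestamp as a parameter makes the output easy to unit-test without mocking the clock.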
+ +| Tag | Source | Meaning | +|-----|--------|---------| +| `startup` | `index.ts` | Config loaded, DB initialized, server listening | +| `shutdown` | `index.ts` | `SIGTERM`/`SIGINT` received; graceful shutdown progress | +| `http` | `index.ts` | One line per request: `method`, `path`, `status`, `ms` | +| `scheduler` | `scheduler.ts` | Cron job registration summary on boot | +| `scheduler.tier1` … `scheduler.tier5` | `scheduler.ts` | Each tier's tick; includes skip-due-to-latch warnings | +| `scheduler.startup` | `scheduler.ts` | Result of the "database empty" startup scrape | +| `today` / `month` | `scraper.ts` | Per-park / per-month scrape results | +| `wait-sampler` | `wait-sampler.ts` | Tier-5 per-park sample writes, errors, weather-delay skips | +| `rate-limit` | `middleware/rate-limit.ts` | `blocked` event with `ip`, `count`, `retryAfter` | +| `rides` | `routes/rides.ts` | Per-request warnings when upstream calls fail | +| `rate-limited` | `lib/scrapers/sixflags.ts` | HTTP 429/503 from Six Flags with backoff timing | **Example log output:** ``` -[backend] database initialized -[scheduler] cron jobs registered - tier-1: today — hourly (Mar-Dec) - tier-2: current month — every 6h - tier-3: upcoming — 3 AM + 3 PM - tier-4: full year — 3 AM daily -[backend] listening on http://localhost:3001 -[scheduler] tier-1: scraping today @ 2026-04-23T14:00:00.000Z -[today] Great Adventure: updated (open 10am - 6pm) -[today] Cedar Point: updated (open 10am - 8pm) -[today] done: 24 fetched, 3 updated, 0 skipped, 0 errors +2026-04-23T14:00:00.012Z [INFO] [startup] config loaded port=3001 nodeEnv=production parkHoursStalenessHours=72 rateLimitPerMin=60 +2026-04-23T14:00:00.034Z [INFO] [startup] database initialized +2026-04-23T14:00:00.041Z [INFO] [scheduler] cron jobs registered tiers="tier1=hourly(Mar-Dec) tier2=6h tier3=3am+3pm tier4=3am-daily tier5=5min" +2026-04-23T14:00:00.042Z [INFO] [scheduler] skipping startup scrape — relying on cron existingRows=8742 
+2026-04-23T14:00:00.045Z [INFO] [startup] listening url=http://localhost:3001 +2026-04-23T14:00:00.123Z [INFO] [http] GET /api/calendar/week status=200 ms=18 +2026-04-23T14:00:10.001Z [INFO] [scheduler.tier1] scraping today +2026-04-23T14:05:00.001Z [INFO] [scheduler.tier5] sample run complete parksSampled=14 parksSkipped=10 samplesWritten=612 weatherDelayed=0 errors=0 ``` ---