docs: sync README and docs/ with current codebase

Surfaces features that landed after the last big docs pass: per-ride
history pages, Fast Lane wait times, outage shading on the today chart,
Tier-5 wait-time sampler, production-hardening pieces (rate limiter,
structured logger, env validation, graceful shutdown), and the new
rides + ride_wait_samples tables. Also corrects the weather-delay rule
to match the "open" vs "closing" gate now in rides.ts.
2026-06-02 15:31:50 -04:00
parent 2e9cec0b56
commit f87462385c
5 changed files with 397 additions and 72 deletions
+18 -2
@@ -52,6 +52,20 @@ The park detail page shows ride open/closed status using a two-tier approach:
2. **Schedule fallback (Six Flags API)** — when Queue-Times data is unavailable, the app falls back to the nearest upcoming date from the Six Flags schedule API as an approximation.
### Fast Lane wait times
A second wait number is fetched from Six Flags' `/wait-times/park/{apiId}` endpoint and joined onto each ride by name. The park page has a **Fast Lane** toggle (persisted in `localStorage.fastLaneMode`) that swaps the displayed wait between regular and Fast Lane. On the today chart, Fast Lane appears as a second line.
### Per-ride history
Click any ride name on a park page to open `/park/[id]/ride/[slug]` — a detail page with three tabs:
- **Today** — 5-minute wait-time samples (regular + Fast Lane) with outage markers
- **7 days** — daily average / max wait and uptime percentage
- **30 days** — same aggregates over a longer window
Samples are stored in the `ride_wait_samples` table by a Tier-5 cron job that runs every 5 minutes for parks currently within their operating window. Contiguous "ride closed during park hours" runs are shaded on the today chart with a `#N — Hh Mm` label.
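The outage shading can be pictured as a single pass over the day's samples that groups consecutive closed samples into runs. The following is a minimal sketch under stated assumptions, not the project's `computeOutages()`: the `WaitSample` and `OutageRun` shapes and the `findOutageRuns` name are hypothetical, and a run is measured from its first closed sample to its last closed sample.

```typescript
// Hypothetical shapes for illustration only.
interface WaitSample {
  recordedAt: string; // ISO 8601 UTC timestamp
  isOpen: boolean;
}

interface OutageRun {
  index: number;     // 1-based outage number within the day
  startedAt: string; // first closed sample in the run
  endedAt: string;   // last closed sample in the run
  minutes: number;   // span between first and last closed sample
}

// Groups consecutive closed samples into runs. Assumes `samples` is sorted
// by recordedAt and only contains samples taken during park hours.
function findOutageRuns(samples: WaitSample[]): OutageRun[] {
  const runs: OutageRun[] = [];
  let start: string | null = null;
  let end: string | null = null;

  const close = (from: string, to: string): void => {
    const minutes = Math.round((Date.parse(to) - Date.parse(from)) / 60_000);
    runs.push({ index: runs.length + 1, startedAt: from, endedAt: to, minutes });
  };

  for (const sample of samples) {
    if (!sample.isOpen) {
      if (start === null) start = sample.recordedAt;
      end = sample.recordedAt;
    } else if (start !== null && end !== null) {
      close(start, end);
      start = null;
      end = null;
    }
  }
  // A run that is still open at the end of the series is closed here.
  if (start !== null && end !== null) close(start, end);
  return runs;
}
```

Turning each run's `index` and `minutes` into the chart label is left out, since the exact label formatting is not specified here.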
### Roller Coaster Filter
When live data is shown, a **Coasters only** toggle filters to roller coasters. Coaster lists are hardcoded in `lib/coaster-data.ts`.
@@ -66,8 +80,9 @@ The backend runs a tiered scraping schedule via node-cron:
| 2 | Every 6 hours | Current month for all parks |
| 3 | Twice daily (3 AM, 3 PM) | Current + next month |
| 4 | Daily at 3 AM | Full year (respects 72h staleness window) |
| 5 | Every 5 minutes | Wait-time samples for all currently-open parks (writes `ride_wait_samples`) |
Past dates are never overwritten. The hourly tier compares live data against the database before writing — unchanged data is skipped. Each tier has its own concurrency latch — if a tick is still running when the next would fire, the new tick is skipped and logged rather than stacked.
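A per-tier latch of this kind can be as small as a boolean guarding an async job. The sketch below is illustrative only: `withLatch` and its parameters are hypothetical names, not the scheduler's actual API, and the cron registration itself is omitted.

```typescript
// Hypothetical helper: wraps an async job so that a tick arriving while the
// previous one is still running is skipped (and logged) instead of stacked.
function withLatch(
  name: string,
  job: () => Promise<void>,
  log: (message: string) => void,
): () => Promise<boolean> {
  let running = false;

  return async (): Promise<boolean> => {
    if (running) {
      log(`[${name}] previous tick still running, skipping`);
      return false; // tick skipped
    }
    running = true;
    try {
      await job();
      return true; // tick ran to completion
    } finally {
      running = false; // always release, even if the job throws
    }
  };
}
```

Each tier would get its own wrapped function, so a slow full-year scrape cannot block the 5-minute sampler and vice versa.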
A manual trigger is available via the backend API:
@@ -152,7 +167,7 @@ See [`.env.example`](.env.example) for the full list and defaults.
| `PORT` | `3001` | Port the Hono server listens on. |
| `TZ` | `UTC` | Timezone for cron schedules (e.g. `America/New_York`). |
| `PARK_HOURS_STALENESS_HOURS` | `72` | Hours before park schedule data is re-fetched. |
| `RATE_LIMIT_PER_MIN` | `60` | Per-IP request limit for the public API, per minute. Enforced by `backend/src/middleware/rate-limit.ts`; over-limit requests get a `429` with a `Retry-After` header. |
### Updating
@@ -167,6 +182,7 @@ docker compose pull && docker compose up -d
| `GET /api/calendar/week?start=YYYY-MM-DD` | Week calendar for all parks |
| `GET /api/calendar/:parkId/month?month=YYYY-MM` | Month calendar for one park |
| `GET /api/parks/:id/rides` | Live rides or schedule fallback |
| `GET /api/parks/:id/rides/:slug` | Per-ride detail + today/7d/30d wait-time history |
| `GET /api/parks` | Park list with metadata |
| `GET /api/status` | Health check, scrape timestamps, DB stats |
| `POST /api/scrape/trigger?scope=...` | Manual scrape trigger |
+151 -1
@@ -14,6 +14,20 @@
None. All endpoints are public and unauthenticated. The scrape trigger endpoint is also unprotected -- restrict access at the network/proxy level if needed.
## Rate limiting
Every endpoint is gated by a fixed-window per-IP counter (`backend/src/middleware/rate-limit.ts`).
| Property | Value |
|---------------|-------|
| Limit | `RATE_LIMIT_PER_MIN` env var, default `60` requests/minute |
| Window | 60 seconds, per client IP (resolved via `x-forwarded-for` → `x-real-ip` → socket address) |
| Over-limit response | `429 Too Many Requests` |
| Body | `{ "error": "Too many requests" }` |
| Response header | `Retry-After: <seconds>` — how long until the window resets |
Behind a reverse proxy, make sure `x-forwarded-for` is set or every request will appear to come from the proxy's own IP.
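To make the fixed-window behaviour concrete, here is a minimal sketch of a per-IP counter and the header fallback chain. It is not the contents of `rate-limit.ts`: `FixedWindowLimiter`, `resolveClientIp`, and the choice to take the first entry of `x-forwarded-for` are assumptions made for illustration.

```typescript
interface RateDecision {
  allowed: boolean;
  retryAfterSeconds: number; // 0 when the request is allowed
}

// Hypothetical fixed-window limiter: one counter per IP, reset when the
// window that started at the first request has fully elapsed.
class FixedWindowLimiter {
  private readonly windows = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  check(ip: string, nowMs: number): RateDecision {
    const entry = this.windows.get(ip);
    if (!entry || nowMs - entry.windowStart >= this.windowMs) {
      this.windows.set(ip, { windowStart: nowMs, count: 1 });
      return { allowed: true, retryAfterSeconds: 0 };
    }
    if (entry.count < this.limit) {
      entry.count += 1;
      return { allowed: true, retryAfterSeconds: 0 };
    }
    // Over the limit: report how long until this window resets.
    const remainingMs = entry.windowStart + this.windowMs - nowMs;
    return { allowed: false, retryAfterSeconds: Math.ceil(remainingMs / 1000) };
  }
}

// Fallback chain: x-forwarded-for, then x-real-ip, then the socket address.
// Assumption: the first x-forwarded-for entry is treated as the client.
function resolveClientIp(
  headers: Record<string, string | undefined>,
  socketAddress: string,
): string {
  const first = headers["x-forwarded-for"]?.split(",")[0]?.trim();
  if (first) return first;
  return headers["x-real-ip"] ?? socketAddress;
}
```

A denied decision maps onto the `429` response above, with `retryAfterSeconds` used as the `Retry-After` value.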
---
## Endpoints
@@ -226,15 +240,21 @@ Returns live ride status or schedule fallback for a park.
  "rides": [
    {
      "name": "Steel Vengeance",
      "slug": "steel-vengeance",
      "isOpen": true,
      "waitMinutes": 45,
      "fastLaneMinutes": 10,
      "hasFastLane": true,
      "lastUpdated": "2026-04-23T18:30:00.000Z",
      "isCoaster": true
    },
    {
      "name": "Millennium Force",
      "slug": "millennium-force",
      "isOpen": false,
      "waitMinutes": 0,
      "fastLaneMinutes": null,
      "hasFastLane": true,
      "lastUpdated": "2026-04-23T18:30:00.000Z",
      "isCoaster": true
    }
@@ -245,6 +265,8 @@ Returns live ride status or schedule fallback for a park.
}
```
Each ride is enriched from two sources: Queue-Times.com supplies `isOpen` and the base `waitMinutes`, then Six Flags' wait-times feed is joined by name to fill in `fastLaneMinutes` and `hasFastLane`. When both sources have a regular wait for the same ride, the Six Flags value wins (Queue-Times lags around park open). `fastLaneMinutes` is `null` when the ride is closed or has no Fast Lane line. `slug` is the URL-safe identifier used by `/api/parks/:id/rides/:slug`.
**Response fields:**
| Field | Type | Description |
@@ -270,6 +292,101 @@ Returns live ride status or schedule fallback for a park.
---
### GET /api/parks/:parkId/rides/:slug
Returns metadata + history for a single ride: today's 5-minute wait samples and daily aggregates over the last 7 and 30 calendar days. Everything ships in one round-trip — the frontend renders the Today / 7d / 30d tabs from this single payload.
**Path Parameters:**
| Param | Description |
|-------|-------------|
| `parkId` | Park identifier (e.g. `cedarpoint`) |
| `slug` | Ride slug, as returned in `LiveRide.slug` or stored in the `rides.slug` column |
**Cache:** `Cache-Control: public, max-age=60, stale-while-revalidate=120`
**Response:**
```json
{
"park": {
"id": "cedarpoint",
"name": "Cedar Point",
"shortName": "Cedar Point",
"timezone": "America/New_York"
},
"ride": {
"qtRideId": 257,
"slug": "steel-vengeance",
"name": "Steel Vengeance",
"isCoaster": true,
"hasFastLane": true,
"firstSeen": "2026-03-15T14:05:00.000Z",
"lastSeen": "2026-04-23T18:35:00.000Z"
},
"live": {
"isOpen": true,
"waitMinutes": 45,
"hasFastLane": true,
"fastLaneMinutes": 10,
"lastUpdated": "2026-04-23T18:30:00.000Z"
},
"todayLocal": "2026-04-23",
"today": [
{
"recordedAt": "2026-04-23T14:05:12.000Z",
"localTime": "10:05",
"isOpen": true,
"waitMinutes": 15,
"fastLaneMinutes": 5
}
],
"last7d": [
{
"localDate": "2026-04-17",
"avgWait": 38.4,
"maxWait": 90,
"avgFastLane": 9.1,
"maxFastLane": 25,
"uptimePct": 0.94,
"sampleCount": 132
}
],
"last30d": [],
"coverage": {
"daysWith7d": 6,
"daysWith30d": 23,
"todaySampleCount": 1
}
}
```
**Response fields:**
| Field | Type | Description |
|-------|------|-------------|
| `park` | `{ id, name, shortName, timezone }` | Park identity (timezone is the IANA tz used for sample bucketing) |
| `ride` | `RideRecord` | Canonical row from the `rides` table |
| `live` | `LiveRideSummary \| null` | Best-effort current state pulled from the shared in-memory cache. No upstream fetch — populated by the rides route and Tier-5 sampler. `null` if no recent observation exists. |
| `todayLocal` | `string` | Today's date in the park's timezone |
| `today` | `DailySample[]` | Per-sample series for `todayLocal`, ordered by `recordedAt` |
| `last7d` | `DailyAggregate[]` | One row per `local_date` over the last 7 calendar days (inclusive of today) |
| `last30d` | `DailyAggregate[]` | Same aggregates over 30 days |
| `coverage.daysWith7d` | `number` | Distinct dates with samples in the 7-day window — use to gate the 7d tab |
| `coverage.daysWith30d` | `number` | Distinct dates with samples in the 30-day window |
| `coverage.todaySampleCount` | `number` | Number of samples already collected today |
`DailySample` and `DailyAggregate` shapes are listed under [Data Types](#data-types).
**Errors:**
| Status | Body | Condition |
|--------|------|-----------|
| 404 | `{ "error": "Park not found" }` | Unknown park ID |
| 404 | `{ "error": "Ride not found or no history yet" }` | Slug doesn't match any row in `rides` for this park (Tier-5 hasn't seen the ride yet, or the slug is wrong) |
---
### GET /api/status
Health check endpoint with database statistics.
@@ -395,8 +512,11 @@ A single ride from the Queue-Times.com API.
```typescript
interface LiveRide {
  name: string; // Ride display name
  slug: string; // URL-safe slug for /api/parks/:id/rides/:slug
  isOpen: boolean; // Currently operating
  waitMinutes: number; // Current regular wait (0 if closed)
  fastLaneMinutes?: number | null; // Fast Lane wait (null when closed or no Fast Lane line)
  hasFastLane?: boolean; // Ride has a Fast Lane offering per Six Flags
  lastUpdated: string; // ISO 8601 timestamp from Queue-Times
  isCoaster: boolean; // Classified as a roller coaster via RCDB data
}
@@ -432,6 +552,36 @@ interface RideStatus {
}
```
### DailySample
A single wait-time observation recorded by the Tier-5 sampler.
```typescript
interface DailySample {
recordedAt: string; // ISO 8601 UTC timestamp
localTime: string; // HH:MM in the park's timezone
isOpen: boolean; // Ride open at this sample
waitMinutes: number | null; // Regular wait, null when unobserved
fastLaneMinutes: number | null; // Fast Lane wait, null when no Fast Lane or unobserved
}
```
### DailyAggregate
Per-day statistics computed in SQL from `ride_wait_samples`. Only open samples contribute to wait averages.
```typescript
interface DailyAggregate {
localDate: string; // YYYY-MM-DD in the park's timezone
avgWait: number | null; // Mean wait_minutes across open samples
maxWait: number | null; // Highest wait_minutes across open samples
avgFastLane: number | null; // Mean fast_lane_minutes across open samples
maxFastLane: number | null; // Highest fast_lane_minutes across open samples
uptimePct: number; // Fraction of samples with is_open=1 (0..1)
sampleCount: number; // Total samples for the day
}
```
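The aggregation rules are easier to check against a concrete reduction. The source computes these values in SQL; the sketch below re-expresses the same rules in TypeScript purely for illustration. `SampleRow`, `DayStats`, and `aggregateByDay` are hypothetical names, and ignoring `null` waits mirrors how SQL `AVG`/`MAX` skip `NULL`.

```typescript
interface SampleRow {
  localDate: string; // YYYY-MM-DD in the park's timezone
  isOpen: boolean;
  waitMinutes: number | null;
  fastLaneMinutes: number | null;
}

interface DayStats {
  localDate: string;
  avgWait: number | null;
  maxWait: number | null;
  avgFastLane: number | null;
  maxFastLane: number | null;
  uptimePct: number; // fraction of samples that were open, 0..1
  sampleCount: number;
}

const meanOrNull = (values: number[]): number | null =>
  values.length === 0 ? null : values.reduce((a, b) => a + b, 0) / values.length;

const maxOrNull = (values: number[]): number | null =>
  values.length === 0 ? null : Math.max(...values);

const dropNulls = (values: Array<number | null>): number[] =>
  values.filter((v): v is number => v !== null);

// Groups samples by local date; only open samples feed the wait statistics,
// while every sample counts toward uptimePct and sampleCount.
function aggregateByDay(rows: SampleRow[]): DayStats[] {
  const byDate = new Map<string, SampleRow[]>();
  for (const row of rows) {
    const bucket = byDate.get(row.localDate) ?? [];
    bucket.push(row);
    byDate.set(row.localDate, bucket);
  }
  return Array.from(byDate.keys()).sort().map((localDate) => {
    const day = byDate.get(localDate) ?? [];
    const open = day.filter((r) => r.isOpen);
    const waits = dropNulls(open.map((r) => r.waitMinutes));
    const fastLane = dropNulls(open.map((r) => r.fastLaneMinutes));
    return {
      localDate,
      avgWait: meanOrNull(waits),
      maxWait: maxOrNull(waits),
      avgFastLane: meanOrNull(fastLane),
      maxFastLane: maxOrNull(fastLane),
      uptimePct: open.length / day.length,
      sampleCount: day.length,
    };
  });
}
```

A day with no open samples yields `null` for all four wait fields and an `uptimePct` of `0`, which matches the nullable types in `DailyAggregate`.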
### ScrapeResult
Result of a scraping operation.
+119 -21
@@ -69,8 +69,12 @@ The web container reaches the backend via Docker internal networking (`http://ba
├── app/ # Next.js App Router
│   ├── page.tsx # Home page (week calendar, server component)
│   ├── park/[id]/page.tsx # Park detail page (month calendar + rides)
│   ├── park/[id]/error.tsx # Per-route error boundary
│   ├── park/[id]/ride/[slug]/page.tsx # Ride detail + history page
│   ├── layout.tsx # Root layout with metadata
│   ├── loading.tsx # Skeleton UI for streaming/suspense
│   ├── error.tsx # Top-level error boundary (client)
│   ├── not-found.tsx # 404 page
│   └── globals.css # Tailwind v4 theme + custom CSS variables
├── components/ # React components
@@ -79,11 +83,15 @@ The web container reaches the backend via Docker internal networking (`http://ba
│   ├── MobileCardList.tsx # Mobile card layout
│   ├── ParkCard.tsx # Individual park card
│   ├── ParkMonthCalendar.tsx # Month grid for park detail page
│   ├── LiveRidePanel.tsx # Live ride status with wait times + Fast Lane toggle (client)
│   ├── WeekNav.tsx # Week navigation arrows (client)
│   ├── Legend.tsx # Status color legend
│   ├── EmptyState.tsx # Shown when no data is scraped
│   ├── BackToCalendarLink.tsx # Navigation helper (client)
│   └── charts/ # Recharts-based charts (client components)
│       ├── WaitTimeTodayChart.tsx # Today's 5-min samples with outage shading
│       ├── WeeklyStatsChart.tsx # 7d / 30d daily aggregates
│       └── UptimePill.tsx # Compact uptime % indicator
├── lib/ # Shared code (imported by both frontend and backend)
│   ├── types.ts # Core DayData interface
@@ -92,32 +100,46 @@ The web container reaches the backend via Docker internal networking (`http://ba
│   ├── coaster-data.ts # Static RCDB coaster name sets per park
│   ├── coaster-match.ts # Fuzzy name matching (normalize, prefix, compact)
│   ├── queue-times-map.ts # Park ID -> Queue-Times.com park ID mapping
│   ├── api.ts # apiFetch() helper (revalidate vs. no-store option)
│   ├── outage.ts # computeOutages() — contiguous-closed-run detection
│   ├── ride-slug.ts # slugifyRideName() — URL slug for ride pages
│   ├── timezone.ts # formatLocalDate / formatLocalTime in a park's tz
│   └── scrapers/
│       ├── sixflags.ts # Six Flags CloudFront operating-hours client
│       ├── sixflags-waittimes.ts # Six Flags Fast Lane wait-times client
│       ├── queuetimes.ts # Queue-Times.com API client
│       ├── log.ts # Shared scraper logger
│       └── types.ts # Park, DayStatus, MonthCalendar, ScraperAdapter interfaces
├── backend/ # Hono API server (separate package.json)
│   ├── src/
│   │   ├── index.ts # Entry point: middleware, routes, DB init, scheduler start, graceful shutdown
│   │   ├── config.ts # Env-validated config object (fails fast on bad input)
│   │   ├── log.ts # Structured logger (`[ISO] [LEVEL] [tag] msg key=value`)
│   │   ├── db/
│   │   │   ├── index.ts # SQLite connection, schema for park_days / rides / ride_wait_samples, WAL mode
│   │   │   └── queries.ts # All SQL queries (upsert, date range, staleness, samples, aggregates)
│   │   ├── middleware/
│   │   │   └── rate-limit.ts # Fixed-window per-IP limiter (honours x-forwarded-for)
│   │   ├── routes/
│   │   │   ├── calendar.ts # /api/calendar/* -- week and month data with live merging
│   │   │   ├── parks.ts # /api/parks/* -- park metadata
│   │   │   ├── rides.ts # /api/parks/:id/rides -- live rides + Fast Lane + schedule fallback
│   │   │   ├── ride-history.ts # /api/parks/:id/rides/:slug -- ride detail + today/7d/30d history
│   │   │   ├── status.ts # /api/status -- health check
│   │   │   └── scrape.ts # /api/scrape/trigger -- manual scrape
│   │   └── services/
│   │       ├── scheduler.ts # Five-tier cron jobs with per-tier concurrency latches
│   │       ├── scraper.ts # Scraping orchestration (today, month, full year)
│   │       ├── wait-sampler.ts # Tier-5: 5-min wait-time sampling into ride_wait_samples
│   │       ├── live-cache.ts # Shared TtlCaches (liveRidesCache, fastLaneCache, todayCache)
│   │       └── cache.ts # Generic TtlCache<T> class
│   ├── tests/ # Backend Node test runner suite
│   ├── data/ # SQLite database (parks.db, auto-created)
│   ├── package.json # Backend dependencies
│   └── tsconfig.json # Backend TypeScript config (CommonJS, rootDir: ..)
├── tests/ # Frontend unit tests (Node built-in test runner)
├── scripts/ # Debug utility
├── public/ # Static assets
├── Dockerfile # Multi-stage build (web + backend targets)
@@ -212,9 +234,9 @@ When the requested week includes today, the `/api/calendar/week` route enhances
2. **Live ride counts** -- For each park that is currently within its operating window (determined by `isWithinOperatingWindow()`), fetches live ride data from Queue-Times.com via `fetchLiveRides()`. Counts open rides and open coasters. Results cached in `ridesCache` (5-min TTL).
3. **Status detection:**
   - **Open**: Within the scheduled open-to-close window. `getOperatingStatus()` returns `"open"`.
   - **Closing**: Current time is past the scheduled close but within a 1-hour wind-down buffer. `getOperatingStatus()` returns `"closing"`.
   - **Weather delay**: `getOperatingStatus()` is `"open"` _and_ every reported ride has `isOpen: false`. Indicated with a blue badge. The badge is intentionally suppressed during the `"closing"` wind-down — all-rides-closed near close is normal end-of-day behavior, not weather. Logic lives at [backend/src/routes/rides.ts:96-100](../backend/src/routes/rides.ts) and [backend/src/routes/calendar.ts](../backend/src/routes/calendar.ts).
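The three status rules reduce to two small predicates. This is a sketch under assumptions rather than the code in `rides.ts`: times are modelled as minutes since local midnight, the `"closed"` value and the function names are hypothetical, a schedule that closes after midnight is not handled, and an empty ride list is treated as "no delay".

```typescript
type OperatingStatus = "open" | "closing" | "closed";

// Hypothetical stand-in for getOperatingStatus(): all arguments are minutes
// since local midnight; the wind-down buffer defaults to one hour.
function operatingStatusAt(
  nowMin: number,
  openMin: number,
  closeMin: number,
  bufferMin: number = 60,
): OperatingStatus {
  if (nowMin >= openMin && nowMin < closeMin) return "open";
  if (nowMin >= closeMin && nowMin < closeMin + bufferMin) return "closing";
  return "closed";
}

// Weather delay only applies while the park is "open"; during "closing",
// all rides being closed is treated as normal end-of-day behavior.
function isWeatherDelay(
  status: OperatingStatus,
  rides: Array<{ isOpen: boolean }>,
): boolean {
  return status === "open" && rides.length > 0 && rides.every((r) => !r.isOpen);
}
```

For a park scheduled 10:00 to 20:00, a 20:10 check with every ride closed returns `"closing"` and no weather-delay badge, which is the case the corrected rule is meant to cover.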
The 3 AM switchover in `getTodayLocal()` prevents the calendar from flipping to the next day at midnight -- before 3 AM local time, the system still considers it "yesterday", since park visitors may still be out.
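One way to get this behaviour is to shift the instant back by three hours before formatting it in the park's timezone. The sketch below is an assumption about how such a helper could look, not the body of `getTodayLocal()`; `todayLocalWithSwitchover` is a hypothetical name, and the fixed three-hour shift can be off by an hour on the two nights per year when DST changes.

```typescript
// Returns YYYY-MM-DD for "today" in the given IANA timezone, treating the
// hours before `switchoverHour` as still belonging to the previous day.
function todayLocalWithSwitchover(
  now: Date,
  timeZone: string,
  switchoverHour: number = 3,
): string {
  // Shifting the instant back makes 00:00-02:59 local fall on yesterday.
  const shifted = new Date(now.getTime() - switchoverHour * 3_600_000);
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(shifted);
  const pick = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";
  return `${pick("year")}-${pick("month")}-${pick("day")}`;
}
```

Assembling the date from `formatToParts` avoids depending on any one locale's date layout.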
@@ -228,16 +250,19 @@ The system uses three layers of caching, each serving a different purpose:
Layer 1: Next.js ISR               Layer 2: Backend In-Memory         Layer 3: Database Staleness
(serves stale while revalidating)  (prevents redundant API calls)     (controls scrape frequency)
┌───────────────────────────────┐  ┌───────────────────────────────┐  ┌───────────────────────────────┐
│ Cache-Control response headers│  │ TtlCache<T> (5 min TTL)       │  │ isMonthScraped() query        │
│ + Next.js fetch revalidate    │  │                               │  │ MAX(scraped_at) vs staleness  │
│                               │  │ todayCache: routes/calendar   │  │ threshold (default 72h)       │
│ week:  120s / 300s SWR        │  │   (live park hours per park)  │  │                               │
│ month: 300s / 600s SWR        │  │ liveRidesCache, fastLaneCache:│  │ Past months auto-skipped      │
│ rides: 60s  / 120s SWR        │  │   services/live-cache.ts —    │  │ "force" scope bypasses check  │
│ parks: 3600s                  │  │   shared by rides routes +    │  │                               │
│                               │  │   the Tier-5 sampler          │  │                               │
└───────────────────────────────┘  └───────────────────────────────┘  └───────────────────────────────┘
```
**Live-page Data Cache bypass.** The park detail page (`app/park/[id]/page.tsx`) and ride detail page (`app/park/[id]/ride/[slug]/page.tsx`) fetch their live ride data with `cache: "no-store"` via [`apiFetch`](../lib/api.ts) (`{ noStore: true }`). Earlier revisions used Next.js ISR for these too, but the Data Cache served stale ride state after idle periods — navigation back to a park would show ride statuses from hours ago. Backend HTTP cache headers still allow the upstream Hono server to return cached responses for 60s, so this is a "skip the Next.js Data Cache" change, not a "skip all caching" change. The home calendar page keeps its ISR revalidation since its data is intrinsically slower-moving.
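The difference between the two fetch modes comes down to which options are passed to Next.js's extended `fetch`. The sketch below shows only that option-building step and is not the real `apiFetch`: `buildFetchInit`, `ApiFetchOptions`, and the 120-second default are assumptions, while `cache: "no-store"` and `next: { revalidate }` are the standard Next.js fetch options.

```typescript
// Hypothetical option shape; the real helper's signature may differ.
interface ApiFetchOptions {
  noStore?: boolean;          // bypass the Next.js Data Cache entirely
  revalidateSeconds?: number; // ISR revalidation window when caching is on
}

interface NextFetchInit {
  cache?: "no-store";
  next?: { revalidate: number };
}

// Live pages pass { noStore: true }; slower-moving pages keep revalidation.
function buildFetchInit(options: ApiFetchOptions = {}): NextFetchInit {
  if (options.noStore) {
    return { cache: "no-store" };
  }
  // Assumed default of 120s, for illustration only.
  return { next: { revalidate: options.revalidateSeconds ?? 120 } };
}
```

The returned object would be passed as the second argument to `fetch()` in a server component; the backend's own `Cache-Control` headers are unaffected by either mode.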
**Per-route HTTP cache headers:**
| Endpoint | `max-age` | `stale-while-revalidate` |
@@ -247,6 +272,7 @@ The system uses three layers of caching, each serving a different purpose:
| `/api/parks` | 3600s | -- |
| `/api/parks/:id` | 3600s | -- |
| `/api/parks/:id/rides` | 60s | 120s |
| `/api/parks/:id/rides/:slug` | 60s | 120s |
---
@@ -254,7 +280,10 @@ The system uses three layers of caching, each serving a different purpose:
### Schema
The database has three tables: `park_days` (calendar hours), `rides` (per-ride metadata), and `ride_wait_samples` (time-series wait data).
```sql
-- Park operating hours, keyed by park and date.
CREATE TABLE IF NOT EXISTS park_days (
  park_id TEXT NOT NULL, -- matches Park.id from lib/parks.ts (e.g. "cedarpoint")
  date TEXT NOT NULL, -- ISO date: YYYY-MM-DD
@@ -264,11 +293,42 @@ CREATE TABLE IF NOT EXISTS park_days (
  scraped_at TEXT NOT NULL, -- ISO timestamp of when this row was written
  PRIMARY KEY (park_id, date)
);
-- Per-ride canonical record. PK is (park_id, qt_ride_id) so ride renames
-- don't fragment history — the slug just provides pretty URLs.
CREATE TABLE IF NOT EXISTS rides (
park_id TEXT NOT NULL,
qt_ride_id INTEGER NOT NULL, -- Queue-Times ride ID (stable upstream)
slug TEXT NOT NULL, -- URL slug (rebuilt if name changes)
name TEXT NOT NULL, -- Display name as last seen
is_coaster INTEGER NOT NULL DEFAULT 0,
has_fast_lane INTEGER NOT NULL DEFAULT 0,
first_seen TEXT NOT NULL,
last_seen TEXT NOT NULL,
PRIMARY KEY (park_id, qt_ride_id)
);
CREATE UNIQUE INDEX IF NOT EXISTS idx_rides_slug ON rides (park_id, slug);
-- Time-series wait samples written by Tier-5 every 5 minutes for currently
-- open parks. `recorded_at` is UTC; `local_date` / `local_time` are bucketed
-- in the park's IANA timezone at insert time so reads are pure SQL and DST-safe.
CREATE TABLE IF NOT EXISTS ride_wait_samples (
park_id TEXT NOT NULL,
qt_ride_id INTEGER NOT NULL,
recorded_at TEXT NOT NULL, -- ISO UTC
local_date TEXT NOT NULL, -- YYYY-MM-DD in park tz
local_time TEXT NOT NULL, -- HH:MM in park tz
is_open INTEGER NOT NULL,
wait_minutes INTEGER, -- Regular line wait
fast_lane_minutes INTEGER, -- Six Flags Fast Lane wait, if known
PRIMARY KEY (park_id, qt_ride_id, recorded_at)
);
```
- **Composite primary keys** ensure one row per logical unit (per park-day, per ride, per sample) and support efficient queries without secondary indexes. `idx_rides_slug` lets the ride detail route resolve a `slug` to a `qt_ride_id` in one lookup.
- **WAL mode** (`PRAGMA journal_mode = WAL`) enables concurrent reads while the scraper writes.
- **Migration strategy**: New columns are added via `ALTER TABLE ... ADD COLUMN` wrapped in try/catch. If the column already exists, the error is silently caught. This allows the schema to evolve without a migration framework.
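- **Sample volume**: Tier-5 writes one row per ride every 5 minutes while a park is within its operating window; rides that are closed at sample time are stored with `is_open = 0`, which is what the uptime figure is computed from. A park with 50 rides operating for 10 hours generates ~6,000 sample rows/day (50 rides × 12 samples/hour × 10 hours). `INSERT OR IGNORE` on the PK makes the sampler idempotent across retries.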
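Bucketing at insert time means each sample carries its park-local date and time alongside the UTC timestamp. A minimal sketch of that conversion using the built-in `Intl` API follows; `bucketInParkTimezone` is a hypothetical name and is not claimed to match `formatLocalDate` / `formatLocalTime` in `lib/timezone.ts`.

```typescript
interface LocalBucket {
  localDate: string; // YYYY-MM-DD in the park's timezone
  localTime: string; // HH:MM (24-hour) in the park's timezone
}

// Converts a UTC ISO timestamp into the park-local date and time that would
// be stored next to it. DST is handled by the IANA timezone database.
function bucketInParkTimezone(recordedAt: string, timeZone: string): LocalBucket {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hourCycle: "h23", // 00-23, so midnight is "00" rather than "24"
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
  }).formatToParts(new Date(recordedAt));
  const pick = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";
  return {
    localDate: `${pick("year")}-${pick("month")}-${pick("day")}`,
    localTime: `${pick("hour")}:${pick("minute")}`,
  };
}
```

Because the local values are stored, reads such as "all samples for one local date" need no timezone math in SQL.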
### Key Queries
@@ -278,14 +338,24 @@ CREATE TABLE IF NOT EXISTS park_days (
| `getDateRange(start, end)` | Returns all parks' data for a date range. Powers the week calendar. |
| `getParkMonthData(parkId, year, month)` | Returns one park's data for a month. Uses `LIKE` prefix matching on date. |
| `getDayData(parkId, date)` | Returns a single day for comparison during `scrapeToday()`. |
| `getParkDayCount()` | Total rows in `park_days`. Drives the startup-scrape-when-empty check. |
| `isMonthScraped(parkId, year, month, staleAfterMs)` | Checks if `MAX(scraped_at)` for a park-month is within the staleness threshold. Past months always return `true` (never re-scraped). |
| `upsertRide()` | Insert or update a row in `rides`; bumps `last_seen` on every observation. |
| `getRideBySlug(parkId, slug)` | Resolves a URL slug back to a canonical ride record via `idx_rides_slug`. |
| `insertSample()` | `INSERT OR IGNORE` a sample into `ride_wait_samples` — idempotent on retries. |
| `getRideSamplesForDay()` | Returns all samples for one ride on one local date (powers the Today chart). |
| `getRideDailyAggregates()` | Per-day avg/max wait, avg/max Fast Lane, uptime %, and sample count over a window (powers the 7d / 30d charts). |
| `countRideDays()` | Number of distinct `local_date` values for a ride in a window — used to decide whether 7d/30d tabs have enough data to render. |
| `transact(fn)` | Wraps a function in a SQLite transaction for atomicity. |
### Storage
- **Location**: `backend/data/parks.db` (or `/app/backend/data/parks.db` in Docker)
- **WAL journal files**: `parks.db-wal` and `parks.db-shm` accompany the main database
- **Size**:
  - `park_days`: ~8,000-9,000 rows for a full year of 24 parks
  - `rides`: ~1,000-1,500 rows total (a few dozen per park)
  - `ride_wait_samples`: grows daily during operating season; expect tens of thousands of rows per active day. Historical samples are retained — no automatic pruning is configured.
- **Not committed to git**: Listed in `.gitignore`
- **Auto-created**: The database and `data/` directory are created on first backend startup
@@ -335,6 +405,21 @@ interface ApiDay {
- Handles buyouts: if `isBuyout` is true and it's not a passholder preview, the park is considered closed
- Returns `{ date, isOpen, hoursLabel, specialType }`
### Six Flags Wait-Times API
Powers the Fast Lane wait number shown alongside the regular wait. Used by `lib/scrapers/sixflags-waittimes.ts` (`fetchFastLaneWaits`, `lookupFastLane`) and joined onto the Queue-Times rides by fuzzy name match.
| Property | Value |
|----------|-------|
| URL | `https://d18car1k0ff81h.cloudfront.net/wait-times/park/{apiId}` (sibling of the operating-hours endpoint) |
| Auth | None (spoofed browser headers, same as the operating-hours client) |
| Timeout | 10 seconds |
| Per-ride fields | `regularMinutes`, `fastLaneMinutes`, `hasFastLane` (`lookupFastLane()` return shape) |
| Error handling | Returns `null` on any failure; the route falls back to Queue-Times' regular wait |
| Backend cache | `fastLaneCache` (5-min TTL, in `services/live-cache.ts`) |
**Why two sources?** Queue-Times wait values lag at park open by ~10-15 minutes (parks haven't reported yet). The Six Flags wait-times feed updates earlier. When both sources have a wait for the same ride, the route prefers the Six Flags regular wait; Queue-Times remains the source of truth for `isOpen`. The Fast Lane number has no Queue-Times equivalent.
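
The precedence rule above can be sketched as a small merge function. This is an illustration under assumptions, not the code in `routes/rides.ts`: the `QueueTimesRide` and `MergedRide` shapes and the `mergeWaits` name are hypothetical, while the `regularMinutes`, `fastLaneMinutes`, and `hasFastLane` fields follow the `lookupFastLane()` return shape listed in the table.

```typescript
// Hypothetical shape for a ride coming from Queue-Times.
interface QueueTimesRide {
  name: string;
  isOpen: boolean;
  waitMinutes: number | null;
}

// Field names taken from the lookupFastLane() return shape described above.
interface FastLaneEntry {
  regularMinutes: number | null;
  fastLaneMinutes: number | null;
  hasFastLane: boolean;
}

// Hypothetical merged shape returned to the frontend.
interface MergedRide {
  name: string;
  isOpen: boolean;
  waitMinutes: number | null;
  fastLaneMinutes: number | null;
}

function mergeWaits(ride: QueueTimesRide, fastLane: FastLaneEntry | null): MergedRide {
  return {
    name: ride.name,
    // Queue-Times stays the source of truth for open/closed.
    isOpen: ride.isOpen,
    // Prefer the Six Flags regular wait when it exists, else Queue-Times.
    waitMinutes: fastLane?.regularMinutes ?? ride.waitMinutes,
    // Fast Lane has no Queue-Times equivalent, so it is null without a match.
    fastLaneMinutes:
      fastLane !== null && fastLane.hasFastLane ? fastLane.fastLaneMinutes : null,
  };
}
```

Passing `null` for the second argument models the "wait-times fetch failed" case, where the route falls back to the Queue-Times regular wait.
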

### Queue-Times.com API

Provides live ride open/closed status and wait times during park operating hours.
@@ -384,16 +469,20 @@ Rides are classified as roller coasters using static data from the Roller Coaste
|-----------|------|------|
| `app/page.tsx` | Server | Fetches week data from backend, passes to HomePageClient |
| `app/park/[id]/page.tsx` | Server | Fetches month + rides data in parallel |
| `app/park/[id]/ride/[slug]/page.tsx` | Server | Fetches ride detail + today/7d/30d history in one call |
| `HomePageClient` | **Client** | State management, auto-refresh, keyboard nav, localStorage |
| `WeekCalendar` | Server | Desktop 7-column table layout |
| `MobileCardList` | Server | Mobile card layout |
| `ParkCard` | Server | Individual park card for mobile |
| `ParkMonthCalendar` | Server | Month calendar grid |
| `LiveRidePanel` | **Client** | Live ride list with coaster filter + Fast Lane toggle |
| `WeekNav` | **Client** | Week navigation with arrow buttons |
| `Legend` | Server | Status color legend |
| `EmptyState` | Server | Empty database message |
| `BackToCalendarLink` | **Client** | "Back" link using localStorage for last week |
| `charts/WaitTimeTodayChart` | **Client** | Today's 5-min wait samples + outage shading (Recharts) |
| `charts/WeeklyStatsChart` | **Client** | 7d / 30d daily aggregates chart (Recharts) |
| `charts/UptimePill` | **Client** | Compact uptime % badge |

### Component Hierarchy
@@ -410,6 +499,12 @@ park/[id]/page.tsx (Server)
├── BackToCalendarLink (Client)
├── ParkMonthCalendar (Server)
└── LiveRidePanel (Client) ........... or RideList (Server, inline)

park/[id]/ride/[slug]/page.tsx (Server)
├── BackToCalendarLink (Client)
├── UptimePill (Client)
├── WaitTimeTodayChart (Client) ...... Today tab
└── WeeklyStatsChart (Client) ........ 7d / 30d tabs
```

### Client-Side Refresh
@@ -496,4 +591,7 @@ interface Park {
| Non-root containers | Both Docker images run as `nextjs` user (UID 1001) |
| Backend-owned data | Frontend never contacts external APIs or the database directly |
| CORS | Backend enables CORS middleware (currently unrestricted) |
| Per-IP rate limit | `RATE_LIMIT_PER_MIN` (default 60) — fixed-window per-IP counter in `backend/src/middleware/rate-limit.ts`. Honours `x-forwarded-for`/`x-real-ip` so a reverse proxy doesn't collapse every client to one bucket. Over-limit requests return `429` with a `Retry-After` header. |
| Env validation | `backend/src/config.ts` parses + validates env vars at startup; misconfiguration fails fast rather than surfacing in a request handler. |
| Graceful shutdown | Backend listens for `SIGTERM`/`SIGINT`, closes the HTTP server and SQLite handle before exiting (force-exit timeout as a safety net). |
| No secrets in frontend | `BACKEND_URL` is an internal Docker network address, not a secret |
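
To make the per-IP rate limit row concrete, here is a minimal fixed-window counter. It is a sketch under assumptions rather than the contents of `backend/src/middleware/rate-limit.ts`: the `FixedWindowLimiter` class, the `clientIp` helper, and the header precedence (`x-forwarded-for` first, then `x-real-ip`) are hypothetical, and the clock is passed in explicitly so the behaviour is easy to test.

```typescript
interface Bucket {
  windowStart: number; // ms timestamp when this client's window began
  count: number;
}

type Decision = { allowed: true } | { allowed: false; retryAfterSec: number };

class FixedWindowLimiter {
  private readonly buckets = new Map<string, Bucket>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  check(ip: string, now: number): Decision {
    const bucket = this.buckets.get(ip);
    // First request from this IP, or the previous window has expired.
    if (bucket === undefined || now - bucket.windowStart >= this.windowMs) {
      this.buckets.set(ip, { windowStart: now, count: 1 });
      return { allowed: true };
    }
    if (bucket.count < this.limit) {
      bucket.count += 1;
      return { allowed: true };
    }
    // Over the limit: report how long until the window resets.
    const retryAfterSec = Math.ceil((bucket.windowStart + this.windowMs - now) / 1000);
    return { allowed: false, retryAfterSec };
  }
}

// Assumed precedence: first x-forwarded-for entry, then x-real-ip, then the socket address.
function clientIp(headers: Record<string, string | undefined>, socketAddress: string): string {
  const first = headers["x-forwarded-for"]?.split(",")[0]?.trim();
  if (first) return first;
  return headers["x-real-ip"] ?? socketAddress;
}
```

A middleware built on this would respond with `429` and set `Retry-After` to `retryAfterSec` whenever `check()` returns `allowed: false`. A real implementation also needs to evict stale buckets, which this sketch leaves out.
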
+49 -17
@@ -64,14 +64,15 @@ This starts the Next.js dev server on port 3000 with hot reload. Open [http://lo
### `app/` -- Next.js Pages

Three routes:

- `/` (`app/page.tsx`) -- Home page. Server component that fetches week data from the backend and passes everything to `HomePageClient`.
- `/park/[id]` (`app/park/[id]/page.tsx`) -- Park detail page. Fetches month calendar and live rides in parallel via `Promise.all`. Live rides use `apiFetch({ noStore: true })` to bypass the Next.js Data Cache.
- `/park/[id]/ride/[slug]` (`app/park/[id]/ride/[slug]/page.tsx`) -- Per-ride detail page with Today / 7d / 30d wait-time history. All three tabs render from a single backend response (no client-side range fetches).

Top-level boundaries: `app/error.tsx` (root error UI), `app/not-found.tsx`, `app/park/[id]/error.tsx`, and `app/loading.tsx` (streaming skeleton).

### `components/` -- React Components

13 components, split between server and client:

| Component | Type | Purpose |
|-----------|------|---------|
| `HomePageClient` | Client | Top-level state: coaster filter, auto-refresh, keyboard nav |
@@ -79,11 +80,14 @@ Two routes:
| `MobileCardList` | Server | Mobile card layout (below `lg` breakpoint) |
| `ParkCard` | Server | Individual park card for mobile |
| `ParkMonthCalendar` | Server | Month grid for park detail page |
| `LiveRidePanel` | Client | Live ride list with coaster toggle, Fast Lane toggle, wait times |
| `WeekNav` | Client | Week navigation arrows |
| `Legend` | Server | Color legend for status indicators |
| `EmptyState` | Server | Empty database message |
| `BackToCalendarLink` | Client | Back link using localStorage for last week |
| `charts/WaitTimeTodayChart` | Client | Today's 5-min wait samples with outage shading (Recharts) |
| `charts/WeeklyStatsChart` | Client | 7d / 30d daily aggregate chart (Recharts) |
| `charts/UptimePill` | Client | Compact uptime % badge |

### `lib/` -- Shared Code
@@ -97,24 +101,36 @@ Imported by both frontend and backend:
| `coaster-data.ts` | Static RCDB coaster name sets per park, `getCoasterSet()` |
| `coaster-match.ts` | `normalizeForMatch()`, `isCoasterMatch()` -- fuzzy name matching |
| `queue-times-map.ts` | `QUEUE_TIMES_IDS` -- park ID to Queue-Times park ID mapping |
| `api.ts` | `apiFetch<T>()` -- typed fetch helper with `revalidate` or `noStore` option |
| `outage.ts` | `computeOutages()` -- detects contiguous closed-during-hours runs for the today chart |
| `ride-slug.ts` | `slugifyRideName()` -- URL slug used by `/park/[id]/ride/[slug]` and the `rides` table |
| `timezone.ts` | `formatLocalDate()`, `formatLocalTime()` for bucketing samples in a park's IANA tz |
| `scrapers/sixflags.ts` | Six Flags CloudFront operating-hours client -- `scrapeMonth()`, `fetchToday()`, `scrapeRidesForDay()`, rate limiting |
| `scrapers/sixflags-waittimes.ts` | Six Flags Fast Lane wait-times client -- `fetchFastLaneWaits()`, `lookupFastLane()` |
| `scrapers/queuetimes.ts` | Queue-Times.com API client -- `fetchLiveRides()` |
| `scrapers/log.ts` | Shared scraper logger (used by both `sixflags.ts` and `sixflags-waittimes.ts`) |
| `scrapers/types.ts` | `Park`, `DayStatus`, `MonthCalendar`, `ScraperAdapter` interfaces |
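
For orientation, the kind of scan that `computeOutages()` in `outage.ts` performs can be sketched as follows. The actual signature and types are not shown in this document, so `Sample`, `Outage`, and `computeOutagesSketch` are hypothetical, as is the convention that an outage ends at the first open sample after it (or at its last closed sample if the day ends while the ride is still closed). Samples are assumed to be in time order and already limited to park operating hours.

```typescript
// Hypothetical input: one entry per 5-minute sample, in time order.
interface Sample {
  minuteOfDay: number; // park-local minutes since midnight
  isOpen: boolean;
}

interface Outage {
  index: number; // 1-based outage number for the day
  startMinute: number;
  endMinute: number;
  label: string; // "#N — Hh Mm", matching the chart label format
}

function makeOutage(index: number, startMinute: number, endMinute: number): Outage {
  const mins = endMinute - startMinute;
  const label = `#${index} — ${Math.floor(mins / 60)}h ${mins % 60}m`;
  return { index, startMinute, endMinute, label };
}

function computeOutagesSketch(samples: Sample[]): Outage[] {
  const outages: Outage[] = [];
  let runStart: number | null = null; // start of the current closed run, if any
  let lastClosed = 0;

  for (const s of samples) {
    if (!s.isOpen) {
      if (runStart === null) runStart = s.minuteOfDay;
      lastClosed = s.minuteOfDay;
    } else if (runStart !== null) {
      // The ride reopened: close the run at this sample's time.
      outages.push(makeOutage(outages.length + 1, runStart, s.minuteOfDay));
      runStart = null;
    }
  }
  // A run still open at the end of the data ends at its last closed sample.
  if (runStart !== null) {
    outages.push(makeOutage(outages.length + 1, runStart, lastClosed));
  }
  return outages;
}
```
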

### `backend/src/` -- Hono API Server

| File | Purpose |
|------|---------|
| `index.ts` | Entry point -- middleware (request log, CORS, rate limit), route registration, DB init, scheduler start, graceful shutdown |
| `config.ts` | Env-validated config object (`PORT`, `RATE_LIMIT_PER_MIN`, `PARK_HOURS_STALENESS_HOURS`, `NODE_ENV`). Fails fast on bad input. |
| `log.ts` | Structured logger -- emits `[ISO] [LEVEL] [tag] msg key=value` lines. No external dep. |
| `db/index.ts` | SQLite connection singleton, schema for `park_days` / `rides` / `ride_wait_samples`, WAL mode |
| `db/queries.ts` | All SQL queries -- `upsertDay`, `getDateRange`, `isMonthScraped`, `upsertRide`, `getRideBySlug`, `insertSample`, `getRideSamplesForDay`, `getRideDailyAggregates`, `countRideDays`, `getParkDayCount`, `transact` |
| `middleware/rate-limit.ts` | Fixed-window per-IP limiter. Honours `x-forwarded-for` / `x-real-ip`. Returns 429 with `Retry-After`. |
| `routes/calendar.ts` | `/api/calendar/*` -- week and month data with live today merging |
| `routes/parks.ts` | `/api/parks/*` -- park metadata |
| `routes/rides.ts` | `/api/parks/:id/rides` -- live ride status + Fast Lane join + schedule fallback |
| `routes/ride-history.ts` | `/api/parks/:id/rides/:slug` -- ride detail + today/7d/30d history in one payload |
| `routes/status.ts` | `/api/status` -- health check |
| `routes/scrape.ts` | `/api/scrape/trigger` -- manual scrape |
| `services/scheduler.ts` | Five-tier cron registration with per-tier `withLatch` concurrency guards; startup-scrape-when-empty check |
| `services/scraper.ts` | Scraping orchestration -- `scrapeToday()`, `scrapeMonths()`, `scrapeFullYear()` |
| `services/wait-sampler.ts` | Tier-5 5-minute sampler -- joins Queue-Times + Fast Lane, writes `ride_wait_samples`, skips weather-delayed parks |
| `services/live-cache.ts` | Shared `TtlCache<T>` instances (`liveRidesCache`, `fastLaneCache`) so the rides route, the ride-history route, and the Tier-5 sampler share warmed upstream data |
| `services/cache.ts` | Generic `TtlCache<T>` class with configurable TTL |

---
@@ -204,19 +220,35 @@ This fetches the raw Six Flags API response for the park and date, displays the
## Testing

Frontend and backend each have their own test suite, both using the Node built-in test runner.

### Frontend tests

```bash
npm test
```

Test files live in `tests/`:

| File | Coverage |
|------|----------|
| `tests/coaster-matching.test.ts` | `isCoasterMatch()` — exact, prefix, compact, conjunction rejection |
| `tests/fast-lane-matching.test.ts` | `lookupFastLane()` — name normalization and Fast Lane join logic |
| `tests/outage-detection.test.ts` | `computeOutages()` — contiguous-closed-run detection for the today chart |
| `tests/ride-slug.test.ts` | `slugifyRideName()` — URL slug generation and stability |
| `tests/timezone-bucketing.test.ts` | `formatLocalDate()` / `formatLocalTime()` — DST-safe park-tz bucketing |

### Backend tests

```bash
cd backend && npm test
```

Test files live in `backend/tests/`:

| File | Coverage |
|------|----------|
| `backend/tests/wait-aggregation.test.ts` | SQL aggregation in `getRideDailyAggregates()` — averages, max, uptime, sample count |

---
+54 -25
@@ -116,9 +116,12 @@ volumes:
|----------|---------|-------------|
| `TZ` | `UTC` | Process timezone. Controls when cron jobs fire. Set to `America/New_York` in production so schedules align with US Eastern parks. |
| `PARK_HOURS_STALENESS_HOURS` | `72` | Hours before park schedule data is considered stale and re-fetched. Lower values increase API load; higher values increase data lag. |
| `RATE_LIMIT_PER_MIN` | `60` | Per-IP request limit for the public API. Over-limit requests return `429 Too Many Requests` with a `Retry-After` header. Enforced by `backend/src/middleware/rate-limit.ts`. Behind a proxy, ensure `x-forwarded-for` is set or every client looks like the proxy IP. |
| `NODE_ENV` | -- | Set to `production` in Docker. |
| `PORT` | `3001` | Server listen port. |

`backend/src/config.ts` parses and validates these at startup. A bad value (e.g. `PORT=foo`) fails fast with a thrown `Error` rather than surfacing in a request handler later.
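
A minimal sketch of this fail-fast pattern is shown below. It is not the actual `backend/src/config.ts`: the `Config` shape, the `positiveInt` helper, the `loadConfig` name, and the `"development"` fallback for `NODE_ENV` are assumptions for illustration. The numeric defaults come from the table above.

```typescript
// Hypothetical config shape; field names are illustrative.
interface Config {
  port: number;
  rateLimitPerMin: number;
  parkHoursStalenessHours: number;
  nodeEnv: string;
}

// Parse an env var as a positive integer, or throw with the variable name.
function positiveInt(name: string, raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Record<string, string | undefined>): Config {
  return {
    port: positiveInt("PORT", env["PORT"], 3001),
    rateLimitPerMin: positiveInt("RATE_LIMIT_PER_MIN", env["RATE_LIMIT_PER_MIN"], 60),
    parkHoursStalenessHours: positiveInt(
      "PARK_HOURS_STALENESS_HOURS",
      env["PARK_HOURS_STALENESS_HOURS"],
      72,
    ),
    // Assumed fallback; the table lists no default for NODE_ENV.
    nodeEnv: env["NODE_ENV"] ?? "development",
  };
}
```

Calling `loadConfig(process.env)` once at the top of the entry point means a typo such as `PORT=foo` stops the process before the server starts listening.
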

---

## CI/CD Pipeline
@@ -167,9 +170,10 @@ These are configured in the Gitea repository settings under **Settings > Actions
3. **Verify the backend started:**
   ```bash
   docker compose logs backend
   # Look for (structured log lines, see the Log Reference section):
   #   [INFO] [startup] database initialized
   #   [INFO] [scheduler] cron jobs registered ...
   #   [INFO] [startup] listening url=http://localhost:3001
   ```

4. **Check database status (will be empty on first run):**
@@ -251,7 +255,7 @@ Backups are recommended for continuity (avoiding the 5-10 minute re-scrape windo
### Tiered Cron Schedule

The backend runs five scraping tiers via `node-cron`:

| Tier | Cron Expression | Schedule | Scope | Delay |
|------|-----------------|----------|-------|-------|
@@ -259,10 +263,24 @@ The backend runs four scraping tiers via `node-cron`:
| 2 | `0 */6 * * *` | Every 6 hours | Current month for all parks | 1000ms |
| 3 | `0 3,15 * * *` | 3 AM and 3 PM | Current + next month | 1000ms |
| 4 | `0 3 * * *` | Daily at 3 AM | Full year (all 12 months) | 1000ms |
| 5 | `*/5 * * * *` | Every 5 minutes | Wait-time samples for currently-open parks into `ride_wait_samples` | parallel chunks of 6 |

**Staleness:** Tiers 2-4 skip any park-month that was scraped within `PARK_HOURS_STALENESS_HOURS` (default 72h). Tier 1 always fetches (uses diff-before-write instead). Tier 5 only samples parks whose `park_days` row marks them open today *and* whose current local time is inside the operating window (with a 1-hour closing buffer).

**Off-season:** Tier 1 only runs from March through December. The month constraint `3-12` in the cron expression skips January and February when most parks are closed. Tier 5 runs year-round but is effectively a no-op when no parks are open.

**Concurrency latches:** Every tier is wrapped in `withLatch()` (see `backend/src/services/scheduler.ts`). If a tick is still running when the next would fire, the new tick is *skipped* and logged with a `previous run still in progress` warning rather than stacking. Each tier has its own latch so a slow Tier-4 doesn't block Tier-5's 5-minute cadence.
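
A skip-if-busy latch of this kind can be sketched in a few lines. The real `withLatch()` signature in `scheduler.ts` is not shown here, so the parameter list below, including the `onSkip` callback, is an assumption; only the behaviour (skip and warn instead of stacking, one latch per tier) follows the description above.

```typescript
// Wraps an async task so overlapping invocations are skipped, not queued.
// Returns true if the task ran, false if the tick was skipped.
function withLatch(
  name: string,
  task: () => Promise<void>,
  onSkip: (name: string) => void = (n) =>
    console.warn(`[${n}] previous run still in progress`),
): () => Promise<boolean> {
  let running = false; // one latch per wrapped task
  return async () => {
    if (running) {
      onSkip(name);
      return false;
    }
    running = true;
    try {
      await task();
      return true;
    } finally {
      // Release the latch even if the task throws.
      running = false;
    }
  };
}
```

Because each call to `withLatch()` closes over its own `running` flag, wrapping every tier separately gives each tier an independent latch, so a long Tier-4 run cannot cause Tier-5 ticks to be skipped.
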
**Weather-delayed parks skipped from sampling:** Tier 5 detects the "rides exist but all closed during scheduled hours" case and skips writes for that park, so a storm doesn't poison the uptime statistics with hours of `is_open=0` samples.
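
The weather-delay check can be pictured as a simple predicate over the live ride list. This is a simplified sketch and not the rule in `wait-sampler.ts`, which may apply further conditions; the `RideStatus` shape and both function names are hypothetical.

```typescript
// Hypothetical minimal ride shape for this example.
interface RideStatus {
  name: string;
  isOpen: boolean;
}

// "Rides exist but all closed during scheduled hours" from the text above.
function looksWeatherDelayed(rides: RideStatus[], insideScheduledHours: boolean): boolean {
  return insideScheduledHours && rides.length > 0 && rides.every((r) => !r.isOpen);
}

// Returns the rides that should be written as samples on this tick.
function ridesToSample(rides: RideStatus[], insideScheduledHours: boolean): RideStatus[] {
  return looksWeatherDelayed(rides, insideScheduledHours) ? [] : rides;
}
```

Returning an empty list for a weather-delayed park keeps `is_open=0` rows out of `ride_wait_samples`, so the uptime percentage reflects ride outages instead of park-wide closures.
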
### Startup Behavior
On boot, the scheduler checks `getParkDayCount()` against a threshold of 50 rows:
- **Empty / nearly-empty database** (< 50 rows): runs `scrapeToday()` followed by `scrapeFullYear()` in sequence. Logs `[scheduler.startup]` lines for each phase.
- **Populated database** (≥ 50 rows): skips the startup scrape and relies on cron tiers. Logs `skipping startup scrape — relying on cron`.
This replaces the earlier behavior of full-scraping on every container start, which doubled outbound API load and delayed readiness on every deploy.
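
The startup decision described in this section might look roughly like the following. The dependency-injected `StartupDeps` shape and the function names are assumptions made so the sketch is self-contained; the threshold of 50 rows and the `scrapeToday()` then `scrapeFullYear()` order come from the text above.

```typescript
const STARTUP_SCRAPE_THRESHOLD = 50; // rows in park_days, per the text above

// Hypothetical dependency bundle so the logic can run without a database.
interface StartupDeps {
  getParkDayCount: () => number;
  scrapeToday: () => Promise<void>;
  scrapeFullYear: () => Promise<void>;
  log: (message: string) => void;
}

function needsStartupScrape(rowCount: number): boolean {
  return rowCount < STARTUP_SCRAPE_THRESHOLD;
}

// Returns true if a startup scrape ran, false if it was skipped.
async function runStartupCheck(deps: StartupDeps): Promise<boolean> {
  const rows = deps.getParkDayCount();
  if (!needsStartupScrape(rows)) {
    deps.log(`skipping startup scrape, relying on cron existingRows=${rows}`);
    return false;
  }
  // Today first so the home page has data quickly, then the full year.
  await deps.scrapeToday();
  await deps.scrapeFullYear();
  return true;
}
```
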

### Timezone Sensitivity
@@ -374,7 +392,7 @@ curl http://localhost:3001/api/status
   ```bash
   docker compose logs backend --tail 50
   ```
   Look for an `[INFO] [startup] listening url=http://localhost:3001` line.

2. **Check if the database has data:**
   ```bash
@@ -452,28 +470,39 @@ If the database becomes corrupted (unlikely with SQLite WAL mode, but possible a
## Log Reference

The backend uses a small structured logger (`backend/src/log.ts`). Every line has the format:

```
<ISO timestamp> [<LEVEL>] [<tag>] <message> key1=value1 key2=value2 …
```

Levels are `INFO`, `WARN`, `ERROR`. `ERROR` writes to stderr; the others write to stdout. Grep-friendly: filter by tag (`grep '\[scheduler.tier1\]'`) or by key (`grep 'park=cedarpoint'`).
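
A formatter that produces lines in this shape can be sketched as below. It is an illustration, not the contents of `backend/src/log.ts`: the `formatLine` and `log` names and their signatures are assumptions, and quoting values that contain whitespace is inferred from the `tiers="..."` field in the example output.

```typescript
type Level = "INFO" | "WARN" | "ERROR";
type Fields = Record<string, string | number | boolean>;

// Builds "<ISO timestamp> [LEVEL] [tag] message key=value ..." as one line.
function formatLine(
  level: Level,
  tag: string,
  message: string,
  fields: Fields = {},
  now: Date = new Date(),
): string {
  const pairs = Object.entries(fields).map(([key, value]) => {
    const text = String(value);
    // Quote values containing whitespace so key=value pairs stay splittable.
    return /\s/.test(text) ? `${key}=${JSON.stringify(text)}` : `${key}=${text}`;
  });
  return [now.toISOString(), `[${level}]`, `[${tag}]`, message, ...pairs].join(" ");
}

// ERROR goes to stderr, everything else to stdout.
function log(level: Level, tag: string, message: string, fields: Fields = {}): void {
  const line = formatLine(level, tag, message, fields);
  if (level === "ERROR") {
    console.error(line);
  } else {
    console.log(line);
  }
}
```

Keeping the formatter separate from the writer makes the output easy to unit test with a fixed `Date`.
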
| Tag | Source | Meaning |
|-----|--------|---------|
| `startup` | `index.ts` | Config loaded, DB initialized, server listening |
| `shutdown` | `index.ts` | `SIGTERM`/`SIGINT` received; graceful shutdown progress |
| `http` | `index.ts` | One line per request: `method`, `path`, `status`, `ms` |
| `scheduler` | `scheduler.ts` | Cron job registration summary on boot |
| `scheduler.tier1` … `scheduler.tier5` | `scheduler.ts` | Each tier's tick; includes skip-due-to-latch warnings |
| `scheduler.startup` | `scheduler.ts` | Result of the "database empty" startup scrape |
| `today` / `month` | `scraper.ts` | Per-park / per-month scrape results |
| `wait-sampler` | `wait-sampler.ts` | Tier-5 per-park sample writes, errors, weather-delay skips |
| `rate-limit` | `middleware/rate-limit.ts` | `blocked` event with `ip`, `count`, `retryAfter` |
| `rides` | `routes/rides.ts` | Per-request warnings when upstream calls fail |
| `rate-limited` | `lib/scrapers/sixflags.ts` | HTTP 429/503 from Six Flags with backoff timing |

**Example log output:**

```
2026-04-23T14:00:00.012Z [INFO] [startup] config loaded port=3001 nodeEnv=production parkHoursStalenessHours=72 rateLimitPerMin=60
2026-04-23T14:00:00.034Z [INFO] [startup] database initialized
2026-04-23T14:00:00.041Z [INFO] [scheduler] cron jobs registered tiers="tier1=hourly(Mar-Dec) tier2=6h tier3=3am+3pm tier4=3am-daily tier5=5min"
2026-04-23T14:00:00.042Z [INFO] [scheduler] skipping startup scrape — relying on cron existingRows=8742
2026-04-23T14:00:00.045Z [INFO] [startup] listening url=http://localhost:3001
2026-04-23T14:00:00.123Z [INFO] [http] GET /api/calendar/week status=200 ms=18
2026-04-23T14:00:10.001Z [INFO] [scheduler.tier1] scraping today
2026-04-23T14:05:00.001Z [INFO] [scheduler.tier5] sample run complete parksSampled=14 parksSkipped=10 samplesWritten=612 weatherDelayed=0 errors=0
```

---