docs: add comprehensive project documentation

Add docs/ folder with architecture, operations, API reference, and
development guides covering system design, deployment, troubleshooting,
all backend endpoints, and contributor workflows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-23 22:15:02 -04:00

Operations

See also: Architecture | API Reference | Development

Deployment Overview

The application runs as two Docker containers:

Container	Port	Role
`web`	3000	Next.js frontend (stateless, no database)
`backend`	3001	Hono API server (owns SQLite database, runs cron scheduler)

Infrastructure requirements:

Docker host with Docker Compose
Container registry (Gitea, Docker Hub, or any OCI-compatible registry)
Outbound HTTPS access to d18car1k0ff81h.cloudfront.net (Six Flags API) and queue-times.com (live ride data)
A reverse proxy (Traefik, nginx, Caddy, etc.) is expected to sit in front for TLS termination and domain routing, but is not included in this repository

See Architecture for detailed system design.

Docker Images

Multi-Stage Build

The project uses a single Dockerfile with four stages producing two final images:

  builder           backend-deps
  (Next.js build)   (native modules)
      |                   |
      v                   v
    web               backend
  (final)             (final)

Stage	Base	Purpose
`builder`	`node:22-bookworm-slim`	`npm ci` + `npm run build` -- produces Next.js standalone output
`backend-deps`	`node:22-bookworm-slim`	Installs `python3`, `make`, `g++` for `better-sqlite3` native compilation, then `npm ci`
`web` (final)	`node:22-bookworm-slim`	Copies standalone output from `builder`. Non-root user. ~150MB.
`backend` (final)	`node:22-bookworm-slim`	Copies `node_modules` from `backend-deps` + source code. Volume for SQLite. Non-root user. ~200MB.

Image Tags

{registry}/{owner}/sixflagssupercalendar:web
{registry}/{owner}/sixflagssupercalendar:backend

Building Locally

# Build web image
docker build --target web -t sixflagssupercalendar:web .

# Build backend image
docker build --target backend -t sixflagssupercalendar:backend .

Docker Compose

The production docker-compose.yml:

services:
  web:
    image: gitea.thewrightserver.net/josh/sixflagssupercalendar:web
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - BACKEND_URL=http://backend:3001    # Docker internal networking
    restart: unless-stopped

  backend:
    image: gitea.thewrightserver.net/josh/sixflagssupercalendar:backend
    ports:
      - "3001:3001"
    volumes:
      - park_data:/app/data                # SQLite database persistence
    environment:
      - NODE_ENV=production
      - TZ=America/New_York                # Timezone for cron schedules
      - PARK_HOURS_STALENESS_HOURS=72      # Hours before re-fetching park data
    restart: unless-stopped

volumes:
  park_data:                               # Named volume for database files

Networking: The web container reaches the backend via Docker's internal DNS at http://backend:3001. The backend port is also exposed to the host for manual API access during troubleshooting.

Environment Variables

Web Container

Variable	Default	Description
`BACKEND_URL`	`http://localhost:3001`	Backend API base URL. Set to `http://backend:3001` in Docker Compose for internal networking.
`NODE_ENV`	--	Set to `production` in Docker.
`NEXT_TELEMETRY_DISABLED`	`1`	Disables Next.js telemetry (set in Dockerfile).
`PORT`	`3000`	Server listen port (set in Dockerfile).
`HOSTNAME`	`0.0.0.0`	Bind address (set in Dockerfile to allow external access).

Backend Container

Variable	Default	Description
`TZ`	`UTC`	Process timezone. Controls when cron jobs fire. Set to `America/New_York` in production so schedules align with US Eastern parks.
`PARK_HOURS_STALENESS_HOURS`	`72`	Hours before park schedule data is considered stale and re-fetched. Lower values increase API load; higher values increase data lag.
`NODE_ENV`	--	Set to `production` in Docker.
`PORT`	`3001`	Server listen port.
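
As a rough illustration of how the defaults in the table above could be applied, the sketch below parses the backend variables from an environment map. The helper names (`readBackendConfig`, `positiveNumber`) are hypothetical and the real backend may read its settings differently; in particular, `TZ` is normally consumed by the runtime itself rather than by application code.

```typescript
// Hypothetical helper, not taken from the repository.
type Env = Record<string, string | undefined>;

interface BackendConfig {
  port: number;
  stalenessHours: number;
  timezone: string;
}

// Parse a positive number, falling back to the default when the variable
// is unset, empty, or not a positive finite number.
function positiveNumber(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

function readBackendConfig(env: Env): BackendConfig {
  return {
    port: positiveNumber(env["PORT"], 3001),
    stalenessHours: positiveNumber(env["PARK_HOURS_STALENESS_HOURS"], 72),
    timezone: env["TZ"] ?? "UTC",
  };
}
```

Passing the environment in as an argument, rather than reading a global, keeps the sketch easy to exercise with different values.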

CI/CD Pipeline

Gitea Actions Workflow

File: .gitea/workflows/deploy.yml

Trigger: Push to main branch.

Steps:

Checkout code (actions/checkout@v4)
Log in to Gitea container registry
Build and push web image (docker/build-push-action@v6, target: web)
Build and push backend image (docker/build-push-action@v6, target: backend)

Required Configuration

Type	Name	Description
Variable	`REGISTRY`	Container registry URL (e.g. `gitea.thewrightserver.net`)
Secret	`REGISTRY_TOKEN`	Authentication token for the registry

Both are configured in the Gitea repository settings: variables under Settings > Actions > Variables, and secrets under Settings > Actions > Secrets.

Setting Up CI/CD from Scratch

Create a Gitea repository
Add REGISTRY as a repository variable (Settings > Actions > Variables)
Add REGISTRY_TOKEN as a repository secret (Settings > Actions > Secrets)
Push to main -- the workflow triggers automatically
Pull images on your Docker host: docker compose pull && docker compose up -d

Initial Deployment Checklist

Create a docker-compose.yml on your Docker host (see the Docker Compose section above, or use the one from the repository).

Pull and start the containers:

docker compose pull
docker compose up -d

Verify the backend started:

docker compose logs backend
# Look for: [backend] database initialized
#           [scheduler] cron jobs registered
#           [backend] listening on http://localhost:3001

Check database status (will be empty on first run):

curl http://localhost:3001/api/status
# { "status": "ok", "database": { "totalDays": 0, ... } }

Trigger the initial data scrape:
```
curl -X POST http://localhost:3001/api/scrape/trigger?scope=full
```
This scrapes all 12 months for all 24 parks with a 1-second delay between parks. Expected duration: 5-10 minutes.

Verify data was scraped:

curl http://localhost:3001/api/status
# totalDays should be ~8000-9000

Open the web UI: Navigate to http://your-host:3000.

The cron scheduler starts automatically and will keep data fresh going forward.

Updating

docker compose pull && docker compose up -d

The SQLite database lives in a named Docker volume (park_data), so it persists across container recreations.
Schema migrations are applied automatically on backend startup. New columns are added via ALTER TABLE ... ADD COLUMN wrapped in try/catch: if the column already exists, SQLite raises an error, which is caught and ignored, so the migration is safe to re-run on every start.
No manual migration steps are needed.
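
The add-column migration pattern described above can be sketched as follows. This is a minimal illustration, not the repository's code: `addColumnIfMissing` and the `SqlExecutor` interface are hypothetical names, and the sketch deliberately narrows the catch to SQLite's "duplicate column name" error instead of swallowing every error.

```typescript
// Minimal database surface for the sketch. better-sqlite3's Database
// object provides an exec(sql) method of this shape.
interface SqlExecutor {
  exec(sql: string): unknown;
}

// Try to add a column. Returns true if it was added, false if it already
// existed. Any other failure is rethrown so real problems stay visible.
function addColumnIfMissing(db: SqlExecutor, table: string, columnDef: string): boolean {
  try {
    db.exec(`ALTER TABLE ${table} ADD COLUMN ${columnDef}`);
    return true;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    if (/duplicate column name/i.test(message)) return false;
    throw err;
  }
}
```

Because the second and later runs return `false` rather than failing, the same migration list can run unchanged on every startup.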

Backup and Restore

What to Back Up

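The SQLite database at /app/data/parks.db inside the park_data Docker volume. Because the database runs in WAL mode, the companion files parks.db-wal and parks.db-shm must be copied along with it. For a guaranteed-consistent copy, stop the backend first so that no write lands between the individual file copies.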

Backup Methods

Method 1: Copy from the container

docker compose cp backend:/app/data/parks.db ./backup/parks.db
docker compose cp backend:/app/data/parks.db-wal ./backup/parks.db-wal 2>/dev/null
docker compose cp backend:/app/data/parks.db-shm ./backup/parks.db-shm 2>/dev/null

Method 2: Mount the volume to the host

Add a bind mount in docker-compose.yml:

volumes:
  - ./data:/app/data

Restore

Stop the backend: docker compose stop backend
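Replace the database files in the volume: restore parks.db together with the parks.db-wal and parks.db-shm files from the same backup, and delete any leftover -wal or -shm files that the backup does not include (a stale WAL file does not belong to the restored database)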
Restart: docker compose start backend

Note on Reproducibility

All data is sourced from external APIs and is fully reproducible. If the database is lost, simply restart the backend (which auto-creates an empty database) and trigger a full scrape:

curl -X POST http://localhost:3001/api/scrape/trigger?scope=full

Backups are recommended for continuity (avoiding the 5-10 minute re-scrape window) but are not critical.

Scheduler Operations

Tiered Cron Schedule

The backend runs four scraping tiers via node-cron:

Tier	Cron Expression	Schedule	Scope	Delay
1	`0 * * 3-12 *`	Hourly, March through December	Today's hours for all parks	500ms
2	`0 */6 * * *`	Every 6 hours	Current month for all parks	1000ms
3	`0 3,15 * * *`	3 AM and 3 PM	Current + next month	1000ms
4	`0 3 * * *`	Daily at 3 AM	Full year (all 12 months)	1000ms

Staleness: Tiers 2-4 skip any park-month that was scraped within PARK_HOURS_STALENESS_HOURS (default 72h). Tier 1 ignores the staleness window and always fetches; it avoids redundant writes by diffing the fetched hours against the database before writing.
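
The staleness rule for tiers 2-4 comes down to a single age comparison. The sketch below assumes the last scrape time is available as a `Date` (or `null` if the park-month was never scraped); `isStale` is a hypothetical name, not the backend's actual function.

```typescript
const MS_PER_HOUR = 60 * 60 * 1000;

// Returns true when a park-month should be re-fetched.
// lastScrapedAt is null when the park-month has never been scraped.
function isStale(lastScrapedAt: Date | null, now: Date, stalenessHours: number = 72): boolean {
  if (lastScrapedAt === null) return true;
  const ageMs = now.getTime() - lastScrapedAt.getTime();
  return ageMs >= stalenessHours * MS_PER_HOUR;
}
```

With the default of 72 hours, a park-month scraped 71 hours ago is skipped, while one scraped 72 or more hours ago is fetched again.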

Off-season: Tier 1 only runs from March through December. The month constraint 3-12 in the cron expression skips January and February when most parks are closed.
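
To see how the four schedules overlap, the sketch below expresses each cron expression as a plain predicate over the month and hour (evaluated at minute 0, in the process timezone). It is an illustration of the table above, not the scheduler's code; `tiersFiringAt` is a hypothetical helper, and the labels are taken from the example log output.

```typescript
interface Tier {
  tier: number;
  label: string;
  // Predicate equivalent of the tier's cron expression at minute 0.
  fires: (month: number, hour: number) => boolean;
}

const TIERS: Tier[] = [
  // Hourly, March through December
  { tier: 1, label: "today", fires: (month) => month >= 3 && month <= 12 },
  // Every 6 hours: 0, 6, 12, 18
  { tier: 2, label: "current month", fires: (_month, hour) => hour % 6 === 0 },
  // 3 AM and 3 PM
  { tier: 3, label: "upcoming", fires: (_month, hour) => hour === 3 || hour === 15 },
  // Daily at 3 AM
  { tier: 4, label: "full year", fires: (_month, hour) => hour === 3 },
];

// month is 1-12 and hour is 0-23, both in the process timezone (TZ).
function tiersFiringAt(month: number, hour: number): number[] {
  return TIERS.filter((t) => t.fires(month, hour)).map((t) => t.tier);
}
```

For example, at 3 AM in April tiers 1, 3, and 4 fire together, while at midnight in January only tier 2 runs.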

Timezone Sensitivity

Cron expressions execute in the process timezone, controlled by the TZ environment variable. In production this is set to America/New_York so that "3 AM" aligns with US Eastern time.

The per-park timezone (e.g. America/Los_Angeles for Magic Mountain) is used separately for operating window detection -- it does not affect cron schedule timing.
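
A per-park window check of this kind can be sketched with `Intl.DateTimeFormat`, which converts an instant to wall-clock time in any IANA timezone independently of `TZ`. The function names, the `"HH:MM"` window format, and the assumption that a window never crosses midnight are all illustrative; the real `isWithinOperatingWindow()` may differ.

```typescript
// Minutes since local midnight for `instant` in the given IANA timezone.
function localMinutes(instant: Date, timeZone: string): number {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hourCycle: "h23",
    hour: "2-digit",
    minute: "2-digit",
  }).formatToParts(instant);
  const get = (type: string): number =>
    Number(parts.find((p) => p.type === type)?.value ?? "0");
  return get("hour") * 60 + get("minute");
}

// Hypothetical signature: open and close are "HH:MM" in the park's local
// time, and the window is assumed not to cross midnight.
function isWithinWindow(instant: Date, timeZone: string, open: string, close: string): boolean {
  const toMinutes = (hhmm: string): number => {
    const [h, m] = hhmm.split(":").map(Number);
    return (h ?? 0) * 60 + (m ?? 0);
  };
  const now = localMinutes(instant, timeZone);
  return now >= toMinutes(open) && now < toMinutes(close);
}
```

The same instant maps to different local times per park: 18:00 UTC on 2026-04-23 is 11:00 in America/Los_Angeles and 14:00 in America/New_York, so each park is checked against its own clock.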

The 3 AM Switchover

getTodayLocal() in lib/env.ts implements a 3 AM local-time switchover: before 3 AM, the system considers it "yesterday." This prevents the calendar from flipping to the next day at midnight while park visitors are still out. The switchover uses the server's local time (influenced by TZ), not individual park timezones.
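
A minimal sketch of the switchover logic, assuming the result is a `YYYY-MM-DD` string (the actual return format of `getTodayLocal()` is not stated here). It reads the `Date`'s local-time fields, which follow the process timezone, and steps back one calendar day before 3 AM. `getTodayLocalSketch` is a hypothetical stand-in, not the code in lib/env.ts.

```typescript
const SWITCHOVER_HOUR = 3;

// Returns the "operating day" for `now`, based on local-time fields.
function getTodayLocalSketch(now: Date): string {
  const day = new Date(now.getFullYear(), now.getMonth(), now.getDate());
  if (now.getHours() < SWITCHOVER_HOUR) {
    // Before 3 AM still counts as the previous day. setDate handles
    // month and year rollover.
    day.setDate(day.getDate() - 1);
  }
  const pad = (n: number): string => String(n).padStart(2, "0");
  return `${day.getFullYear()}-${pad(day.getMonth() + 1)}-${pad(day.getDate())}`;
}
```

At 02:59 local time on June 10 the function still reports June 9; one minute later it reports June 10.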

Manual Scraping

Trigger a scrape at any time via the backend API:

curl -X POST "http://localhost:3001/api/scrape/trigger?scope=<scope>"

Scope Options

Scope	Behavior	Duration
`today`	Fetches today's hours for all 24 parks. Diffs against database before writing. 500ms delay.	~15s
`month`	Current month for all parks. Respects staleness window. 1000ms delay.	~30s
`upcoming`	Current + next month. Respects staleness.	~1min
`full`	All 12 months. Respects staleness.	~5-10min
`force`	All 12 months. Ignores staleness -- forces re-fetch of everything.	~5-10min
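
The durations in the table can be sanity-checked from the pacing delays alone. The sketch below assumes one delay per park per month and ignores network latency, retries, and park-months skipped as fresh, so it gives a lower bound rather than a prediction; `minScrapeSeconds` is a hypothetical helper.

```typescript
const PARK_COUNT = 24;

// Lower bound on scrape time, in seconds, from inter-park delays alone.
function minScrapeSeconds(months: number, delayMs: number, parks: number = PARK_COUNT): number {
  return (months * parks * delayMs) / 1000;
}

const estimates = {
  today: minScrapeSeconds(1, 500),     // 12 s
  month: minScrapeSeconds(1, 1000),    // 24 s
  upcoming: minScrapeSeconds(2, 1000), // 48 s
  full: minScrapeSeconds(12, 1000),    // 288 s, about 4.8 minutes
};
```

These floors line up with the table: request time on top of 288 seconds of delay accounts for the 5-10 minute range quoted for `full` and `force`.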

Response

{
    "scope": "today",
    "fetched": 24,
    "skipped": 0,
    "errors": 0,
    "updated": 3,
    "startedAt": "2026-04-23T14:00:00.000Z",
    "finishedAt": "2026-04-23T14:00:12.000Z"
}

Typical Use Cases

After initial deployment: scope=full to populate the database
After an extended outage: scope=force to refresh all data regardless of staleness
Investigating a specific park: scope=today to get fresh data quickly
Before peak season: scope=full to ensure complete coverage

Health Monitoring

Health Endpoint

curl http://localhost:3001/api/status

Response

{
    "status": "ok",
    "uptime": 86400,
    "parks": 24,
    "database": {
        "totalDays": 8760,
        "lastScrape": "2026-04-23T14:00:12.000Z"
    },
    "lastScrapeResult": {
        "scope": "today",
        "fetched": 24,
        "skipped": 0,
        "errors": 0,
        "updated": 3,
        "startedAt": "2026-04-23T14:00:00.000Z",
        "finishedAt": "2026-04-23T14:00:12.000Z"
    }
}

Key Metrics

Metric	Expected Value	Concern If
`status`	`"ok"`	Not `"ok"` (always `"ok"` currently, but confirms the endpoint is reachable)
`uptime`	Increasing	Drops to 0 (container restarted)
`database.totalDays`	8,000-9,000 (full year)	Much lower (scraping not running) or 0 (empty database)
`database.lastScrape`	Within the last hour (during operating season)	More than a few hours old (scheduler may be broken)
`lastScrapeResult.errors`	0	Consistently high (API may be blocking requests)

Suggested Alerting

Alert if database.lastScrape is more than 12 hours old during operating season (March-December)
Alert if lastScrapeResult.errors exceeds 5 on consecutive scrapes
Alert if the health endpoint is unreachable
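
The three rules above could be evaluated by a small polling script along these lines. This is a sketch under several assumptions: the payload type covers only the fields used, `lastScrape` is treated as possibly `null` on an empty database, "consecutive scrapes" is approximated by comparing the current error count with the one seen on the previous poll, and the season check uses the UTC month. `evaluateAlerts` is a hypothetical name.

```typescript
interface StatusPayload {
  status: string;
  database: { totalDays: number; lastScrape: string | null };
  lastScrapeResult: { errors: number } | null;
}

// current is null when the health endpoint could not be reached.
// previousErrors is the error count observed on the previous poll, if any.
function evaluateAlerts(
  current: StatusPayload | null,
  previousErrors: number | null,
  now: Date,
): string[] {
  if (current === null) return ["health endpoint unreachable"];
  const alerts: string[] = [];

  const month = now.getUTCMonth() + 1;
  const inSeason = month >= 3 && month <= 12;
  const last = current.database.lastScrape;
  const ageHours =
    last === null ? Infinity : (now.getTime() - Date.parse(last)) / (60 * 60 * 1000);
  if (inSeason && ageHours > 12) alerts.push("lastScrape older than 12 hours");

  const errors = current.lastScrapeResult?.errors ?? 0;
  if (errors > 5 && previousErrors !== null && previousErrors > 5) {
    alerts.push("errors above 5 on consecutive scrapes");
  }
  return alerts;
}
```

Returning a list of alert strings keeps the decision logic separate from however the alerts are delivered.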

Troubleshooting

No data showing in the calendar

Check if the backend is running:
```
docker compose logs backend --tail 50
```
Look for [backend] listening on http://localhost:3001.
Check if the database has data:
```
curl http://localhost:3001/api/status | jq .database.totalDays
```
If 0, trigger a manual scrape: curl -X POST http://localhost:3001/api/scrape/trigger?scope=full
Check the BACKEND_URL in the web container:
```
docker compose exec web env | grep BACKEND_URL
```
Should be http://backend:3001 (not localhost, which inside the web container refers to the web container itself rather than the backend).

Ride counts not appearing on the home page

Ride counts only appear for parks that are currently within their operating window, as determined by isWithinOperatingWindow(). Outside of park hours, no rides are shown.
Queue-Times data is cached for 5 minutes. Recent park openings may take up to 5 minutes to appear.
Weather delay (blue badge) means the park is within its hours but all rides report closed -- this is expected during weather-related closures.
Verify the park has a Queue-Times mapping in lib/queue-times-map.ts.

Stale data / not updating

Check scheduler logs:
```
docker compose logs backend | grep scheduler
```
You should see periodic [scheduler] tier-X: scraping... messages.
Verify timezone:
```
docker compose exec backend date
```
The reported timezone should match the TZ environment variable (EST or EDT for America/New_York).
Check staleness threshold: Data within PARK_HOURS_STALENESS_HOURS (default 72h) is skipped by tiers 2-4. If you recently changed park data manually, it may not be re-fetched until the staleness window expires.

Force a refresh:

curl -X POST http://localhost:3001/api/scrape/trigger?scope=force

Rate limited by Six Flags API

Look for [rate-limited] messages in the backend logs:

docker compose logs backend | grep rate-limited

The client uses exponential backoff: 30s, 60s, 120s, then throws a RateLimitError and moves to the next park.
If rate limiting is persistent, increase PARK_HOURS_STALENESS_HOURS to reduce scrape frequency (e.g. 96 or 120).
The inter-park delay is hardcoded at 1000ms (500ms for the today tier) in backend/src/services/scraper.ts.
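
The backoff sequence described above (30s, 60s, 120s, then give up) can be sketched as a doubling delay with a fixed retry budget. The `request` and `sleep` functions are injected so the loop carries no network or timer dependencies; `withBackoff` and `backoffDelayMs` are hypothetical names, and the `RateLimitError` class here is a stand-in for the one in the client. Treating both 429 and 503 as rate-limit signals follows the log reference.

```typescript
class RateLimitError extends Error {}

const BASE_DELAY_MS = 30000;
const MAX_RETRIES = 3;

// Delay before retry number `attempt` (0-based): 30s, 60s, 120s.
function backoffDelayMs(attempt: number): number {
  return BASE_DELAY_MS * 2 ** attempt;
}

// Calls `request` until it returns a status other than 429/503, sleeping
// between attempts. Throws RateLimitError once all retries are used up.
async function withBackoff(
  request: () => Promise<{ status: number }>,
  sleep: (ms: number) => Promise<void>,
): Promise<{ status: number }> {
  for (let attempt = 0; ; attempt++) {
    const response = await request();
    if (response.status !== 429 && response.status !== 503) return response;
    if (attempt >= MAX_RETRIES) {
      throw new RateLimitError(`still rate limited after ${MAX_RETRIES} retries`);
    }
    await sleep(backoffDelayMs(attempt));
  }
}
```

A caller iterating over parks would catch `RateLimitError`, log it, and continue with the next park. In the worst case a single park costs 210 seconds of waiting (30 + 60 + 120) before it is abandoned.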

Wrong timezone / incorrect dates

getTodayLocal() uses the server's local time (set by TZ env var) with a 3 AM cutover. Before 3 AM, the system considers it "yesterday."
Each park has its own IANA timezone (stored in lib/parks.ts) used for operating window checks. The TZ env var only affects cron schedule timing and the "today" determination.

If dates seem off, check both TZ and the server's system clock:

docker compose exec backend date
docker compose exec backend node -e "console.log(new Date().toISOString())"

Database corruption

If the database becomes corrupted (unlikely with SQLite WAL mode, but possible after a hard crash):

Stop the backend: docker compose stop backend

Delete the database files from the volume:

docker compose run --rm backend rm -f /app/data/parks.db /app/data/parks.db-wal /app/data/parks.db-shm

Restart: docker compose start backend (auto-creates empty database)
Re-scrape: curl -X POST http://localhost:3001/api/scrape/trigger?scope=full

Log Reference

Prefix	Source	Meaning
`[backend]`	`index.ts`	Startup messages: DB initialized, server listening
`[scheduler]`	`scheduler.ts`	Cron job triggers with tier number
`[today]`	`scraper.ts`	Per-park results for the today tier (updated/skipped/error)
`[month]`	`scraper.ts`	Per-park-month results (open days count, rate limited, errors)
`[rate-limited]`	`sixflags.ts`	HTTP 429/503 with backoff timing and retry attempt count

Example log output:

[backend] database initialized
[scheduler] cron jobs registered
  tier-1: today        — hourly (Mar-Dec)
  tier-2: current month — every 6h
  tier-3: upcoming     — 3 AM + 3 PM
  tier-4: full year    — 3 AM daily
[backend] listening on http://localhost:3001
[scheduler] tier-1: scraping today @ 2026-04-23T14:00:00.000Z
[today] Great Adventure: updated (open 10am - 6pm)
[today] Cedar Point: updated (open 10am - 8pm)
[today] done: 24 fetched, 3 updated, 0 skipped, 0 errors

Performance Tuning

Aspect	Current Setting	Notes
SQLite WAL mode	Enabled	Allows concurrent reads during writes. No configuration needed.
In-memory cache	TtlCache (5 min TTL)	Bounded by park count -- at most ~72 entries (24 parks x 3 caches). Memory impact is negligible.
Staleness window	72 hours	Controls how often park data is re-fetched from the API. Lower values = fresher data but more API calls and higher rate-limit risk.
Inter-park delay	1000ms / 500ms	Hardcoded in `scraper.ts`. Provides respectful pacing against the Six Flags API.
ISR revalidation	60-300s per route	Controlled in Next.js fetch calls. Lower values = fresher pages but more backend requests.
Next.js standalone	Enabled	Produces a minimal server bundle without unused dependencies.
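
For readers unfamiliar with the in-memory cache row above, a TTL cache of this kind can be sketched in a few lines. This is an illustration of the general technique with a 5-minute default, not the project's `TtlCache` class; the name `TtlCacheSketch` and the injectable clock are assumptions made for the example.

```typescript
class TtlCacheSketch<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be exercised without real waiting.
  constructor(ttlMs: number = 5 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired entries are dropped lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Because keys are per park, the map never grows beyond the number of parks per cache, which is why the memory footprint stays negligible.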
