feat: RCDB-backed roller coaster filter with fuzzy name matching
All checks were successful
Build and Deploy / Build & Push (push) Successful in 2m54s

- Add lib/park-meta.ts to manage data/park-meta.json (rcdb_id + coaster lists)
- Add lib/scrapers/rcdb.ts to scrape operating coaster names from RCDB park pages
- discover.ts now seeds park-meta.json with skeleton entries for all parks
- scrape.ts now refreshes RCDB coaster lists (30-day staleness) for parks with rcdb_id set
- fetchLiveRides() accepts a coasterNames Set; isCoaster uses normalize() on both sides
  to handle trademark symbols, 'THE ' prefixes, and punctuation differences between
  Queue-Times and RCDB names — applies correctly to both land rides and top-level rides
- Commit park-meta.json so it ships in the Docker image (fresh volumes get it automatically)
- Update .gitignore / .dockerignore to exclude only *.db files, not all of data/
- Dockerfile copies park-meta.json into image before VOLUME declaration
- README: document coaster filter setup and correct staleness window (72h not 7d)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-04-04 13:49:49 -04:00
parent 819e716197
commit 9700d0bd9a
11 changed files with 710 additions and 15 deletions

View File

@@ -31,6 +31,13 @@ The park detail page shows ride open/closed status using a two-tier approach:
2. **Schedule fallback (Six Flags API)** — the Six Flags operating-hours API drops the current day from its response once a park opens. When Queue-Times data is unavailable, the app falls back to the nearest upcoming date from the Six Flags schedule API as an approximation.
### Roller Coaster Filter
When live data is shown, a **Coasters only** toggle appears if roller coaster data has been populated for that park. Coaster lists are sourced from [RCDB](https://rcdb.com) and stored in `data/park-meta.json`. To populate them:
1. Open `data/park-meta.json` and set `rcdb_id` for each park to the numeric RCDB park ID (visible in the URL: `https://rcdb.com/4529.htm``4529`).
2. Run `npm run scrape` — coaster lists are fetched from RCDB and stored in the JSON file. They refresh automatically every 30 days on subsequent scrapes.
---
## Local Development
@@ -111,4 +118,6 @@ docker run --rm -v sixflagssupercalendar_park_data:/app/data \
## Data Refresh
The scraper skips any park + month combination scraped within the last 7 days. Run `npm run scrape` on a weekly schedule to keep data current. Parks or months not yet in the database show a `—` placeholder; parks with no open days in the displayed week are hidden from the calendar automatically.
The scraper skips any park + month already scraped within the last 72 hours. The nightly Docker scraper service handles this automatically. Parks or months not yet in the database show a `—` placeholder; parks with no open days in the displayed week are hidden from the calendar automatically.
Roller coaster lists (from RCDB) are refreshed every 30 days on each `npm run scrape` run, for parks with a configured `rcdb_id`.