Successor to the Josh Steam prototypes. Single-VM Docker Compose stack with
the load-bearing core/ logic ported from JoshSteam CDN with bug fixes.
Contents:
- backend/ FastAPI + Celery (same image, two entrypoints)
  core/ hdiff, librsync, chain_replay, manifest, compression, discord, steam, unrealpak, paths
  api/ auth, catalog, admin, builds (skeletons) + downloads (real)
  worker/ Celery factory replacing the missing prototype Tasks/__init__.py
  db/ SQLAlchemy models + Alembic initial migration
- admin-web/ SvelteKit + Tailwind skeleton
- client/ Tauri 2 + Svelte skeleton (Mist placeholder UI)
- mistpipe/ click-based admin CLI with subcommand stubs
- docs/ ARCHITECTURE, DECISIONS (9 ADRs), RUNBOOK
- docker-compose.yml + dev overlay + .github/workflows
Bugs fixed during port:
- Routes/download.py:2 stray backslash on import line
- Utils/celery.py inspect.reserved() missing parens + double active() typo
- Hardcoded OneDrive/Desktop paths replaced with pydantic-settings config
- Discord webhook URL + RabbitMQ password moved to env vars
- Missing Tasks/__init__.py reconstructed as worker/__init__.py
Out of scope for this commit: route bodies, UI screens, mistpipe subcommand
bodies, real image builds.
Mist — Operational Runbook
A short, dense reference for "what do I do when X happens." Fill in as we hit real situations.
Backend VM access
SSH: ssh mist@<vm-tailnet-name> (or <vm-tailnet-ip>).
Compose lives at: /opt/mist/ (TODO: confirm during deploy).
All docker compose commands run from that directory.
Normal operations
Deploy a new image
CI pushes images to GHCR on merge to main. To pull and restart:
cd /opt/mist
docker compose pull
docker compose up -d
docker compose ps
View logs
docker compose logs -f api
docker compose logs -f worker
docker compose logs --tail=200 api worker
Restart the stack
cd /opt/mist
docker compose down
docker compose up -d
Restart a single service
docker compose restart worker
Run a Celery task manually (debugging)
docker compose exec api python -c "from mist.worker.tasks import generate_direct_update; generate_direct_update.delay('Satisfactory', '1.0.0.0', '1.0.0.1')"
Failure scenarios
NAS is unreachable
Symptoms: worker tasks fail with FileNotFoundError for /mnt/nas/..., API /downloads/* returns 404 for non-cached files.
Action:
- Verify NAS reachability from the VM: ls /mnt/nas/mist/games/
- If empty/error, NFS mount is broken. Check mount: mount | grep nas
- Remount: sudo mount -a (assuming /etc/fstab has the entry)
- If still broken, log into NAS, verify it's serving NFS
- Stack will recover automatically once NAS is back; in-flight jobs will retry per Celery config
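The first two checks in this list can be scripted. This is a minimal sketch, not part of mist.core: check_nas is a hypothetical helper, and treating /mnt/nas as the mount point is an assumption (the runbook only shows paths beneath it).

```python
import os


def check_nas(mount_point: str, probe_dir: str) -> str:
    """Classify NAS state as 'not-mounted', 'unreadable', 'empty', or 'ok'."""
    # os.path.ismount is False when the NFS mount has dropped and only the
    # bare mount-point directory is left behind.
    if not os.path.ismount(mount_point):
        return "not-mounted"
    try:
        entries = os.listdir(probe_dir)
    except OSError:
        # Covers a missing directory as well as I/O errors from the mount.
        return "unreadable"
    return "ok" if entries else "empty"
```

On the VM this would be called as check_nas("/mnt/nas", "/mnt/nas/mist/games"). Any result other than "ok" points at the remount step.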
Postgres won't start
Symptoms: api container restarts in a loop, logs show connection refused to postgres.
Action:
- docker compose logs postgres (look for the actual error)
- Common cause: out of disk space. Run df -h on the VM.
- If corrupted volume: stop stack, restore from last pg_dump (see "Restore from a Postgres backup")
Worker queue is backed up
Symptoms: Builds take forever, RabbitMQ UI (http://<vm>:15672/) shows growing queue depth.
Action:
- Check worker logs for stuck tasks
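- Scale workers: edit docker-compose.yml, set worker.deploy.replicas: 2, then run docker compose up -d
- If a task is hanging and the queued work is disposable, purge the queue (this discards every waiting message, not only the stuck task): docker compose exec worker celery -A mist.worker purge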
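Queue depth can also be read without the UI. The RabbitMQ management plugin serves an HTTP API on the same port, and GET /api/queues returns one JSON object per queue with name and messages fields. The sketch below assumes basic-auth credentials for that API; the function names and the threshold are illustrative, not part of the repo.

```python
import base64
import json
import urllib.request


def backed_up_queues(queues: list[dict], threshold: int) -> dict[str, int]:
    """Return {queue name: message count} for queues above the threshold."""
    return {
        q["name"]: q.get("messages", 0)
        for q in queues
        if q.get("messages", 0) > threshold
    }


def fetch_queues(base_url: str, user: str, password: str) -> list[dict]:
    """GET /api/queues from the RabbitMQ management API using basic auth."""
    req = urllib.request.Request(f"{base_url}/api/queues")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

A typical call is backed_up_queues(fetch_queues("http://<vm>:15672", user, password), threshold=100), with the credentials taken from wherever the stack's env vars are stored.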
Cache disk is full
Symptoms: Build jobs fail with OSError: no space left on device.
Action:
- df -h to confirm
- docker compose exec api python -m mist.core.paths --clear-cache (TODO: implement this maintenance task)
- Or manually: stop stack, rm -rf /var/lib/docker/volumes/mist_cache-vol/_data/*, restart
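The --clear-cache task is still a TODO. One possible shape for it is sketched below, assuming the cache is a plain directory of regenerable files (the runbook notes the hot cache can be rebuilt from NAS). clear_cache and free_gib are hypothetical names, and the real task may need to skip files that running jobs are using.

```python
import shutil
from pathlib import Path


def clear_cache(cache_dir: Path) -> int:
    """Delete everything inside cache_dir, keeping the directory itself.

    Returns the number of top-level entries removed.
    """
    removed = 0
    # Snapshot the listing first so deletion does not disturb iteration.
    for entry in list(cache_dir.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1
    return removed


def free_gib(path: Path) -> float:
    """Free space on the filesystem holding path, in GiB."""
    return shutil.disk_usage(path).free / 2**30
```

Keeping the directory itself matters because it is the volume's mount target inside the containers. Only its contents are disposable.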
Stack won't come back up after VM reboot
Symptoms: SSH in after reboot, docker compose ps shows nothing or services are Exited.
Action:
- Verify Docker daemon: systemctl status docker
- cd /opt/mist && docker compose up -d
- If still failing, check restart: unless-stopped is set on all services in docker-compose.yml
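The restart-policy check can be automated instead of read by eye. This sketch relies on docker compose config --format json, which prints the resolved configuration with a top-level services mapping. The helper names are illustrative, and /opt/mist is the still-unconfirmed path from "Backend VM access".

```python
import json
import subprocess


def services_missing_restart(
    config: dict, expected: str = "unless-stopped"
) -> list[str]:
    """Names of services whose restart policy is not the expected value."""
    services = config.get("services", {})
    return sorted(
        name
        for name, spec in services.items()
        if spec.get("restart") != expected
    )


def load_compose_config(project_dir: str) -> dict:
    """Resolve the Compose file(s) in project_dir to a dict via the CLI."""
    out = subprocess.run(
        ["docker", "compose", "config", "--format", "json"],
        cwd=project_dir,
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)
```

Calling services_missing_restart(load_compose_config("/opt/mist")) should return an empty list when every service is covered.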
Backups
What we back up
- Postgres (full dump): daily
- Mist/.env (passwords, secrets): versioned outside this repo
- docker-compose.yml and any host-level config: in git
What we DON'T back up here
- Game files on NAS — NAS has its own backup story (assumed RAID + remote replication)
- Hot cache — regenerable from NAS
Take a Postgres backup
docker compose exec -T postgres pg_dump -U mist mist | zstd > /mnt/nas/mist/backups/pg-$(date +%F).sql.zst
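The one-liner's exit status is that of zstd, so a failed pg_dump can still leave a small but valid-looking archive behind. The sketch below runs the same two commands, checks both exit codes, and renames the file into place only on success. The function names and the .partial suffix are assumptions for illustration.

```python
import datetime
import os
import subprocess
from pathlib import Path


def backup_path(backup_dir: Path, day: datetime.date) -> Path:
    """Same naming as the one-liner: pg-YYYY-MM-DD.sql.zst."""
    return backup_dir / f"pg-{day.isoformat()}.sql.zst"


def run_backup(compose_dir: str, target: Path) -> None:
    """Dump to a temporary file and rename only if both commands succeed."""
    tmp = target.with_name(target.name + ".partial")
    with open(tmp, "wb") as out:
        dump = subprocess.Popen(
            ["docker", "compose", "exec", "-T", "postgres",
             "pg_dump", "-U", "mist", "mist"],
            cwd=compose_dir,
            stdout=subprocess.PIPE,
        )
        zstd = subprocess.Popen(["zstd"], stdin=dump.stdout, stdout=out)
        dump.stdout.close()  # let pg_dump see a broken pipe if zstd exits early
        zstd_rc = zstd.wait()
        dump_rc = dump.wait()
    if dump_rc != 0 or zstd_rc != 0:
        tmp.unlink(missing_ok=True)
        raise RuntimeError(f"backup failed: pg_dump={dump_rc}, zstd={zstd_rc}")
    os.replace(tmp, target)
```

A daily job would call run_backup("/opt/mist", backup_path(Path("/mnt/nas/mist/backups"), datetime.date.today())).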
Restore from a Postgres backup
docker compose stop api worker
zstd -d < /mnt/nas/mist/backups/pg-YYYY-MM-DD.sql.zst | docker compose exec -T postgres psql -U mist mist
docker compose start api worker
Provisioning a new friend account
(Until the admin portal supports this end-to-end.)
docker compose exec api python -m mist.scripts.create_user <username> <password> [--admin]
(TODO: implement that script.)
Resetting your admin password
docker compose exec api python -m mist.scripts.reset_password <username> <new-password>
(TODO: implement that script.)
Health checks (manual)
curl -s https://api.mist.example/healthz # expect {"ok": true}
curl -s https://api.mist.example/readyz # expect 200 if DB/Redis/RabbitMQ all reachable
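curl -s prints only the body, so the 200 expected from /readyz is not visible unless the status code is requested separately. This sketch checks both endpoints and returns the status alongside the body. probe and healthz_ok are hypothetical helpers, and api.mist.example is the placeholder host used above.

```python
import json
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> tuple[int, bytes]:
    """Return (HTTP status, body); HTTP error statuses are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()


def healthz_ok(status: int, body: bytes) -> bool:
    """True when the response is 200 with a JSON object containing ok: true."""
    if status != 200:
        return False
    try:
        return json.loads(body).get("ok") is True
    except (ValueError, AttributeError):
        # Not JSON at all, or JSON that is not an object.
        return False
```

For example, healthz_ok(*probe("https://api.mist.example/healthz")) covers the first check, and probe("https://api.mist.example/readyz")[0] == 200 covers the second. Connection failures still raise urllib.error.URLError, which is itself a useful signal.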