goddard bfd6771a9a
Initial Mist scaffold
Successor to the Josh Steam prototypes. Single-VM Docker Compose stack with
the load-bearing core/ logic ported from JoshSteam CDN with bug fixes.

Contents:
- backend/  FastAPI + Celery (same image, two entrypoints)
            core/  hdiff, librsync, chain_replay, manifest, compression,
                   discord, steam, unrealpak, paths
            api/   auth, catalog, admin, builds (skeletons) + downloads (real)
            worker/  Celery factory replacing the missing prototype Tasks/__init__.py
            db/    SQLAlchemy models + Alembic initial migration
- admin-web/  SvelteKit + Tailwind skeleton
- client/    Tauri 2 + Svelte skeleton (Mist placeholder UI)
- mistpipe/  click-based admin CLI with subcommand stubs
- docs/      ARCHITECTURE, DECISIONS (9 ADRs), RUNBOOK
- docker-compose.yml + dev overlay + .github/workflows

Bugs fixed during port:
- Routes/download.py:2 stray backslash on import line
- Utils/celery.py inspect.reserved() missing parens + double active() typo
- Hardcoded OneDrive/Desktop paths replaced with pydantic-settings config
- Discord webhook URL + RabbitMQ password moved to env vars
- Missing Tasks/__init__.py reconstructed as worker/__init__.py

Out of scope for this commit: route bodies, UI screens, mistpipe subcommand
bodies, real image builds.
2026-06-07 19:39:25 -04:00

Mist — Decisions Log

This file is an append-only log of significant decisions, in lightweight ADR (Architecture Decision Record) format. The goal is that future-you (or a contributor) can reconstruct why a choice was made, not just what was chosen.

Format

Each entry:

## NNNN — Short title  (YYYY-MM-DD)

**Status:** Accepted | Superseded | Deprecated
**Context:** What problem were we solving? What forces were at play?
**Decision:** What did we decide?
**Consequences:** What does this make easier? Harder?
**Alternatives considered:** What else we looked at and why we passed.

0001 — Project named "Mist" (supersedes "Josh Steam") (2026-06-07)

Status: Accepted

Context: The original prototype was named "Josh Steam" because it was a personal project for the author and his friends. The rebuild is a real product (private but real) and benefits from a name that travels.

Decision: Project name is Mist. CLI is mistpipe (homage to Steam's SteamPipe). Docker images are namespaced mist-*. Domain pattern in docs is *.mist.example.

Consequences: All references to "Josh Steam" or "joshsteamctl" in any new code/docs must use the new name. Existing prototypes at Josh Steam/ on disk stay untouched as historical reference.

Alternatives considered: Keep "Josh Steam". Rejected: uncomfortable to share, and the name doesn't say what it is.


0002 — Single-VM Docker Compose instead of Kubernetes (2026-06-07)

Status: Accepted

Context: The original architecture draft proposed 7 microservices on a multi-node k3s/rke2 cluster with ArgoCD GitOps, Longhorn storage, MetalLB load balancer, and the Tailscale operator. The framing was "use this as an excuse to learn k8s and microservices."

Decision: Run the backend as Docker Compose on a single Proxmox VM. Six containers total: api, worker, admin-web, postgres, redis, rabbitmq. Stateful services share the same compose stack with named volumes. NAS mounted via NFS.

Consequences: Massively less operational complexity. Deploy is docker compose pull && docker compose up -d over SSH. No service mesh, no ingress controller, no GitOps tooling to learn before the product runs. The project itself (delta-patching, content distribution) is already complex enough; deployment shouldn't compound it. Trade-off: less k8s/microservices résumé padding.

Alternatives considered:

  • Multi-node k8s + GitOps + 7 microservices. Rejected — adds learning surface unrelated to the actual problem and is wildly oversized for ~10 users.
  • Modular monolith on bare metal (no containers). Rejected — losing the reproducibility / portability of containers isn't worth the marginal simplicity.

0003 — Monorepo across services (2026-06-07)

Status: Accepted

Context: Backend, worker, admin-web, client, and CLI all evolve together. Sharing types/contracts is easier when they share a repo.

Decision: Single git repo with one top-level folder per deployable. Backend and worker share a Python package (backend/src/mist/) and run as different entrypoints of the same Docker image.

Consequences: One CI workflow per artifact, but a single source of truth for the system. Refactors that cross boundaries are atomic.

Alternatives considered: Polyrepo. Rejected: friend-scale doesn't justify the coordination overhead.


0004 — Modular monolith for backend (api + worker, same code) (2026-06-07)

Status: Accepted

Context: The original plan split the backend into multiple services (identity, catalog, builds, downloads, client-bff, notifications). At ~10 users this is overkill.

Decision: Single FastAPI app with internal modules per domain (api/auth.py, api/catalog.py, etc.). Celery worker shares the same Python package and Docker image; only the entrypoint differs.

Consequences: Refactoring boundaries is a code-level concern, not an ops concern. If a domain genuinely outgrows the monolith later, extract it then.

Alternatives considered: True microservices. Rejected per ADR 0002.
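
A minimal sketch of the "same image, different entrypoint" idea, assuming the container picks its process from a role name. The module paths mist.api.app:app and mist.worker, the port, and the role names are illustrative assumptions, not taken from the repository.

```python
from __future__ import annotations

# Hypothetical module paths and port; the real package layout may differ.
COMMANDS: dict[str, list[str]] = {
    # API container: serve the FastAPI app with uvicorn.
    "api": ["uvicorn", "mist.api.app:app", "--host", "0.0.0.0", "--port", "8000"],
    # Worker container: same code, started as a Celery worker instead.
    "worker": ["celery", "-A", "mist.worker", "worker", "--loglevel=INFO"],
}


def command_for(role: str) -> list[str]:
    """Return the argv that a container started with this role would run."""
    try:
        # Copy so callers cannot mutate the shared table.
        return list(COMMANDS[role])
    except KeyError:
        known = ", ".join(sorted(COMMANDS))
        raise ValueError(f"unknown role {role!r}; expected one of: {known}") from None
```

An entrypoint script could then hand the result to os.execvp(argv[0], argv) so the chosen process replaces the launcher. In a compose stack the same effect is usually achieved by giving each service its own command while pointing both at one image.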


0005 — Linear versions only, no branches (2026-06-07)

Status: Accepted

Context: Steam supports branches (stable / beta / internal). Useful for a real game publisher; overkill here.

Decision: Versions form a linear ordered list per game. No branches in MVP.

Consequences: Catalog data model is simpler (just an ordinal on Version). Direct/indirect update routing logic is unchanged from the prototype. If we ever want betas, we add a branch column and migrate.

Alternatives considered: Steam-style branches. Rejected for MVP: no current need.
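
A rough sketch of how update routing could look over a linear version list. The prototype's actual routing logic is not shown in this log, so the rule below is an assumption: a "direct" update is a single patch from the installed ordinal to the target, and an "indirect" update replays one patch per consecutive version. The name update_route and the direct_patches set are hypothetical.

```python
from __future__ import annotations


def update_route(
    installed: int,
    target: int,
    direct_patches: set[tuple[int, int]],
) -> list[tuple[int, int]]:
    """Return the (from_ordinal, to_ordinal) hops needed to reach target.

    Assumption: with no branches, ordinals alone identify versions of a game,
    so a route is just a list of ordinal pairs.
    """
    if target < installed:
        raise ValueError("downgrades are not modelled in this sketch")
    if installed == target:
        return []  # already up to date
    if (installed, target) in direct_patches:
        return [(installed, target)]  # direct: one patch covers the whole jump
    # Indirect: walk the linear history one version at a time.
    return [(n, n + 1) for n in range(installed, target)]
```

For example, update_route(1, 4, {(1, 4)}) yields a single hop, while update_route(1, 4, set()) yields the three consecutive hops 1→2, 2→3, 3→4. Adding branches later would mean keying the route on a branch as well as the ordinal.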


0006 — Public-by-default catalog with an is_private flag (2026-06-07)

Status: Accepted

Context: Real entitlements ("Tim owns Game X, Tom doesn't") add an entitlements service. Friend-scale doesn't justify it.

Decision: Single boolean is_private on Game. Public games are visible to anyone logged in. Private games are admin-only (future: explicit grants).

Consequences: No entitlements service. If we want per-user grants later, add a game_user_grants table without breaking anything.

Alternatives considered: Full Steam-style ownership. Rejected as premature.
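
The visibility rule can be sketched as a small filter. The Game and User classes here are plain stand-ins rather than the project's SQLAlchemy models, and the is_admin field and visible_games name are assumptions made for illustration; only is_private comes from the decision above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Game:
    id: int
    title: str
    is_private: bool = False  # the single flag from this ADR


@dataclass(frozen=True)
class User:
    username: str
    is_admin: bool = False  # hypothetical field; the real model may differ


def visible_games(user: User | None, games: list[Game]) -> list[Game]:
    """Apply the ADR's rule: public = any logged-in user, private = admins only."""
    if user is None:
        return []  # not logged in: nothing is visible
    if user.is_admin:
        return list(games)
    return [g for g in games if not g.is_private]
```

If per-user grants are added later, the last branch is the only place that would need to consult a game_user_grants lookup.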


0007 — Tauri client (Rust shell + Svelte UI) (2026-06-07)

Status: Accepted

Context: The original prototype client was PyQt5. Tauri is smaller and more modern, builds tiny installers, and lets the UI be written in web tech.

Decision: Client is a Tauri 2 app with a Svelte UI inside.

Consequences: Need to either port patch-application logic to Rust or ship a Python sidecar that the Tauri shell invokes. UI is web tech (good ecosystem). Installer is small.

Alternatives considered: Keep PyQt5 (familiar but dated), Electron (huge install), web-only (loses native install/launch).


0008 — Username + password auth, admin-provisioned (2026-06-07)

Status: Accepted

Context: Options were Discord OAuth (natural fit since friends use Discord), self-hosted SSO (Authentik/Keycloak), or username/password.

Decision: Username + password with argon2id hashing. An admin provisions accounts manually via the admin portal.

Consequences: Simplest option. No OAuth integration. No self-serve signup. Adding Discord OAuth later is straightforward.

Alternatives considered: Discord OAuth (declined: adds a dependency on Discord availability), magic-link email (needs SMTP).


0009 — Border-server reverse proxy + Tailscale backhaul, not cluster-side ingress (2026-06-07)

Status: Accepted

Context: Friends shouldn't need to install Tailscale to use the service. The backend VM shouldn't be on the public internet.

Decision: Public DNS → border server (public IP) → Nginx Proxy Manager terminates TLS via Let's Encrypt → Tailscale backhaul → VM. The VM has no public exposure.

Consequences: Friends use a regular browser/client over HTTPS. TLS lives at the border, not in the cluster. Cluster-side cert-manager is not needed.

Alternatives considered: Tailscale on every friend's machine (rejected: bad UX), public IP on the VM (rejected: security surface).