# Provisioning
Central control plane for a Proxmox homelab cluster. Handles PXE booting bare metal, unattended Proxmox installation, cluster join, and host lifecycle management.
## What it does
1. Operator registers a host (MAC address + server type)
2. Operator triggers "rebuild with Proxmox"
3. Provisioning generates an ephemeral SSH key pair for this rebuild
4. Host PXE boots → dnsmasq responds → iPXE chain-loads the Proxmox installer
5. Installer fetches a per-host answer file (TOML) with the ephemeral public key
6. Proxmox installs unattended → post-install webhook fires
7. Host reboots → first-boot script phones home with IP + hardware ID
8. Provisioning SSHes in using the ephemeral key → `pvecm add` joins the cluster
9. Ephemeral key removed from both the host and database
10. Host registered in Infrastructure → marked ready

The admin dashboard shows real-time progress via Server-Sent Events (SSE).
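
Step 3 above generates a throwaway key pair per rebuild. As a rough illustration only (not the project's actual code), the sketch below creates an Ed25519 key with the Go standard library and renders the public half as an `authorized_keys` line, which is the form an answer file or first-boot script would typically carry. The choice of Ed25519, the function name `authorizedKeyLine`, and the comment string are assumptions.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"crypto/rand"
	"encoding/base64"
	"encoding/binary"
	"fmt"
)

// authorizedKeyLine encodes an Ed25519 public key in the OpenSSH
// authorized_keys format: "ssh-ed25519 <base64 blob> <comment>".
// The blob is two length-prefixed fields: the key type, then the raw key.
// (Hypothetical helper; the key type is an assumption.)
func authorizedKeyLine(pub ed25519.PublicKey, comment string) string {
	const keyType = "ssh-ed25519"
	var blob bytes.Buffer
	for _, field := range [][]byte{[]byte(keyType), pub} {
		var n [4]byte
		binary.BigEndian.PutUint32(n[:], uint32(len(field)))
		blob.Write(n[:])
		blob.Write(field)
	}
	encoded := base64.StdEncoding.EncodeToString(blob.Bytes())
	return fmt.Sprintf("%s %s %s", keyType, encoded, comment)
}

func main() {
	// A fresh key pair per rebuild; the private half never leaves the control plane.
	pub, _, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	fmt.Println(authorizedKeyLine(pub, "rebuild-example"))
}
```

Because the key exists only for the duration of one rebuild, deleting it from the host and the database (step 9) leaves nothing long-lived to rotate.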
## Host States
```
registered → pxe_ready → pxe_booted → installing → installed → first_boot → joining → ready
failed
```
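
The diagram above maps naturally onto a lookup table. The following is a minimal sketch of a table-driven transition check, not the real `statemachine` package. It assumes that any state other than `ready` or `failed` may fall into `failed`, and it does not model recovery out of `failed` (for example via a new rebuild). The type and function names are hypothetical.

```go
package main

import "fmt"

// State is a host lifecycle state, named as in the diagram above.
type State string

const (
	Registered State = "registered"
	PXEReady   State = "pxe_ready"
	PXEBooted  State = "pxe_booted"
	Installing State = "installing"
	Installed  State = "installed"
	FirstBoot  State = "first_boot"
	Joining    State = "joining"
	Ready      State = "ready"
	Failed     State = "failed"
)

// next maps each state to the single state that follows it on the happy path.
var next = map[State]State{
	Registered: PXEReady,
	PXEReady:   PXEBooted,
	PXEBooted:  Installing,
	Installing: Installed,
	Installed:  FirstBoot,
	FirstBoot:  Joining,
	Joining:    Ready,
}

// CanTransition reports whether a host may move from one state to another.
// Assumption: every state except ready and failed may transition to failed.
func CanTransition(from, to State) bool {
	if to == Failed {
		return from != Ready && from != Failed
	}
	n, ok := next[from]
	return ok && n == to
}

func main() {
	fmt.Println(CanTransition(Registered, PXEReady))   // allowed: next step on the happy path
	fmt.Println(CanTransition(Registered, Installing)) // rejected: skips intermediate states
	fmt.Println(CanTransition(Installing, Failed))     // allowed under the assumption above
}
```

Keeping the transitions in a single table means adding a state is a one-line change and illegal jumps are rejected in one place.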
## Deploy
### Prerequisites
- Docker + Docker Compose on the target host
- Host must be on the same network as the bare-metal nodes (for PXE/DHCP)
- Registry access to `gitea.thewrightserver.net`

No static SSH keys are required: Provisioning generates ephemeral keys per rebuild automatically.
### Setup
```bash
# Log in to the container registry
docker login gitea.thewrightserver.net
# Run the install script (creates /opt/provisioning with config templates)
curl -sf https://gitea.thewrightserver.net/josh/Provisioning/raw/branch/main/deploy/install.sh | bash
```
Or manually:
```bash
mkdir -p /opt/provisioning && cd /opt/provisioning
curl -sfO https://gitea.thewrightserver.net/josh/Provisioning/raw/branch/main/docker-compose.yml
curl -sf https://gitea.thewrightserver.net/josh/Provisioning/raw/branch/main/deploy/provisioning.example.yaml -o provisioning.yaml
curl -sf https://gitea.thewrightserver.net/josh/Provisioning/raw/branch/main/deploy/server-types.example.yaml -o server-types.yaml
```
### Configure
Edit `provisioning.yaml`:
| Key | Description |
|-----|-------------|
| `server.public_url` | LAN-reachable URL (e.g. `http://192.168.1.100:8080`) |
| `pxe.interface` | NIC name on the host (e.g. `eth0`, `enp2s0`) |
| `pxe.subnet` | LAN CIDR for proxy-DHCP |
| `proxmox.existing_node` | IP of any current cluster member |
| `proxmox.join_fingerprint` | From `pvecm status` on an existing node |
| `credentials.root_password_hash` | Generate with `mkpasswd -m sha-512` |
| `infrastructure.base_url` | URL of the Infrastructure service |
| `infrastructure.server_type_map` | Maps local type keys to Infrastructure IDs |
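
Two of the values above are easy to get subtly wrong: `server.public_url` must be an absolute URL that booting hosts can reach, and `pxe.subnet` must be a CIDR rather than a bare address. The sketch below shows one way to sanity-check them with the Go standard library. The function name `checkConfig` is hypothetical, and whether Provisioning performs equivalent validation at startup is not stated here.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
)

// checkConfig validates two provisioning.yaml values (hypothetical helper).
func checkConfig(publicURL, subnet string) error {
	u, err := url.Parse(publicURL)
	if err != nil {
		return fmt.Errorf("server.public_url: %w", err)
	}
	if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
		return errors.New("server.public_url: want an absolute http(s) URL")
	}
	// ParseCIDR rejects a bare IP such as "192.168.1.0" (no prefix length).
	if _, _, err := net.ParseCIDR(subnet); err != nil {
		return fmt.Errorf("pxe.subnet: %w", err)
	}
	return nil
}

func main() {
	fmt.Println(checkConfig("http://192.168.1.100:8080", "192.168.1.0/24"))
	fmt.Println(checkConfig("http://192.168.1.100:8080", "192.168.1.0"))
}
```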

Edit `server-types.yaml` with your actual hardware types:
```yaml
server_types:
  minisforum-ms-01:
    display_name: "Minisforum MS-01"
    boot_disk: "/dev/nvme0n1"
    management_nic: "enp2s0"
    gpu: false
  minisforum-um790:
    display_name: "Minisforum UM790 Pro"
    boot_disk: "/dev/nvme0n1"
    management_nic: "enp1s0"
    gpu: true
### Run
```bash
docker compose up -d
```
Dashboard at `http://<host>:8080`.
### Update
```bash
docker compose pull
docker compose up -d
```
## API
### Host Management
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/hosts` | List all hosts |
| GET | `/api/hosts/{id}` | Get host details |
| POST | `/api/hosts` | Register host (`hostname`, `mac`, `server_type`) |
| DELETE | `/api/hosts/{id}` | Remove host |
| POST | `/api/hosts/{id}/rebuild` | Start rebuild operation |
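
As an illustration of the registration call, the sketch below builds a `POST /api/hosts` request in Go. It assumes the three fields are sent as top-level keys of a JSON body. The struct name, the example hostname and MAC, and the base URL are placeholders, and the response format is not shown because it is not documented here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// registerRequest mirrors the fields listed for POST /api/hosts.
// Assumption: they are top-level keys in a JSON body.
type registerRequest struct {
	Hostname   string `json:"hostname"`
	MAC        string `json:"mac"`
	ServerType string `json:"server_type"`
}

// newRegisterRequest builds (but does not send) the registration request.
func newRegisterRequest(baseURL string, body registerRequest) (*http.Request, error) {
	payload, err := json.Marshal(body)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/hosts", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newRegisterRequest("http://192.168.1.100:8080", registerRequest{
		Hostname:   "pve-node-01", // placeholder values
		MAC:        "aa:bb:cc:dd:ee:ff",
		ServerType: "minisforum-ms-01",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```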
### Boot Flow (called by PXE-booting hosts)
| Method | Path | Description |
|--------|------|-------------|
| GET | `/ipxe/{mac}` | iPXE boot script |
| POST | `/api/boot/answer` | Proxmox answer file (TOML) |
| POST | `/api/hosts/{id}/installed` | Post-install webhook |
| GET | `/api/hosts/{id}/first-boot-script` | First-boot shell script |
| POST | `/api/hosts/{id}/phone-home` | First-boot reports IP + hardware ID |
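
Because `/ipxe/{mac}` looks a host up by MAC address, the same address written as `AA-BB-CC-DD-EE-FF` or `aa:bb:cc:dd:ee:ff` has to resolve to the same host. The sketch below shows one way to canonicalize a MAC with the Go standard library. The helper name `normalizeMAC` is hypothetical, and whether the service normalizes addresses this way is an assumption.

```go
package main

import (
	"fmt"
	"net"
)

// normalizeMAC parses a 48-bit MAC in colon, hyphen, or dotted notation and
// returns it as lower-case, colon-separated hex (hypothetical helper).
func normalizeMAC(s string) (string, error) {
	hw, err := net.ParseMAC(s)
	if err != nil {
		return "", err
	}
	// net.ParseMAC also accepts 8- and 20-byte addresses; reject those here.
	if len(hw) != 6 {
		return "", fmt.Errorf("want a 6-byte MAC, got %d bytes", len(hw))
	}
	return hw.String(), nil
}

func main() {
	mac, err := normalizeMAC("AA-BB-CC-DD-EE-FF")
	if err != nil {
		panic(err)
	}
	fmt.Println(mac) // aa:bb:cc:dd:ee:ff
}
```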
### Dashboard
| Method | Path | Description |
|--------|------|-------------|
| GET | `/` | Host grid with live state tiles |
| GET | `/hosts/{id}` | Host detail + operation history |
| GET | `/events` | SSE stream |
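
For readers unfamiliar with SSE, the sketch below formats a single frame the way the `text/event-stream` format requires: an optional `event:` line, one `data:` line per line of payload, and a blank line as terminator. The event name and payload are invented for illustration, and the actual event schema on `/events` is not documented here.

```go
package main

import (
	"fmt"
	"strings"
)

// formatSSE renders one Server-Sent Events frame. Multi-line data becomes
// one "data:" line per input line; the trailing blank line ends the frame.
func formatSSE(event, data string) string {
	var b strings.Builder
	if event != "" {
		fmt.Fprintf(&b, "event: %s\n", event)
	}
	for _, line := range strings.Split(data, "\n") {
		fmt.Fprintf(&b, "data: %s\n", line)
	}
	b.WriteString("\n")
	return b.String()
}

func main() {
	// "host_state" and the JSON payload are hypothetical examples.
	fmt.Print(formatSSE("host_state", `{"id":1,"state":"installing"}`))
}
```

In a browser, `new EventSource("/events")` consumes frames of this shape without any extra client library.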
## Development
```bash
# Run tests
go test ./...
# Run locally (PXE disabled)
cp deploy/provisioning.example.yaml provisioning.yaml
cp deploy/server-types.example.yaml server-types.yaml
# Edit provisioning.yaml: set pxe.enabled=false, infrastructure.base_url=""
go run ./cmd/provisioning -config provisioning.yaml
# Build binary
make build
# Build Docker image locally
make docker
```
## Architecture
```
cmd/provisioning/ Entry point, wiring, shutdown
internal/
config/ YAML config + hot-reloaded server types (fsnotify)
db/ SQLite (WAL mode, embedded migrations)
model/ Domain types (Host, Operation, Image, ServerType)
store/ SQL stores (hosts, operations, locks, images)
statemachine/ Table-driven host state machine
events/ SSE fan-out hub
pxe/ dnsmasq supervisor, iPXE scripts, answer files, first-boot
orchestrator/ Lifecycle driver (ephemeral keys, cluster join, infra registration)
infra/ Infrastructure API client
api/ HTTP handlers (JSON API + HTML dashboard)
httpserver/ chi router assembly
web/ Embedded static assets (CSS, JS)
```
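
To make the "SSE fan-out hub" entry concrete, here is a minimal sketch of the general pattern, not the contents of `internal/events`. Each subscriber gets a buffered channel, and a publish is dropped for any subscriber whose buffer is full so that one slow dashboard client cannot stall the others. The type names, the buffer size of 8, and the drop-on-full policy are all assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub fans each published message out to every current subscriber.
type Hub struct {
	mu   sync.Mutex
	subs map[chan string]struct{}
}

func NewHub() *Hub {
	return &Hub{subs: make(map[chan string]struct{})}
}

// Subscribe registers a buffered channel and returns it with a cancel func
// that unregisters and closes it. Calling cancel more than once is safe.
func (h *Hub) Subscribe() (<-chan string, func()) {
	ch := make(chan string, 8)
	h.mu.Lock()
	h.subs[ch] = struct{}{}
	h.mu.Unlock()
	cancel := func() {
		h.mu.Lock()
		defer h.mu.Unlock()
		if _, ok := h.subs[ch]; ok {
			delete(h.subs, ch)
			close(ch)
		}
	}
	return ch, cancel
}

// Publish delivers msg to every subscriber without blocking; a subscriber
// whose buffer is full misses this message.
func (h *Hub) Publish(msg string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for ch := range h.subs {
		select {
		case ch <- msg:
		default:
		}
	}
}

func main() {
	hub := NewHub()
	events, cancel := hub.Subscribe()
	defer cancel()
	hub.Publish("host 1: installing")
	fmt.Println(<-events)
}
```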
## CI/CD
Gitea Actions workflow (`.gitea/workflows/build.yml`):
- Runs `go test` and `go vet` on every push to main
- Builds Docker image and pushes to `gitea.thewrightserver.net/josh/provisioning:latest`