# Provisioning

Central control plane for a Proxmox homelab cluster. Handles PXE booting bare metal, unattended Proxmox installation, cluster join, and host lifecycle management.

## What it does

1. Operator registers a host (MAC address + server type)
2. Operator triggers "rebuild with Proxmox"
3. Provisioning generates an ephemeral SSH key pair for this rebuild
4. Host PXE boots → dnsmasq responds → iPXE chain-loads the Proxmox installer
5. Installer fetches a per-host answer file (TOML) containing the ephemeral public key
6. Proxmox installs unattended → post-install webhook fires
7. Host reboots → first-boot script phones home with IP + hardware ID
8. Provisioning SSHes in using the ephemeral key → `pvecm add` joins the cluster
9. Ephemeral key is removed from both the host and the database
10. Host is registered in Infrastructure → marked ready

The admin dashboard shows real-time progress via Server-Sent Events (SSE).

## Host States

```
registered → pxe_ready → pxe_booted → installing → installed → first_boot → joining → ready
↓
failed
```

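The state chain above lends itself to a table-driven check. The following is a rough sketch, not the project's `statemachine` package: the type and function names are hypothetical, and it assumes that any in-flight state may move to `failed` and that `ready` and `failed` are terminal (re-entering the chain on a rebuild is not modeled).

```go
package main

import "fmt"

// State is a host lifecycle state, named as in the diagram above.
type State string

const (
	Registered State = "registered"
	PXEReady   State = "pxe_ready"
	PXEBooted  State = "pxe_booted"
	Installing State = "installing"
	Installed  State = "installed"
	FirstBoot  State = "first_boot"
	Joining    State = "joining"
	Ready      State = "ready"
	Failed     State = "failed"
)

// transitions lists the allowed forward move for each in-flight state.
// Terminal states (ready, failed) deliberately have no entry.
var transitions = map[State][]State{
	Registered: {PXEReady},
	PXEReady:   {PXEBooted},
	PXEBooted:  {Installing},
	Installing: {Installed},
	Installed:  {FirstBoot},
	FirstBoot:  {Joining},
	Joining:    {Ready},
}

// CanTransition reports whether a host may move from one state to another.
func CanTransition(from, to State) bool {
	next, ok := transitions[from]
	if !ok {
		return false // terminal or unknown state
	}
	if to == Failed {
		return true // assumption: every in-flight state may fail
	}
	for _, s := range next {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Installing, Installed))
	fmt.Println(CanTransition(Registered, Ready))
}
```

Keeping the rules in a single map means adding a state is a data change rather than a new branch in the control flow.
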
## Deploy

### Prerequisites

- Docker + Docker Compose on the target host
- Host must be on the same network as the bare-metal nodes (for PXE/DHCP)
- Registry access to `gitea.thewrightserver.net`

No static SSH keys are required: Provisioning generates ephemeral keys automatically for each rebuild.

### Setup

```bash
mkdir -p /opt/provisioning
cd /opt/provisioning

# Log in to the container registry
docker login gitea.thewrightserver.net

# Pull the compose file
curl -sO https://gitea.thewrightserver.net/josh/Provisioning/raw/branch/main/docker-compose.yml

# Pull example configs
curl -s https://gitea.thewrightserver.net/josh/Provisioning/raw/branch/main/deploy/provisioning.example.yaml -o provisioning.yaml
curl -s https://gitea.thewrightserver.net/josh/Provisioning/raw/branch/main/deploy/server-types.example.yaml -o server-types.yaml
```

### Configure

Edit `provisioning.yaml`:

| Key | Description |
|-----|-------------|
| `server.public_url` | LAN-reachable URL (e.g. `http://192.168.1.100:8080`) |
| `pxe.interface` | NIC name on the host (e.g. `eth0`, `enp2s0`) |
| `pxe.subnet` | LAN CIDR for proxy-DHCP |
| `proxmox.existing_node` | IP of any current cluster member |
| `proxmox.join_fingerprint` | From `pvecm status` on an existing node |
| `credentials.root_password_hash` | Generate with `mkpasswd -m sha-512` |
| `infrastructure.base_url` | URL of the Infrastructure service |
| `infrastructure.server_type_map` | Maps local type keys to Infrastructure IDs |

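Several of these values are easy to get subtly wrong (a URL without a scheme, a subnet without a prefix length). As an illustration only, the sketch below sanity-checks three of the keys using the Go standard library. The `Config` struct and `Validate` method are hypothetical and do not reflect how Provisioning itself loads or validates its configuration.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
	"net/url"
)

// Config mirrors three keys from provisioning.yaml. The struct is a
// hypothetical stand-in for whatever the real config package defines.
type Config struct {
	PublicURL    string // server.public_url
	Subnet       string // pxe.subnet
	ExistingNode string // proxmox.existing_node
}

// Validate collects every problem it finds instead of stopping at the first.
func (c Config) Validate() error {
	var errs []error

	u, err := url.Parse(c.PublicURL)
	if err != nil || u.Scheme == "" || u.Host == "" {
		errs = append(errs, fmt.Errorf("server.public_url %q is not an absolute URL", c.PublicURL))
	}
	if _, err := netip.ParsePrefix(c.Subnet); err != nil {
		errs = append(errs, fmt.Errorf("pxe.subnet %q is not a CIDR: %w", c.Subnet, err))
	}
	if _, err := netip.ParseAddr(c.ExistingNode); err != nil {
		errs = append(errs, fmt.Errorf("proxmox.existing_node %q is not an IP: %w", c.ExistingNode, err))
	}

	// errors.Join returns nil when errs is empty.
	return errors.Join(errs...)
}

func main() {
	cfg := Config{
		PublicURL:    "http://192.168.1.100:8080",
		Subnet:       "192.168.1.0/24",
		ExistingNode: "192.168.1.10", // example address, not from the README
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config looks plausible")
}
```
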
Edit `server-types.yaml` with your actual hardware types:

```yaml
server_types:
  minisforum-ms-01:
    display_name: "Minisforum MS-01"
    boot_disk: "/dev/nvme0n1"
    management_nic: "enp2s0"
    gpu: false
    hostname_prefix: "pve-ms"

  minisforum-um790:
    display_name: "Minisforum UM790 Pro"
    boot_disk: "/dev/nvme0n1"
    management_nic: "enp1s0"
    gpu: true
    hostname_prefix: "pve-um"
```

### Run

```bash
docker compose up -d
```

Dashboard at `http://<host>:8080`.

### Update

```bash
docker compose pull
docker compose up -d
```

## API

### Host Management

| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/hosts` | List all hosts |
| GET | `/api/hosts/{id}` | Get host details |
| POST | `/api/hosts` | Register host (`hostname`, `mac`, `server_type`) |
| DELETE | `/api/hosts/{id}` | Remove host |
| POST | `/api/hosts/{id}/rebuild` | Start rebuild operation |

### Boot Flow (called by PXE-booting hosts)

| Method | Path | Description |
|--------|------|-------------|
| GET | `/ipxe/{mac}` | iPXE boot script |
| POST | `/api/boot/answer` | Proxmox answer file (TOML) |
| POST | `/api/hosts/{id}/installed` | Post-install webhook |
| GET | `/api/hosts/{id}/first-boot-script` | First-boot shell script |
| POST | `/api/hosts/{id}/phone-home` | First-boot reports IP + hardware ID |

### Dashboard

| Method | Path | Description |
|--------|------|-------------|
| GET | `/` | Host grid with live state tiles |
| GET | `/hosts/{id}` | Host detail + operation history |
| GET | `/events` | SSE stream |

## Development

```bash
# Run tests
go test ./...

# Run locally (PXE disabled)
cp deploy/provisioning.example.yaml provisioning.yaml
cp deploy/server-types.example.yaml server-types.yaml
# Edit provisioning.yaml: set pxe.enabled=false, infrastructure.base_url=""
go run ./cmd/provisioning -config provisioning.yaml

# Build binary
make build

# Build Docker image locally
make docker
```

## Architecture

```
cmd/provisioning/   Entry point, wiring, shutdown
internal/
  config/           YAML config + hot-reloaded server types (fsnotify)
  db/               SQLite (WAL mode, embedded migrations)
  model/            Domain types (Host, Operation, Image, ServerType)
  store/            SQL stores (hosts, operations, locks, images)
  statemachine/     Table-driven host state machine
  events/           SSE fan-out hub
  pxe/              dnsmasq supervisor, iPXE scripts, answer files, first-boot
  orchestrator/     Lifecycle driver (ephemeral keys, cluster join, infra registration)
  infra/            Infrastructure API client
  api/              HTTP handlers (JSON API + HTML dashboard)
  httpserver/       chi router assembly
web/                Embedded static assets (CSS, JS)
```

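To make the "SSE fan-out hub" entry concrete, here is one common way to structure such a hub in Go: each subscriber gets a buffered channel, and publishing never blocks on a slow subscriber. This is a generic sketch with hypothetical names, not the code in `events/`; in particular, dropping messages for slow subscribers is a design choice assumed here.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub fans each published message out to every current subscriber.
type Hub struct {
	mu   sync.Mutex
	subs map[chan string]struct{}
}

func NewHub() *Hub {
	return &Hub{subs: make(map[chan string]struct{})}
}

// Subscribe registers a subscriber with the given buffer size and returns
// its channel plus a cancel function that unsubscribes and closes it.
func (h *Hub) Subscribe(buf int) (<-chan string, func()) {
	ch := make(chan string, buf)
	h.mu.Lock()
	h.subs[ch] = struct{}{}
	h.mu.Unlock()

	cancel := func() {
		h.mu.Lock()
		defer h.mu.Unlock()
		if _, ok := h.subs[ch]; ok { // guard makes cancel safe to call twice
			delete(h.subs, ch)
			close(ch)
		}
	}
	return ch, cancel
}

// Publish delivers msg to all subscribers without blocking. A subscriber
// whose buffer is full misses the message.
func (h *Hub) Publish(msg string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for ch := range h.subs {
		select {
		case ch <- msg:
		default:
		}
	}
}

func main() {
	hub := NewHub()
	events, cancel := hub.Subscribe(8)
	defer cancel()

	hub.Publish("host pve-ms-01: installing") // hypothetical message
	fmt.Println(<-events)
}
```

An SSE handler would typically call `Subscribe` when a client connects, write each received message to the response, and call the cancel function when the request context ends.
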
## CI/CD

Gitea Actions workflow (`.gitea/workflows/build.yml`):
- Runs `go test` and `go vet` on every push to main
- Builds Docker image and pushes to `gitea.thewrightserver.net/josh/provisioning:latest`