Commit Graph

25 Commits

Author SHA1 Message Date
josh 44c358a89b Fix initrd loading with explicit iPXE name binding
build-and-push / test (push) Successful in 34s
build-and-push / build-and-push (push) Successful in 1m14s
iPXE needs --name on the initrd command and initrd=<name> on the
kernel line to properly pass the initrd to the kernel. Without this,
the kernel never receives the initrd, causing VFS mount failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-14 11:03:03 -04:00
josh 846b6847a5 Fix kernel panic by adding ramdisk_size to iPXE kernel params
build-and-push / test (push) Successful in 35s
build-and-push / build-and-push (push) Successful in 1m12s
The Proxmox initrd is too large for the default ramdisk allocation,
causing VFS to fail mounting root. Add ramdisk_size=16777216 (the value
is in KiB, so 16 GiB) along with rw, quiet, and splash=verbose for
proper installer boot.
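
The ramdisk_size value is expressed in KiB, so the arithmetic behind 16777216 can be checked with a short helper. The function name RamdiskSizeKiB is hypothetical and exists only for illustration.

```go
package main

import "fmt"

// RamdiskSizeKiB converts a size in GiB to the KiB value expected by the
// Linux ramdisk_size= kernel parameter (hypothetical helper).
func RamdiskSizeKiB(gib uint64) uint64 {
	return gib * 1024 * 1024 // 1 GiB = 1024 * 1024 KiB
}

func main() {
	fmt.Printf("ramdisk_size=%d\n", RamdiskSizeKiB(16)) // prints "ramdisk_size=16777216"
}
```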

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-14 10:55:34 -04:00
josh c75a47d299 Add .claude/ directory and SQLite WAL files to .gitignore
build-and-push / test (push) Successful in 34s
build-and-push / build-and-push (push) Successful in 1m7s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-14 10:51:30 -04:00
josh 2a0fbf6923 Remove unused hostname_prefix from server types and add duplicate checking
build-and-push / test (push) Successful in 35s
build-and-push / build-and-push (push) Successful in 56s
The HostnamePrefix field on ServerType was loaded from YAML but never used,
because hostnames are user-provided. This removes the field and adds explicit
duplicate checks (hostname + MAC) with clear per-field error messages in
both the JSON API and web UI, backed by a new GetByHostname store method
with case-insensitive matching.
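
A rough sketch of the duplicate check, assuming an in-memory store. Only the GetByHostname name and the case-insensitive matching come from the message; the Host and Store types, CheckDuplicates, and the error wording are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Host holds the two fields that must be unique.
type Host struct {
	Hostname string
	MAC      string
}

// Store is a hypothetical in-memory stand-in for the real host store.
type Store struct{ hosts []Host }

// GetByHostname returns the first host whose name matches case-insensitively.
func (s *Store) GetByHostname(name string) (Host, bool) {
	for _, h := range s.hosts {
		if strings.EqualFold(h.Hostname, name) {
			return h, true
		}
	}
	return Host{}, false
}

// CheckDuplicates returns per-field error messages keyed by field name.
// An empty map means the candidate does not collide with an existing host.
func (s *Store) CheckDuplicates(candidate Host) map[string]string {
	errs := map[string]string{}
	if _, ok := s.GetByHostname(candidate.Hostname); ok {
		errs["hostname"] = fmt.Sprintf("hostname %q is already registered", candidate.Hostname)
	}
	for _, h := range s.hosts {
		if strings.EqualFold(h.MAC, candidate.MAC) {
			errs["mac"] = fmt.Sprintf("MAC %s is already registered", candidate.MAC)
			break
		}
	}
	return errs
}

func main() {
	s := &Store{hosts: []Host{{Hostname: "pve-01", MAC: "aa:bb:cc:dd:ee:01"}}}
	// Differs only in case, so the hostname check reports a duplicate.
	fmt.Println(s.CheckDuplicates(Host{Hostname: "PVE-01", MAC: "aa:bb:cc:dd:ee:02"}))
}
```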

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-14 10:50:16 -04:00
josh 1317ff6369 Add job detail page with activity log and cancel support
build-and-push / test (push) Successful in 34s
build-and-push / build-and-push (push) Successful in 1m8s
Operations are now clickable from the host detail page, linking to
/ops/{id} which shows the operation info, host link, duration, and
activity log filtered to that operation. Active operations can be
cancelled, which transitions the host to failed and releases the lock.
SSE activity events now include operation_id for real-time filtering.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-14 10:37:18 -04:00
josh 5ff1cff7d4 Fix iPXE chainload loop by excluding iPXE from pxe-service
build-and-push / test (push) Successful in 37s
build-and-push / build-and-push (push) Successful in 1m12s
iPXE was stuck in a loop: boot iPXE -> DHCP -> get ipxe.0 again ->
boot iPXE -> repeat. Add tag:!ipxe to pxe-service directives so
iPXE clients get the HTTP script URL via dhcp-boot instead of being
served the bootloader again.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-14 10:23:49 -04:00
josh 78a20770dd Fix nil Activity store in test setup causing panic on rebuild
build-and-push / test (push) Successful in 34s
build-and-push / build-and-push (push) Successful in 1m7s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-13 23:32:50 -04:00
josh a6603b463f Add activity log system for provisioning lifecycle visibility
build-and-push / test (push) Failing after 32s
build-and-push / build-and-push (push) Has been skipped
When a host got stuck in a state like pxe_ready, nothing showed why.
This adds a persistent activity log that records every meaningful
step (state transitions, PXE events, cluster join stages, failures)
and surfaces it on the host detail page with live SSE updates.
Includes a stuck-detection warning banner when a host sits in
pxe_ready for >10 minutes with no iPXE request.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-13 23:30:21 -04:00
josh c3a1cf99f9 Switch to pxe-service for proxy DHCP boot and restore host filtering
build-and-push / test (push) Successful in 37s
build-and-push / build-and-push (push) Successful in 1m14s
dhcp-boot alone does not send PXE vendor extensions (option 43) that
PXE clients need in proxy DHCP mode. Switch to pxe-service directives
for initial PXE boot, keep dhcp-boot only for iPXE chainloading.
Create .0 symlinks to match the pxe-service filename convention
(dnsmasq appends .0 to the configured basename). Restore
dhcp-ignore=tag:!known filtering.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-13 23:04:29 -04:00
josh df78f881bb Remove dhcp-ignore filter to debug proxy DHCP non-response
build-and-push / test (push) Successful in 36s
build-and-push / build-and-push (push) Successful in 1m14s
dnsmasq sees PXE requests but never responds. Remove the known-host
filter to determine if tag matching is the issue or if the problem
is elsewhere in the proxy DHCP flow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-13 22:35:29 -04:00
josh dfcf91c949 Fix dhcp-hostsfile to explicitly set known tag for PXE clients
build-and-push / test (push) Successful in 37s
build-and-push / build-and-push (push) Successful in 1m34s
Bare MACs in dhcp-hostsfile were not auto-setting the known tag in
proxy DHCP mode, causing dhcp-ignore=tag:!known to drop all requests.
Explicitly write set:known per host entry.
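
Writing the explicit tag per entry might be done as in this sketch. The HostsFile function and the lower-casing of MACs are assumptions; the set:known suffix is what the message describes.

```go
package main

import (
	"fmt"
	"strings"
)

// HostsFile renders dhcp-hostsfile content with one line per MAC address,
// each explicitly setting the "known" tag (hypothetical helper).
func HostsFile(macs []string) string {
	var b strings.Builder
	for _, mac := range macs {
		// Each line uses dhcp-host syntax: <hwaddr>,set:<tag>
		fmt.Fprintf(&b, "%s,set:known\n", strings.ToLower(mac))
	}
	return b.String()
}

func main() {
	fmt.Print(HostsFile([]string{"AA:BB:CC:DD:EE:01", "aa:bb:cc:dd:ee:02"}))
}
```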

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-13 22:08:48 -04:00
josh ba5440a481 Fix dnsmasq not responding to PXE clients and seed iPXE binaries
build-and-push / test (push) Successful in 42s
build-and-push / build-and-push (push) Successful in 1m17s
Remove the tag:known filter from dhcp-range, because in proxy DHCP mode
the tag filter prevents responses. dhcp-ignore=tag:!known still filters
unknown hosts. Also copy ipxe.efi and undionly.kpxe from the system
ipxe package into the TFTP root at startup so clients can actually
download the bootloader.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-13 22:01:01 -04:00
josh 05bb242f50 Fix dnsmasq crash by creating tftp-root dir and using subnet config
build-and-push / test (push) Successful in 38s
build-and-push / build-and-push (push) Successful in 1m22s
dnsmasq exited with status 3 because the tftp-root directory didn't
exist at startup. Also replaced hardcoded 192.168.1.0 in dhcp-range
with the configured subnet value.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-13 21:47:07 -04:00
josh 0bf1a62897 Redesign frontend with clean light theme and design system
build-and-push / test (push) Successful in 37s
build-and-push / build-and-push (push) Successful in 1m18s
Replace prototype dark theme with a professional light-theme design
using Outfit (UI) and IBM Plex Mono (data) fonts, navy topbar, white
card surfaces, and a full CSS variable system for colors, shadows,
spacing, and radii. Add LED status indicators, panel components,
and structured tile layout with header/meta/footer sections.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-10 22:00:53 -04:00
josh 443a3db9e1 Add upload progress bar with SSE extraction status
build-and-push / test (push) Successful in 40s
build-and-push / build-and-push (push) Successful in 1m8s
ISO uploads now show a progress bar during file transfer (via XHR
upload.onprogress) and real-time extraction status (via SSE events
through the existing Hub). Falls back to plain form POST if JS is
disabled.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-10 19:16:19 -04:00
josh 4774600040 Add boot image management with ISO extraction and serving
build-and-push / test (push) Successful in 34s
build-and-push / build-and-push (push) Successful in 1m7s
Upload Proxmox ISOs via API or dashboard UI, extract kernel+initrd
using pure-Go iso9660 library, store on disk, and serve over HTTP
for PXE booting. Dynamic kernel/initrd filenames per image replace
the previous hardcoded paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-09 21:26:31 -04:00
josh da2d72e95d Fix static file serving by using fs.Sub on embed FS
build-and-push / test (push) Successful in 34s
build-and-push / build-and-push (push) Successful in 1m5s
The //go:embed static directive nests files under static/, so after
StripPrefix removes /static/ from the URL, the FileServer couldn't find
the files. Use fs.Sub to root the FS at the static/ subdirectory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-09 20:54:15 -04:00
josh 76b9f64141 Remove keys volume mount and add install script
build-and-push / test (push) Successful in 36s
build-and-push / build-and-push (push) Successful in 1m8s
- Remove /etc/provisioning/keys mount (ephemeral keys are in-memory now)
- Remove /etc/provisioning VOLUME from Dockerfile
- Add deploy/install.sh that creates config files before docker compose up,
  preventing Docker from creating directories in place of missing bind mounts
- Update README with install script usage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-09 20:48:35 -04:00
josh 4dcd1f943b Disable Go module cache in CI to avoid 4m+ timeout
build-and-push / test (push) Successful in 41s
build-and-push / build-and-push (push) Successful in 1m22s
Gitea's cache server is unreachable, causing setup-go to block on a
failed cache restore. Disable it since the Docker build layer caches
dependencies independently.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-03 21:44:12 -04:00
josh a12755522f Fix .gitignore excluding cmd/provisioning directory
build-and-push / build-and-push (push) Has been cancelled
build-and-push / test (push) Has been cancelled
The pattern `provisioning` matched both the binary and the directory.
Use `/provisioning` to only match at the repo root.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-03 21:39:20 -04:00
josh ca6e8661fc Update README with full API reference and ephemeral key docs
build-and-push / build-and-push (push) Has been cancelled
build-and-push / test (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-03 21:17:37 -04:00
josh b23ef64ee1 Use ephemeral SSH keys per rebuild instead of static config keys
build-and-push / test (push) Successful in 9m57s
build-and-push / build-and-push (push) Has been cancelled
Generate a fresh ed25519 key pair at rebuild time, inject the public key
into the Proxmox answer file, use the private key for cluster join over
SSH, then remove the key from both the remote host and the database.
This eliminates the need to manage static SSH keys in config/secrets.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-03 21:09:22 -04:00
josh aec31b9f8b Add README with deploy instructions
build-and-push / test (push) Successful in 9m57s
build-and-push / build-and-push (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-03 20:59:20 -04:00
josh c06ce6e8bb Add CI/CD pipeline and docker-compose for deployment
build-and-push / test (push) Successful in 10m25s
build-and-push / build-and-push (push) Failing after 33s
- Gitea Actions workflow: test → build → push to container registry
- docker-compose.yml for host deployment (host network for PXE)
- Update example config to use container paths (/data, /etc/provisioning)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-03 20:58:15 -04:00
josh bda568b25c Initial implementation: host lifecycle + PXE + admin dashboard
Go service for Proxmox homelab cluster provisioning. Handles PXE boot,
Proxmox autoinstall (answer file generation), cluster join via SSH,
and Infrastructure API registration.

- Host state machine (registered → pxe_ready → installing → ready)
- dnsmasq supervisor with MAC-based allowlist
- iPXE script and Proxmox answer file generation
- First-boot phone-home → cluster join → infra registration
- Operation locking with expiry (409 on conflict)
- SSE event hub for real-time dashboard updates
- Admin dashboard (host grid, detail, registration form)
- Config-driven server types with hot-reload
- Docker deployment (multi-stage fat image)
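
The host state machine from the list above could be modelled as a transition table. This is a sketch only: the failed state and its incoming edges are assumptions (a failed state is mentioned in a later commit), and the type and function names are hypothetical.

```go
package main

import "fmt"

// State is a host lifecycle state.
type State string

const (
	Registered State = "registered"
	PXEReady   State = "pxe_ready"
	Installing State = "installing"
	Ready      State = "ready"
	Failed     State = "failed" // assumption: not part of the listed happy path
)

// transitions lists the allowed next states for each state. The happy path
// follows registered -> pxe_ready -> installing -> ready.
var transitions = map[State][]State{
	Registered: {PXEReady},
	PXEReady:   {Installing, Failed},
	Installing: {Ready, Failed},
}

// CanTransition reports whether moving from one state to another is allowed.
func CanTransition(from, to State) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Registered, PXEReady)) // allowed step
	fmt.Println(CanTransition(Registered, Ready))    // skips intermediate states
}
```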

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-03 20:55:14 -04:00