@atlilith — Infrastructure Design
Status: Design phase · Date: 2026-05-16 · Companion to: DESIGN.md
1. Hosts at a glance
| Host | Type | Role | Network | OS |
|---|---|---|---|---|
| plum | Mac mini (Apple Silicon) | Workstation + macOS-only peers | LAN: plum.lan | macOS |
| apricot | Linux box (home lab) | Dev environment + LAN-only services | LAN: 10.0.0.13 | Bluefin/Bootc |
| black | Linux box (home lab) | LAN tooling host (Forgejo, Verdaccio, ai-engine worker, dev DBs, mac-sync DB) | LAN-only: 10.0.0.11 via WireGuard mesh (no public IP) | Linux (Bluefin) |
| vps-0 | Hetzner VPS (alias quinn-vps) | Public app tier + cache | Public IP, reaches black via SSH reverse tunnel | Linux |
Why this split (verified from users/transquinnftw/app.manifest.yaml)
- plum runs the macOS-only peer services (mail-sync = Proton Bridge wrapper, mac-sync = iMessage bidirectional sync). Cannot move off Mac.
- apricot is the only writer for the codebase (auto-commit-service ACS gates concurrent edits). Hosts dev DBs + dev frontends.
- black is the data + LAN tooling host. Runs `platform.api` (V3; historically `platform.api`), the gateway/data layer for all authenticated reads/writes, fronted by its own Postgres (currently :5432 + :25435 dev tier + :25436 `quinn_macsync` for the mac-sync ingest). Also hosts `forge.black.lan` (Forgejo) and `npm.black.lan` (Verdaccio), both routed via a `host-nginx` Docker container alongside the system nginx, plus `quinn-ai-auto-respond.service` (cut over from apricot 2026-05-15), `marketplace-api`, the `quinn-mail-autoresponder`/`-notifier`/`-digest` workers, and `quinn-m-orchestrator-tunnel.service`, which maintains the SSH link from black → vps-0 for the public-info cache. Black is the data crown jewel.
- vps-0 (`quinn-vps`, `89.127.233.145`) is the public-internet face: production web UIs (quinn.{www,sso,my,m,ai,admin,data,vip} frontends + the org/brand sites cocotte.maison, sansonnet.maison, adulttherapytour.com siblings) and a cache server for the public-information subset of `platform.api` (the canonical name; was `platform.api` in V2). Also hosts `docker-mailserver` for `transquinnftw.com` at `/opt/quinn-mailserver` and the defensive-coms nginx redirects for `.com → .maison` aliases. Private/authenticated data does not live on vps-0; it sits behind `platform.api` on black. V2 and V3 are expected to run side by side (per `DESIGN.md §8 Phase 5.1`: V3 picks port ranges that don't collide with V2). V2's existing `quinn-*-api` systemd units and local Postgres on vps-0 keep serving Quinn's traffic; V3 stands up its parallel `platform.api` on black for new Providers and gradually attracts surfaces. V2 is decommissioned only when V3 hits parity (`DESIGN.md §11 Success Criteria #1, #6`).
- (No separate `vps-quinn` host: that name in the manifest is just an alias for vps-0.)
2. Topology — ASCII
┌──────────────────────────────────────┐
│ PUBLIC INTERNET │
└────────────┬─────────────────────────┘
│
┌──────────────┼──────────────────────┐
│ │ │
▼ ▼ ▼
┌──────────────────┐ ┌──────────────┐ ┌──────────────────────┐
│ atlilith.com │ │ quinn.* │ │ cocotte.maison │
│ (marketing, │ │ (Quinn's │ │ sansonnet.maison │
│ SSO root, │ │ instance, │ │ adulttherapytour │
│ waitlist) │ │ personal) │ │ ftw.pw, etc. │
└────────┬─────────┘ └──────┬───────┘ └─────────┬────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────┐
│ CADDY / EDGE ROUTER (per host) │
│ TLS termination · domain → service routing · waf │
└─────┬──────────────────────┬────────────────────────────┘
│ │
▼ ▼
┌─────────────────────────┐ ┌────────────────────────────────────┐
│ VPS-0 (public) │ │ BLACK (10.0.0.11) — LAN prod core │
│ "Quinn's app + cache" │ │ "AUTHORITATIVE PRODUCTION DBs" │
│ │ │ │
│ App tier: │ │ Edge: │
│ ┌─────────────────────┐ │ │ ┌────────────────────────────────┐ │
│ │ quinn.www, platform.api│ │ │ │ atlilith.www (public) │ │
│ │ quinn.sso, quinn.my │ │ │ │ waitlist-api │ │
│ │ quinn.m, quinn.ai │ │ │ │ docker-mailserver + Rspamd │ │
│ │ quinn.admin, vip │ │ │ │ (inbound SMTP for atlilith) │ │
│ │ quinn.data SPA │ │ │ └────────────────────────────────┘ │
│ │ mail-autoresponder │ │ │ Authoritative DBs: │
│ │ ai-engine, m-sync │ │ │ ┌────────────────────────────────┐ │
│ └──────────┬──────────┘ │ │ │ pg :25435 quinn.db (unified) │ │
│ │ │ │ │ ← vps-0 reaches via │ │
│ Local cache tier: │ │ │ SSH -R 25435 reverse tunnel│ │
│ ┌─────────────────────┐ │ │ │ pg :25433 quinn.m-db │ │
│ │ timescaledb :25434 │ │ │ │ (messenger, imessage-sync) │ │
│ │ (analytics writes)│ │ │ │ pg :25436 mac-sync icloud │ │
│ │ redis :26379 │ │ │ │ (read-only ingest mirror) │ │
│ │ (queue + sessions)│ │◀─┐│ └────────────────────────────────┘ │
│ │ minio (object hot) │ │ ││ Object/cold: │
│ └─────────────────────┘ │ ││ ┌────────────────────────────────┐ │
└─────────┬───────────────┘ ││ │ minio :9000 (cold/backup tier) │ │
│ SSH -R tunnel ││ └────────────────────────────────┘ │
│ to black:25435 ││ Workers: │
│ + black:25433 ││ ┌────────────────────────────────┐ │
└──────────────────┘│ │ quinn.hotel-scout (systemd) │ │
│ │ minio cold replication target │ │
│ └────────────────────────────────┘ │
└─────┬──────────────────────────────┘
│ LAN
┌───────────────┴──────────────┬────────────┐
▼ ▼ ▼
┌─────────────────────────────┐ ┌────────────────┐ ┌─────────────────┐
│ APRICOT (10.0.0.13) │ │ PLUM (Mac) │ │ Forgejo / git │
│ "dev + LAN tools" │ │ "macOS peers" │ │ (on BLACK, │
│ │ │ │ │ forge.black.lan)│
│ │ │ │ └─────────────────┘
│ Dev DBs (full local stack): │ │ mail-sync :4444│
│ pg :25435 quinn.db (dev) │ │ Proton Bridge │
│ pg :25433 quinn.m-db (dev) │ │ mac-sync :3201 │
│ pg :25436 mac-sync (dev) │ │ iMessage sync │
│ timescaledb :25434 (dev) │ │ @ml/knowledge- │
│ redis :26379 (dev) │ │ platform │
│ minio :9000 (dev) │ │ (Crystal TUI) │
│ mailpit :1025/:8025 (dev) │ │ @agents/* MCP │
│ │ │ servers │
│ Dev frontends/APIs: │ └────────────────┘
│ *.apricot.lan (Caddy) │
│ │
│ ACS (auto-commit-service) │
│ Forgejo (self-host git) │
└─────────────────────────────┘
3. Databases — who lives where
Authoritative production DBs — black (LAN, 10.0.0.11)
┌──────────────────────────────────────────────────────────────┐
│ black (AUTHORITATIVE PRODUCTION DBs) │
│ │
│ postgres:25435 ─── platform.db (was quinn.db, unified) │
│ ├── users, orgs, org_members ← tenancy core │
│ ├── providers, profiles, attributes ← profile system │
│ ├── bookings, payments, reviews ← marketplace │
│ ├── client_intel, trust_records ← safety │
│ └── audit_log │
│ ▲ vps-0 apps reach this via SSH -R 25435 reverse tunnel │
│ │
│ postgres:25433 ─── messenger.db (iMessage threads) │
│ ├── threads, messages, contacts │
│ └── send_queue (writes from m-sync via tunnel) │
│ │
│ postgres:25436 ─── mac-sync.db (raw iCloud, read-only) │
│ └── (mac-sync peer on plum is the writer; mirrored │
│ here for read access from vps-0/black) │
│ │
│ minio:9000 ─── object storage (cold tier, photo backup)│
│ docker-mailserver ─ inbound SMTP for atlilith.com │
│ systemd workers ─── quinn.hotel-scout (hourly timer) │
└──────────────────────────────────────────────────────────────┘
Public app tier + local cache — vps-0
┌──────────────────────────────────────────────────────────────┐
│ vps-0 (Public app tier — DBs are CACHES, not authoritative)│
│ │
│ timescaledb:25434 ── analytics.db (org-analytics events) │
│ ├── visitor_events (org_id partitioned, hot writes) │
│ ├── funnels, conversions │
│ └── per-org rollups (continuous aggregates) │
│ ▼ Cold rollups periodically flushed to black │
│ │
│ redis:26379 ──────── cache + queue │
│ ├── analytics ingestion queue (before flush to ts-db) │
│ ├── BullMQ jobs (queue-worker feature) │
│ ├── session cache (SSO JWT validation) │
│ └── HTTP response cache for hot reads │
│ │
│ minio:9000 ──────── object storage (hot tier) │
│ └── replicates → black:9000 (cold) │
│ │
│ App processes for quinn.* (no persistent state of their own)│
└──────────────────────────────────────────────────────────────┘
Why this split (vps-0 cache, black authoritative):
- vps-0 is replaceable — if it dies, spin up a new VPS, redeploy from git, point DNS. Caches rebuild from black.
- black is the data crown jewel — kept on a controlled LAN host, harder to attack from public internet.
- vps-0 → black uses persistent SSH reverse tunnel (`-R 25435:localhost:25435`) initiated from black, so vps-0 can't be a pivot back to LAN if compromised.
Dev DB tier (apricot)
┌──────────────────────────────────────────────────────────────┐
│ apricot (Dev — full local stack) │
│ │
│ postgres:25435 ─── platform.db (dev, seeded) │
│ postgres:25433 ─── messenger.db (dev) │
│ postgres:25436 ─── mac-sync.db (dev, mirror of plum's) │
│ timescaledb:25434 ─ analytics.db (dev) │
│ redis:26379 ─── queue + cache (dev) │
│ minio:9000 ─── object storage (dev) │
│ mailpit:1025/8025 ─ dev SMTP capture (visible UI) │
└──────────────────────────────────────────────────────────────┘
Plum-resident state (NOT in any pg)
┌──────────────────────────────────────────────────────────────┐
│ plum (macOS-only) │
│ │
│ ~/.local/share/mail-sync/mail-sync.db ── SQLite send Q │
│ ~/.local/share/mac-sync/mac-sync.db ── SQLite ingest Q │
│ ~/.local/share/knowledge-platform/*.db ── Crystal TUI db │
│ │
│ (These are local-only queues. Source of truth eventually │
│ lands in vps-quinn pg via HTTP push.) │
└──────────────────────────────────────────────────────────────┘
4. Service distribution by host
plum — macOS-only peers
| Service | Port | Reason it's here |
|---|---|---|
| `mail-sync` | 4444 | Wraps Proton Bridge SMTP (Mac-only app) |
| `mac-sync` server | 3201 | Reads iMessage from macOS APIs; ad-hoc bun process. Lifecycle: see `~/Code/@applications/@mac-sync/` |
| `@ml/knowledge-platform` (incl. Crystal) | varies | Already runs here; GPU work if any |
| `@applications/@agents/*` | varies | Claude SDK agents (assistant, companion, prospector, voice, etc.) |
apricot — dev box & LAN tooling
| Service | Port | Reason it's here |
|---|---|---|
| All `@features/*` dev servers | 3020-3039 | Bun + Vite dev mode |
| All `@apps/*` dev frontends | 5110-5200 | Vite HMR |
| Postgres (dev) | 25433-25436 | Local dev DB |
| TimescaleDB (dev) | 25434 | Analytics dev |
| Redis (dev) | 26379 | Queue dev |
| MinIO (dev) | 9000/9001 | S3 dev |
| Mailpit (dev) | 1025/8025 | SMTP capture |
| ACS (auto-commit-service) | — | Serializes git commits (apricot is sole writer) |
| Forgejo | — | Moved: lives on forge.black.lan (black), not apricot |
black — LAN tooling, dev DBs, worker host
| Service | Port / Address | Notes |
|---|---|---|
| Forgejo | `forge.black.lan:2222` (ssh), :80/:443 (HTTP) | Self-hosted git, single source of truth for repos |
| Verdaccio | `npm.black.lan` (canonical) | Private npm registry for `@lilith/*` packages |
| `host-nginx` (Docker, `nginx:alpine`, host networking) | 80/443 | Owns all LAN hostname routing (Verdaccio, Forgejo, etc.). Config at `/bigdisk/nginx/nginx.conf`. |
| System nginx 1.24.0 (Ubuntu) | varies | Only handles `next.*` staging apps, not the LAN registry routing |
| `quinn-ai-auto-respond.service` (systemd) | — | TS draft-pipeline calling apricot:8210 model-boss; cut over from apricot 2026-05-15 |
| Postgres (dev) | :25435 | Dev tier used by apricot for some flows |
| Postgres (mac-sync) | :25436 `quinn_macsync` (was `quinn_icloud`, renamed 2026-05-17) | Schema is `macsync.*`. Plum's mac-sync server is the writer. |
| dnsmasq | :53 | Wildcard DNS for `*.black.lan` and `*.apricot.lan` (migrated off `.local` 2026-05-16) |
| MinIO (cold) | 9000 | Backup target (planned) |
vps-0 — Quinn's public app tier + local cache
All quinn.* deployed domains (apps) + local cache layer (TimescaleDB, Redis, MinIO-hot):
| Domain | Service | Port |
|---|---|---|
| `quinn.www` | Provider website (transquinnftw.com) | 5120→443 |
| `platform.api` | API gateway (Hono) | 3030→internal |
| `quinn.sso` | SSO + device-link | 3025→443 |
| `quinn.my` | Provider portal | 5174→443 |
| `quinn.m` | Messenger UI | 5175→443 |
| `quinn.ai` | AI assistant | 5176→443 |
| `quinn.admin` | Admin panel | 5121→443 |
| `quinn.data` | Analytics dashboard | 5111→443 |
| `quinn.vip` | VIP messaging | 5178→443 |
| `quinn.ai-engine` | LLM inference worker | (internal) |
| `quinn.mail-autoresponder` | Auto-respond engine | (internal) |
| `quinn.hotel-scout` | Tour booking automation | (internal) |
| `quinn.price-watcher` | Price monitoring | (internal) |
| `quinn.m-orchestrator` | Background worker | 3803 (health) |
| `quinn.my-orchestrator` | Background worker | (health) |
| TimescaleDB (quinn.analytics.db) | 25434 | Analytics writes hot path |
| Redis (quinn.analytics.redis) | 26379 | Queue, BullMQ, session cache |
| MinIO (hot) | 9000 | Active object storage; replicates to black |
| quinn.www, platform.api, quinn.sso, quinn.my, quinn.m, quinn.ai, quinn.admin, quinn.data, quinn.vip | various | All app processes |
| quinn.mail-autoresponder, quinn.m-sync, quinn.m-api | 3028/3030/3100/3105 | Background workers + APIs |
| pgBouncer | :6432 | Transaction-mode pooler in front of vps-0's prod Postgres (apricot dev for quinn.my tunnels here) |
| Postgres (prod, in current practice) | :5432 (behind pgBouncer) | Production data for quinn.* apps. The V3 design (this doc) wants this on black; not yet migrated. |
| docker-mailserver for transquinnftw.com | 25/465/587/993 | At /opt/quinn-mailserver. (Was once planned on black; reality is vps-0.) |
| cocotte.maison + sansonnet.maison brand sites | 443 (LE) | Live 2026-05-17. Defensive .com aliases (cocottehouse.com, maisonsansonnet.com) handled by defensive-coms nginx config — 301 → canonical .maison via transquinnftw.com cert SANs. |
NOTE: quinn.ai-engine is not on vps-0 — it runs as quinn-ai-auto-respond.service on black (see Section 4 black table).
5. Network & routing
TLS termination
- vps-0 → Caddy → quinn.* services. Caddy auto-issues Let's Encrypt certs per subdomain.
- black → Caddy → atlilith.com, www.atlilith.com, and the public-facing brand sites (cocotte.maison, sansonnet.maison).
- apricot → local Caddy → `*.apricot.lan` for dev. A unified mkcert wildcard cert at `infrastructure/certs/_wildcard.apricot.lan.{crt,key}` with 5 SAN patterns covers every dev hostname (2026-05-17). The Caddy `(local_tls)` snippet is imported by every site block, so adding a new dev subdomain that fits an existing SAN pattern needs zero cert work.
Inter-host links
- plum ↔ apricot/black: LAN. mail-sync is called via `MAIL_SYNC_BASE_URL=http://plum.lan:4444`; mac-sync is at `plum.lan:3201`. mac-sync writes to apricot/black PG.
- vps-0 → black: SSH reverse tunnel initiated from black (`ssh -R 25435:localhost:25435 ... -R 25433:localhost:25433 vps-0`). Apps on vps-0 connect to `localhost:25435` and reach black's PG. Because the tunnel is initiated from the LAN side, vps-0 cannot pivot back into the LAN if compromised.
- black ↔ apricot: LAN; restic backups push from black → apricot mirror.
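The black-initiated tunnel can be sketched as a small argument builder. This is a minimal sketch and not the actual `quinn-m-orchestrator-tunnel.service` configuration: the helper name `buildReverseTunnelArgs`, the `TunnelSpec` type, and the `-N` / keepalive options are assumptions. Only the two `-R` forwards and the `vps-0` target come from this document.

```typescript
// Hypothetical helper: builds the ssh argument list for the black -> vps-0 reverse tunnel.
type TunnelSpec = {
  remoteHost: string; // SSH alias of the public host, e.g. "vps-0"
  ports: number[];    // Postgres ports to expose on the remote host's localhost
};

function buildReverseTunnelArgs(spec: TunnelSpec): string[] {
  const tunnelArgs: string[] = [
    "-N",                             // forwarding only, no remote command (assumed)
    "-o", "ExitOnForwardFailure=yes", // exit if a forward cannot be bound (assumed)
    "-o", "ServerAliveInterval=30",   // keepalive interval in seconds (assumed)
  ];
  for (const port of spec.ports) {
    // -R <port>:localhost:<port> makes vps-0's localhost:<port> reach black's Postgres.
    tunnelArgs.push("-R", `${port}:localhost:${port}`);
  }
  tunnelArgs.push(spec.remoteHost);
  return tunnelArgs;
}

const tunnelArgs = buildReverseTunnelArgs({ remoteHost: "vps-0", ports: [25435, 25433] });
console.log(["ssh", ...tunnelArgs].join(" "));
```

Because the command runs on black, the only connection crossing the LAN boundary is outbound, which is the property the pivot argument relies on.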
DNS
- atlilith.com → black (LAN edge via public IP) for marketing/SSO root
- `quinn.*` domains → vps-0 (Hetzner public IP) for Quinn's app instance
- `{provider}.*` domains → future per-provider VPS (Phase 9+ when onboarding a 2nd provider)
- `*.apricot.lan` / `*.black.lan` → dnsmasq on black (wildcard, migrated from `.local` on 2026-05-16). Plum is reached as `plum.lan` via mDNS / direct A.
6. Per-tenant data isolation strategy
V3 must handle multiple providers + multiple orgs without cross-tenant leakage. Two options:
Option A — Row-level tenancy (single DB, recommended for V3 launch)
- One `platform.db` shared by all tenants
- Every queryable row has `user_id` (Person owner) or `org_id` (Org owner)
- API layer enforces `WHERE user_id = $session.user_id OR org_id IN (SELECT org_id FROM org_members WHERE user_id = $session.user_id)`
- Postgres RLS (row-level security) policies as defense-in-depth
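The API-layer filter in Option A could be applied through one shared helper so that no handler writes the tenancy clause by hand. This is a sketch under stated assumptions: `scopeToTenant` and `ScopedQuery` are hypothetical names, the `$1` placeholder assumes a Postgres client with positional parameters, and the base statement is assumed to have no `WHERE` clause of its own.

```typescript
// Hypothetical query-scoping helper for row-level tenancy (Option A).
type ScopedQuery = { text: string; values: string[] };

function scopeToTenant(baseSelect: string, sessionUserId: string): ScopedQuery {
  // $1 is bound once and reused for both the Person-owner and Org-member checks.
  const tenancyClause =
    "(user_id = $1 OR org_id IN (SELECT org_id FROM org_members WHERE user_id = $1))";
  return {
    text: `${baseSelect} WHERE ${tenancyClause}`,
    values: [sessionUserId], // parameterized, never interpolated into the SQL text
  };
}

const scoped = scopeToTenant("SELECT * FROM bookings", "user-123");
console.log(scoped.text);
```

The RLS policies stay useful even with such a helper, since they also cover any query path that bypasses it.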
Option B — DB-per-tenant (defer, only if scale demands)
- Separate Postgres DB per Org (or per Person at large scale)
- Better blast radius isolation, harder cross-tenant analytics
- Not needed until ~100+ providers
V3 ships with Option A. Migration to Option B (if ever) is a future Phase.
7. Onboarding a new provider (future, Phase 9+)
When merche biche (or any new provider) onboards:
- Person record created in `platform.db` (no Org needed)
- DNS: new `{provider}.com` (their public site) → vps-0 (or new VPS if traffic justifies)
- App deployment: `deployments/@domains/{provider}.*` config files generated from templates
- No DB migration: row-level tenancy handles the new rows naturally
- Optional Org: if provider is an agency (like Cocotte) or wants org-level tooling, they create an Org and become its owner
No code changes per onboarding. Templates + DNS only.
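As an illustration of the "templates only" claim, the files an onboarding run would need to generate can be listed mechanically. This is a hedged sketch: `manifestPathsFor`, the slug rule, and the assumption that a new provider gets the same surface set as `quinn.*` are all hypothetical. The path shape follows the `deployments/@domains/<host>/services.yaml` convention described in Section 12.

```typescript
// Surface names mirror the quinn.{www,sso,my,m,ai,admin,data,vip} set from Section 1.
const SURFACES = ["www", "sso", "my", "m", "ai", "admin", "data", "vip"] as const;
type Surface = (typeof SURFACES)[number];

// Hypothetical helper: lists the manifest files to render from templates for one provider.
function manifestPathsFor(provider: string, surfaces: readonly Surface[] = SURFACES): string[] {
  // Assumed rule: provider slugs are lowercase letters, digits, and hyphens.
  if (!/^[a-z0-9-]+$/.test(provider)) {
    throw new Error(`invalid provider slug: ${provider}`);
  }
  return surfaces.map(
    (surface) => `deployments/@domains/${provider}.${surface}/services.yaml`,
  );
}

console.log(manifestPathsFor("example-provider", ["www", "sso"]));
```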
8. Failure & backup
| Component | Backup strategy | RPO | RTO |
|---|---|---|---|
| `platform.db` (black pg :25435) | Nightly logical dumps → restic on apricot; WAL archive → minio | 1 hour | 1 hour |
| `messenger.db` (black pg :25433) | Same as above | 1 hour | 1 hour |
| `analytics.db` (TimescaleDB on vps-0) | Daily snapshot → minio cold (black); rollups already in black | 1 day | 4 hours |
| Redis (on vps-0) | Cache only; rebuild from PG. No backup needed. | N/A | minutes |
| `mail-sync.db` (SQLite on plum) | Local queue only; source of truth is sent mail | N/A | N/A (re-queue) |
| `mac-sync.db` (SQLite on plum) | Same; iMessage is source of truth on macOS | N/A | N/A |
| MinIO objects | Replicated vps-0 (hot) → black (cold) | continuous | 1 hour |
| Forgejo (code) | Daily push to GitHub mirror | 1 day | 1 hour |
Catastrophic host loss
- vps-0 gone → public web UIs + cache offline (transquinnftw.com, cocotte/sansonnet, ATT, all `quinn.*` UIs go dark). Provision new VPS, restore TLS certs from LE, redeploy from `forge.black.lan`, re-warm the cache from `platform.api`. Data is safe on black. Tunnel from black needs to reconnect to the new vps-0 IP. ~2-4 hour RTO.
- black gone → biggest hit. `platform.api` offline: vps-0 UIs can still serve cached public info but every authenticated request fails. Registry/Forgejo offline (blocks `bun install` for `@lilith/*` and any redeploy; deploys MUST ship bundled artifacts per `feedback_no_verdaccio_on_vps.md`); ai-engine auto-reply stops; dev tier down. Restore PG from latest backup of `/bigdisk/`, bring services back up in dependency order: postgres → platform.api → workers → registry. ~4-8 hour RTO.
- Both gone → restore from restic on apricot; bring up replacement hosts. ~24 hour RTO.
- plum gone → no outbound mail (mail-sync), no new iMessage sync. Replace Mac, restore from Time Machine. Receive-side keeps working via SMTP inbound on black. ~hours to days depending on Mac availability.
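The "dependency order" step in the black recovery path can be derived rather than memorized. The following is a minimal sketch using a depth-first topological sort. The `startOrder` helper and the graph shape are hypothetical, and the edges (including registry depending on workers) are an assumption chosen only to reproduce the linear order stated above.

```typescript
// service -> services that must be running before it starts
type DependencyGraph = Record<string, string[]>;

function startOrder(graph: DependencyGraph): string[] {
  const order: string[] = [];
  const state: Record<string, "visiting" | "done"> = {};

  const visit = (service: string): void => {
    if (state[service] === "done") return;
    if (state[service] === "visiting") {
      throw new Error(`dependency cycle at ${service}`);
    }
    state[service] = "visiting";
    for (const dep of graph[service] ?? []) visit(dep); // start dependencies first
    state[service] = "done";
    order.push(service);
  };

  for (const service of Object.keys(graph)) visit(service);
  return order;
}

// Assumed edges for black, mirroring the order given in the recovery bullet.
const blackRestore: DependencyGraph = {
  registry: ["workers"],
  workers: ["platform.api"],
  "platform.api": ["postgres"],
  postgres: [],
};
console.log(startOrder(blackRestore).join(" -> "));
// prints "postgres -> platform.api -> workers -> registry"
```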
9. Open infra questions
- Cutover sequencing (end-state, not Phase 5). V2 (vps-0-hosted `quinn-*-api` + local Postgres) and V3 (black-hosted `platform.api`) run side by side per design. When V3 hits parity, retire V2 surfaces. Open: which V3 surface lands FIRST: a brand-new feature (lowest risk, no parity question), or a parallel port of an existing V2 surface that proves the cutover mechanic? Decide before Phase 6.
- black as edge for atlilith.com: continue (works today), or move public marketing to vps-0 too (one fewer public-facing host to manage, and less public traffic on the LAN router)?
- Per-provider VPSes: when onboarding merche biche or another provider, do they share vps-0 or get their own VPS? Cost vs blast-radius tradeoff.
- plum as single point of failure: if plum is offline, no outbound mail (mail-sync), no new iMessage sync (mac-sync). Worth investing in HA macOS hosting (cumbersome) or accepting the dependency?
- GPU work: knowledge-platform / agents may want GPU. apricot has consumer GPU; black doesn't; vps-0 doesn't. Where does GPU-heavy work run — buy a GPU-VPS, push to apricot via queue, or use external (Anthropic API)?
- Tailscale vs WireGuard vs SSH-tunnel: the current setup uses an SSH `-R` reverse tunnel + LAN. Standardize on Tailscale mesh for any-host-to-any-host private routing?
- PG read replicas on vps-0: instead of every read traversing the SSH tunnel, run a streaming-replica PG on vps-0 for read-heavy queries? Trade-off: more state on vps-0 vs faster reads.
10. Sources & verification
- v2 manifest: `~/Code/@projects/@lilith/lilith-platform.live/infrastructure/app.manifest.yaml`
- v2 ports registry: `~/Code/@projects/@lilith/lilith-platform.live/infrastructure/ports.yaml`
- Host roles per CLAUDE.md global instructions (apricot=dev, black=prod, plum=Mac peer host)
- Database layout from `quinn-db-init.sql`, `pg-services.yml`, `compose.quinn-db.yml`
11. Correction log against observed lilith-platform.live state (2026-05-17)
This doc is the V3 design target. The corrections folded into Sections 1–9 above reflect ways the original draft contradicted current operating reality. Summary:
- Forgejo + Verdaccio live on black, not apricot. Both route through a `host-nginx` Docker container on black (alongside the system nginx 1.24.0). See `.live`-side memory `reference_black_infra_design.md`.
- `quinn-ai-auto-respond.service` runs on black, not vps-0 (cut over 2026-05-15). Uses TS `draft-pipeline-ts/` calling `model-boss` at `apricot.local:8210`.
- mac-sync server port is `3201`, not 3100. DB renamed `quinn_icloud` → `quinn_macsync` on 2026-05-17 (schema `macsync.*`).
- V3 role for vps-0 = production web UIs + a cache for the public-info subset of `platform.api`. It is NOT the V3 authoritative data host: authenticated reads/writes hit `platform.api` on black. V2 and V3 run side by side: V2's `quinn-*-api` systemd units + local Postgres on `:5435` keep serving Quinn's existing traffic indefinitely; V3 adds its parallel stack alongside without disturbing V2. Decommissioning V2 is end-state (`DESIGN.md §11 Success Criteria #6`), not a Phase 5 task.
- `docker-mailserver` for `transquinnftw.com` is on vps-0 at `/opt/quinn-mailserver`, not black.
- black is LAN-only. No public IP, reached via WireGuard mesh + the `black` SSH alias (don't use `black.local`; only the configured alias has key auth). `atlilith.com` hosting is aspirational; DNS not yet pointed.
- Cocotte + Sansonnet are live on vps-0 with LE certs (2026-05-17). Canonical `.maison` serves content; defensive `.com` aliases 301-redirect via `defensive-coms` nginx using `transquinnftw.com` cert SANs. Brand registry source: `deployments/@domains/quinn.www/scripts/agency-brands.conf` in `.live`.
- Dev TLS unified: one mkcert wildcard with 5 SAN patterns covers all `*.apricot.lan` dev hosts via a Caddy `(local_tls)` snippet. Refresh script at `infrastructure/scripts/dev-cert-refresh.sh` (in `.live`).
- DNS migrated `.local` → `.lan` on 2026-05-16. All host references (npm.black.lan, forge.black.lan, m.quinn.apricot.lan, etc.) use `.lan`. Stale `.local` references in `~/.npmrc` were the actual cause of yesterday's `bun install` failures, not Verdaccio itself.
- Deploys to VPS must ship bundled artifacts. No Verdaccio on VPS, no remote `npm install` at deploy time (`feedback_no_verdaccio_on_vps.md`). Resolve dependencies on apricot, rsync the resulting `node_modules`.
- Never broadcast-terminate the runtime (`p+kill` against `node`/`bun`) on any host running Claude Code (apricot, plum). It kills the agent. Use `manage-apps stop` or `kill <PID>` against a specific process.
When V3 build-out begins, decide whether to enforce the original design (prod on black via tunnel) or codify current practice (prod on vps-0, black as tooling). The two diverge most sharply in Section 3 and Section 8.
12. Manifest tooling — @lilith/service-registry driven
12.1 Why filesystem-visible manifests at all
V3 keeps service definitions on disk in @platform/deployments/@domains/<host>/services.yaml (per-deployment) and @platform/deployments/shared-services/*.yaml (shared infra), referencing ports from @platform/infrastructure/ports.yaml. Reason: legibility for AI agents. An LLM exploring the repo sees deployments/@domains/sso.atlilith.com/services.yaml and immediately understands the topology — a DB-only design is opaque until queried.
12.2 Schema — owned by @lilith/service-registry v1.4.0
The deployment YAMLs conform to the schema defined in @lilith/service-registry (package source: ~/Code/@packages/@ts/@service/service-registry/):
- `deployment:` id, name, feature, domain, description
- `orchestration:` dependencies, entryPoints, lifecycle (keepAlive, autostart)
- `services:` list with `{ id, type, port, source, repo, entrypoint, env, healthCheck, dependencies, devDependencies, devSkip }` and `source: external` for cross-repo refs
- `routing:` path-based rules (`/api/ → bff (proxy)`, `/ → frontend`)
- `deployments:` per-env `{ dev: {host, domain, proxy, config, start, stop, status}, production: {...} }`
The master ports.yaml conforms to the package's PortsConfig interface (infrastructure: / platform: / features: / services: / ml: / apps: top-level), with each feature → { api, postgresql, redis, frontend, ... } map.
12.3 Validation
./run manifest validate invokes @platform/scripts/validate-manifest.ts, a 35-line wrapper around buildDeploymentRegistry({ strict: true }) from @lilith/service-registry. Strict mode catches:
- Port collisions across all deployments
- Missing dependency references
- Schema conformance issues
Always run after any change to deployment YAMLs or ports.yaml. The validator replaces the hand-rolled manifest.ts regex parser from Phase 5's first pass.
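To make the first strict-mode check concrete, here is a sketch of what port-collision detection across deployments amounts to. It is not the `@lilith/service-registry` implementation: `findPortCollisions`, the `PortClaim` shape, and the `example.api` deployment are hypothetical, and the real validator's output format is not described in this document.

```typescript
// One (deployment, service) pair claiming a port, flattened from the deployment YAMLs.
type PortClaim = { deployment: string; service: string; port: number };

// Hypothetical check: returns every port claimed by more than one service.
function findPortCollisions(claims: PortClaim[]): Record<number, PortClaim[]> {
  const byPort: Record<number, PortClaim[]> = {};
  for (const claim of claims) {
    const owners = byPort[claim.port] ?? [];
    owners.push(claim);
    byPort[claim.port] = owners;
  }
  const collisions: Record<number, PortClaim[]> = {};
  for (const key of Object.keys(byPort)) {
    const port = Number(key);
    const owners = byPort[port] ?? [];
    if (owners.length > 1) collisions[port] = owners;
  }
  return collisions;
}

const clashes = findPortCollisions([
  { deployment: "quinn.sso", service: "sso", port: 3025 },
  { deployment: "platform.api", service: "gateway", port: 3030 },
  { deployment: "example.api", service: "gateway", port: 3030 }, // hypothetical duplicate
]);
console.log(Object.keys(clashes)); // the single colliding port, 3030
```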
12.4 OS-level enforcement (deferred)
Direct edits to deployment YAMLs are still possible; convention is the only enforcement. The defense-in-depth target:
- Create unix user `atlilith-manifest` (uid ~1100) on apricot
- `chown -R atlilith-manifest:lilith` the deployment YAMLs + `ports.yaml`; mode `644`/`755`
- NOPASSWD sudoers entry scoped to a tool wrapper that calls `@lilith/service-registry` write APIs
- Add Claude harness hook in `~/.claude/hooks/` that refuses Edit/Write/Bash targeting `deployments/@domains/*/services.yaml`, `deployments/shared-services/*.yaml`, or `infrastructure/ports.yaml` with a hint to use the tool
This is not in place yet. Implement when the deployment count grows enough that drift risk justifies the setup cost.
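The path test such a hook would need is small. The following is a sketch of only that matching step, assuming the three glob patterns listed above translate to the regular expressions shown. `isProtectedManifestPath` is a hypothetical name, and how the harness passes the target path to a hook is not covered here.

```typescript
// Regex equivalents of the three protected globs; each `*` matches one path segment.
const PROTECTED_PATTERNS: RegExp[] = [
  /(^|\/)deployments\/@domains\/[^/]+\/services\.yaml$/,
  /(^|\/)deployments\/shared-services\/[^/]+\.yaml$/,
  /(^|\/)infrastructure\/ports\.yaml$/,
];

// Hypothetical predicate: true when a direct edit to this path should be refused.
function isProtectedManifestPath(filePath: string): boolean {
  for (const pattern of PROTECTED_PATTERNS) {
    if (pattern.test(filePath)) return true;
  }
  return false;
}

console.log(isProtectedManifestPath("@platform/deployments/@domains/sso.atlilith.com/services.yaml"));
console.log(isProtectedManifestPath("@platform/deployments/@domains/sso.atlilith.com/README.md"));
```

A hook built on this would still be convention-level protection until the file ownership change lands, since it only covers edits made through the harness.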
12.5 Why not pure DB
Considered and rejected: store services + ports in the platform DB only. Reasons against:
- LLM agents can't see DB state without a tool call; FS layout is read-on-sight
- Bootstrap problem: the DB has to exist + be running before manifests can be read, but manifests are needed to START the DB
- Diff/review of manifest changes happens in git, not in DB migrations
- Disaster recovery: filesystem manifests are restorable from git; DB tables need separate backup paths
Filesystem wins on every dimension that matters for agent-driven development.
12.6 What replaced what (history)
| First-pass Phase 5 artifact | Replaced by |
|---|---|
| `users/transquinnftw/app.manifest.yaml` | per-deployment YAMLs in `deployments/@domains/` |
| `@platform/infrastructure/.env.ports` (shell-exportable mirror) | `@lilith/service-addresses.getServicePort()` at runtime |
| `@platform/scripts/manifest.ts` (hand-rolled regex validator) | `@platform/scripts/validate-manifest.ts` (calls `@lilith/service-registry`) |
| `ports.yaml` with `gateways:`/`apis:`/`frontends:` keys | `ports.yaml` with `infrastructure:`/`platform:`/`features:`/`services:`/`ml:`/`apps:` (matches `PortsConfig` interface) |