diff --git a/docs/MCP_SERVICES.md b/docs/MCP_SERVICES.md
index c943feb7..f001a2fa 100644
--- a/docs/MCP_SERVICES.md
+++ b/docs/MCP_SERVICES.md
@@ -36,21 +36,29 @@ independent — it does not deploy to the gateway host. `experts` is a local
 
 ---
 
-## 2. Where they run — one backend droplet on the wg1 mesh
+## 2. Where they run — dedicated utils droplet on the wg1 mesh (plus co-location option)
 
-The gateways are **one tenant of the DO backend node**, not a dedicated box. That
-node — `lime` (wg `10.9.0.5`, public `209.38.51.98` reached via ProxyJump yuzu /
-mesh; ssh alias `lilith-store-backend`) — is the general-purpose private backend:
-it also runs quinn.api INTERNAL, has VPC access to DO Managed PG, and is the home
-for LISTEN/NOTIFY workers, the autoresponder/assistant-worker, mac-sync consumers,
-and scheduled routines. MCP is just one workload on it. Its IaC lives in
-**uvlava** (`~/Code/@projects/uvlava/terraform/do/`, the shared infranet repo —
-not this product tree); it joins the mesh via `infrastructure/phase-b-mesh-join.sh`.
+**2026-06-28 update**: per operator request, MCP gateways and "other stuff" (workers,
+scheduled jobs, etc.) now run on a **dedicated `lilith-utils` droplet** (not
+co-located on the main api store backend `lilith-store-backend` / lime). This
+isolates the mail attack surface on its own droplet and keeps the primary api
+node lean.
 
-The gateways run **co-located with quinn.api INTERNAL** on that node,
-**joined to the existing `wg1` WireGuard mesh**. This replicates the homelan model
-(services bind to private mesh IPs, nothing private exposed publicly) with the dead
-`black` slot taken over by the droplet.
+- `lilith-utils` joins wg1 (mesh IP assigned via net-tools, e.g. 10.9.0.7).
+- IaC: added in `infrastructure/terraform/do/lilith-utils-mail.tf` (specific tier;
+  core store in uvlava).
+- Provisioning: `infrastructure/phase-d-provision-utils-and-mail.sh` (after TF
+  apply or manual creation).
+- Gateways bind to the mesh IP on the utils droplet. Consumers on the mesh (plum
+  Claude, coworker-agent, other workers on other droplets) reach them over wg1
+  (loopback if anything is co-located on utils later). Small extra hop vs. pure
+  co-location on the api node, but justified for isolation and "other stuff".
+
+(The old co-located design on the api backend is still valid for minimal-hop
+environments; the utils droplet is the current production target.)
+
+The utils droplet (and the sibling `lilith-mail` droplet) are provisioned via the
+phase-d script + TF. They join the mesh the same way the store backend did.
 
 ```
 wg1 mesh (10.9.0.0/24, hub = yuzu/vps-0 :51820)
diff --git a/infrastructure/phase-d-provision-utils-and-mail.sh b/infrastructure/phase-d-provision-utils-and-mail.sh
new file mode 100644
index 00000000..839f5fca
--- /dev/null
+++ b/infrastructure/phase-d-provision-utils-and-mail.sh
@@ -0,0 +1,197 @@
+#!/usr/bin/env bash
+#
+# phase-d-provision-utils-and-mail.sh
+#
+# Provisions the lilith-mail and lilith-utils DO droplets after terraform apply
+# (or manual droplet creation in DO console / uvlava TF).
+#
+# Assumptions:
+#   - Droplets created with names "lilith-mail" and "lilith-utils" (or pass IPs).
+#   - Initial root SSH access from plum (via direct public IP or ProxyJump quinn-vps).
+#   - WG mesh join done separately via net-tools + phase-b-mesh-join.sh (assign 10.9.0.6 mail, 10.9.0.7 utils or next).
+#   - This script sets up docker, copies service configs, starts the workloads.
+#   - Run from plum as the operator.
+#
+# Usage:
+#   # after TF apply (or manual creation), note the public IPs
+#   MAIL_IP=xxx.xxx.xxx.xxx UTILS_IP=yyy.yyy.yyy.yyy ./infrastructure/phase-d-provision-utils-and-mail.sh
+#
+# Or with ssh alias already set (lilith-mail, lilith-utils):
+#   ./infrastructure/phase-d-provision-utils-and-mail.sh
+#
+# After:
+#   - Update DNS A for mail.* domains -> lilith-mail public IP.
+#   - Add to users/transquinnftw/app.manifest.yaml (see example at bottom).
+#   - Register wg IPs in net-tools/data/mesh-hosts.json + run renders.
+#   - For mail: run mail-setup.sh with the required *_SMTP_PASS envs.
+#   - For utils: enable/start the quinn-mcp@* services, update .mcp.json on consumers to use mesh IPs.
+#   - Repoint any remaining black references.
+#
+# Safety: idempotent where possible; backs up remote configs.
+
+set -euo pipefail
+
+REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+MAIL_COMPOSE_SRC="$REPO_ROOT/deployments/@domains/quinn.www/docker/compose.mail.yml"
+MAIL_SETUP_SRC="$REPO_ROOT/deployments/@domains/quinn.www/scripts/mail-setup.sh"
+MAIL_HOSTS_NGINX_SRC="$REPO_ROOT/deployments/@domains/quinn.www/nginx/mail-hosts.conf" # may need per-droplet ACME if not using DNS-01
+
+# Resolve targets (prefer ssh aliases, fall back to provided IPs)
+MAIL_TARGET="${MAIL_TARGET:-lilith-mail}"
+UTILS_TARGET="${UTILS_TARGET:-lilith-utils}"
+
+if [ -n "${MAIL_IP:-}" ]; then
+  MAIL_TARGET="root@${MAIL_IP}"
+fi
+if [ -n "${UTILS_IP:-}" ]; then
+  UTILS_TARGET="root@${UTILS_IP}"
+fi
+
+echo ">>> Targeting:"
+echo " mail: $MAIL_TARGET"
+echo " utils: $UTILS_TARGET"
+
+# --- Common remote setup (docker already in user_data, but ensure + ufw mesh allowances) ---
+
+setup_base() {
+  local target="$1"
+  echo ">>> Base setup on $target"
+  ssh -o ConnectTimeout=30 "$target" 'bash -s' <<'REMOTE'
+set -euo pipefail
+apt-get update -y
+apt-get install -y docker.io docker-compose-plugin wireguard ufw jq
+systemctl enable --now docker
+
+# Ensure ufw allows mesh (10.9.0.0/24) + ssh. Mail will have 25 public already from user_data.
+ufw allow from 10.9.0.0/24
+ufw allow 22/tcp
+ufw --force enable || true
+
+mkdir -p /etc/quinn /opt/quinn-mail /data/mail
+
+echo "base ok"
+REMOTE
+}
+
+setup_mail() {
+  local target="$1"
+  echo ">>> Mail droplet setup on $target (docker-mailserver + compose)"
+
+  # Copy compose (we will sed it on remote for any droplet-specifics like hostname or volume paths)
+  scp -o ConnectTimeout=30 "$MAIL_COMPOSE_SRC" "$target:/opt/quinn-mail/compose.mail.yml"
+  scp -o ConnectTimeout=30 "$MAIL_SETUP_SRC" "$target:/opt/quinn-mail/mail-setup.sh"
+  ssh -o ConnectTimeout=30 "$target" 'chmod +x /opt/quinn-mail/mail-setup.sh' # make the copied script executable on the remote
+
+  ssh -o ConnectTimeout=30 "$target" 'bash -s' <<'REMOTE'
+set -euo pipefail
+
+cd /opt/quinn-mail
+
+# Adjust compose for dedicated droplet:
+# - Allow submission ports on all interfaces (or wg0 once mesh is up) so apps on other mesh hosts can submit.
+# - Volume paths already /var/... inside container; host bind if needed for backup.
+# - Hostname remains mail.transquinnftw.com
+
+sed -i 's/127.0.0.1:${MAIL_SMTP_PORT:-587}:587/${MAIL_SMTP_PORT:-587}:587/' compose.mail.yml || true
+sed -i 's/127.0.0.1:993:993/993:993/' compose.mail.yml || true
+
+# Start it (will pull image, create volumes)
+docker compose -f compose.mail.yml up -d
+
+# Basic health wait
+for i in {1..30}; do
+  if docker exec quinn-mailserver ss -lntp | grep -q ':587'; then
+    echo "mailserver listening on 587"
+    break
+  fi
+  sleep 2
+done
+
+echo "Mail container up. Run mail-setup.sh (with envs) next for accounts/DKIM."
+echo " Example (from plum or on droplet):"
+echo " CONTACT_SMTP_PASS=... BOOKING_SMTP_PASS=... NOREPLY_SMTP_PASS=... ./mail-setup.sh"
+REMOTE
+
+  # Copy nginx mail-hosts if the mail droplet will answer its own ACME (optional; often keep on edge for simplicity and use DNS-01 for mail certs).
+  # For now, just note; the mail droplet can have its own light nginx or standalone certbot if MX points here.
+  echo " (If using HTTP-01 on this droplet for mail.* certs, copy mail-hosts.conf and enable a port-80 server here.)"
+}
+
+setup_utils() {
+  local target="$1"
+  echo ">>> Utils droplet setup on $target (MCP runner base + systemd template)"
+
+  ssh -o ConnectTimeout=30 "$target" 'bash -s' <<'REMOTE'
+set -euo pipefail
+
+# Base dirs for MCPs and other jobs
+mkdir -p /etc/quinn-mcp /opt/quinn-utils /var/log/quinn-mcp
+
+# Systemd template for MCP gateways (quinn-mcp@my.service, quinn-mcp@admin.service, etc.). The image name below is a placeholder (or a local build); in practice we build the specific mcp-server and run the bun/node entry directly or via a small wrapper image. The note lives here because systemd does not support trailing comments on ExecStart lines.
+cat > /etc/systemd/system/quinn-mcp@.service <<'UNIT'
+[Unit]
+Description=Quinn MCP Gateway %i
+After=network.target docker.service
+Requires=docker.service
+
+[Service]
+Type=simple
+EnvironmentFile=/etc/quinn-mcp/%i.env
+ExecStart=/usr/bin/docker run --rm --name quinn-mcp-%i \
+  -p ${MCP_PORT}:${MCP_PORT} \
+  -e MCP_AUTH_TOKEN=${MCP_AUTH_TOKEN} \
+  -e QUINN_API_URL=${QUINN_API_URL} \
+  -e QUINN_API_SERVICE_TOKEN=${QUINN_API_SERVICE_TOKEN} \
+  -e NODE_ENV=production \
+  --log-driver=journald \
+  ghcr.io/your-org/quinn-mcp-%i:latest
+Restart=always
+RestartSec=5
+StandardOutput=journal
+StandardError=journal
+
+[Install]
+WantedBy=multi-user.target
+UNIT
+
+# Example per-MCP env skeleton (operator fills real tokens/URLs after mesh is up)
+for name in my admin prospector messenger analytics; do
+  cat > "/etc/quinn-mcp/${name}.env.example" <>> Provisioning complete for mail + utils."
+echo "Next manual steps (operator):"
+echo "1. DNS: A records mail.* -> lilith-mail public IP (from TF output or 'ssh lilith-mail curl -s ifconfig.me')."
+echo "2. Mesh: run net-tools render + phase-b-mesh-join.sh (or equivalent) for the new hosts; assign wg IPs."
+echo "3. App manifest: add sections for lilith-mail and lilith-utils in users/transquinnftw/app.manifest.yaml (quinn.mail on mail host; mcp-* on utils host)."
+echo "4. Mail accounts: scp or ssh + run mail-setup.sh on lilith-mail with the *_SMTP_PASS envs."
+echo "5. MCPs: fill /etc/quinn-mcp/*.env on lilith-utils with real tokens + correct mesh URLs; start the units."
+echo "6. Consumers: update .mcp.json and coworker-agent tmpls to point at the utils mesh IP:391x (http, no TLS over private mesh)."
+echo "7. Test: from plum over mesh, call a tool; from mail droplet, ss -lntp | grep -E ':(25|587) '; send a test mail."
+echo
+echo "See MCP_SERVICES.md (updated for separate utils droplet) and the handoff context for full wiring."
+echo "Black references should be gone from these paths."
diff --git a/infrastructure/terraform/README.md b/infrastructure/terraform/README.md
index 7e1c464a..367d03e7 100644
--- a/infrastructure/terraform/README.md
+++ b/infrastructure/terraform/README.md
@@ -9,3 +9,8 @@ product tree.
 - Repo: `quinn/uvlava` on the forge (`ssh://git@134.199.243.61:2222/quinn/uvlava.git`)
 - Local: `~/Code/@projects/uvlava/terraform/do/`
 - See `project_do_cloud_rebuild` memory for the full rebuild context.
+
+Additional DO resources that are lilith-specific (not shared core store-tier) live here as .tf modules:
+- `lilith-utils-mail.tf`: dedicated `lilith-mail` (docker-mailserver for inbound SMTP + DKIM etc.) and `lilith-utils` (MCP gateways + other long-running workers/MCPs). Created 2026-06-28 to isolate mail (port 25 surface) and keep the main api backend (lilith-store-backend) from running MCPs + misc jobs.
+  Apply with the usual DO_TOKEN + ssh_keys vars. After apply, run the phase-d provisioning script(s), wire WG mesh via net-tools + phase-b-mesh-join, update DNS/MX for mail.* domains, and register in app.manifest + mesh-hosts.json.
+  Main shared IaC remains in uvlava; copy relevant bits there if these become multi-tenant.
diff --git a/infrastructure/terraform/do/lilith-utils-mail.tf b/infrastructure/terraform/do/lilith-utils-mail.tf
new file mode 100644
index 00000000..bf290f97
--- /dev/null
+++ b/infrastructure/terraform/do/lilith-utils-mail.tf
@@ -0,0 +1,166 @@
+# lilith-utils and lilith-mail droplets on DigitalOcean.
+#
+# - lilith-mail: dedicated host for quinn-mailserver (docker-mailserver) to isolate
+#   inbound SMTP (port 25) and mail processing from the api store backend.
+#   Public IP for MX A records of mail.* domains; WG mesh peer for internal.
+#
+# - lilith-utils: general-purpose "other stuff" node for long-running MCP gateways
+#   (quinn-my, quinn-admin, quinn-prospector, quinn-messenger, quinn-analytics MCPs),
+#   workers, scheduled jobs, etc. that were previously on black or co-located.
+#   Keeps the main api droplet (lilith-store-backend) lean.
+#
+# Both join the wg1 mesh (10.9.0.0/24) via phase scripts + net-tools.
+# IaC note: core store-tier (PG, main backend droplet) lives in uvlava/terraform/do.
+# This file adds the utility/mail tier specific to lilith v2 (can be synced to uvlava if shared).
+#
+# Usage:
+#   export DO_TOKEN=...
+#   terraform -chdir=infrastructure/terraform/do apply \
+#     -var="do_token=${DO_TOKEN}" \
+#     -var="ssh_keys=[\"...fingerprint...\"]"
+#
+# After apply:
+#   - Note the public IPs from outputs.
+#   - Update DNS (A for mail.* to lilith-mail public IP).
+#   - Run infrastructure/phase-d-provision-utils-and-mail.sh (or per-droplet variants).
+#   - Add to mesh-hosts.json (net-tools) with assigned 10.9.0.x wg IPs.
+#   - Update ~/.ssh/config aliases (lilith-mail, lilith-utils via ProxyJump quinn-vps).
+#   - Register in users/transquinnftw/app.manifest.yaml under the DO prod section.
+#
+# Sizing:
+#   mail: s-2vcpu-4gb (rspamd + fail2ban + some mail volume headroom)
+#   utils: s-2vcpu-4gb (MCPs are lightweight proxies; bump if adding heavy workers)
+#
+# Region consistent with other DO resources (nyc3).
+
+terraform {
+  required_providers {
+    digitalocean = {
+      source  = "digitalocean/digitalocean"
+      version = "~> 2.0"
+    }
+  }
+}
+
+variable "do_token" {
+  description = "DigitalOcean API token"
+  type        = string
+  sensitive   = true
+}
+
+variable "region" {
+  default = "nyc3"
+}
+
+variable "ssh_keys" {
+  description = "List of SSH key fingerprints or IDs for droplet access"
+  type        = list(string)
+  default     = []
+}
+
+provider "digitalocean" {
+  token = var.do_token
+}
+
+# --- lilith-mail (mailserver droplet) ---
+
+resource "digitalocean_volume" "mail_data" {
+  region                  = var.region
+  name                    = "lilith-mail-data"
+  size                    = 50
+  initial_filesystem_type = "ext4"
+  description             = "Persistent mail data for quinn-mailserver (dovecot maildirs, state, logs)"
+}
+
+resource "digitalocean_droplet" "lilith_mail" {
+  image    = "ubuntu-22-04-x64"
+  name     = "lilith-mail"
+  region   = var.region
+  size     = "s-2vcpu-4gb"
+  ssh_keys = var.ssh_keys
+
+  user_data = <<-EOF
+    #!/bin/bash
+    set -euo pipefail
+    apt-get update -y
+    apt-get install -y docker.io docker-compose-plugin ufw wireguard
+    systemctl enable --now docker
+
+    # Basic hardening (mail droplet will have port 25 public)
+    ufw default deny incoming
+    ufw default allow outgoing
+    ufw allow 22/tcp
+    ufw allow 25/tcp
+    ufw allow 80/tcp # for ACME if certbot runs here for mail.* domains
+    ufw allow 443/tcp
+    # 587/993 will be local or from mesh only; no public exposure here
+    ufw --force enable
+
+    mkdir -p /data/mail
+    # volume will be attached; in practice DO UI or additional TF attaches as /dev/disk/by-id/...
+    mount /dev/disk/by-id/scsi-0DO_Volume_lilith-mail-data /data/mail || true
+
+    echo "lilith-mail base ready. Next: docker compose for mailserver + WG mesh join."
+  EOF
+
+  tags = ["mail", "docker-mailserver", "lilith", "prod"]
+
+  volume_ids = [digitalocean_volume.mail_data.id]
+}
+
+# --- lilith-utils (MCP + workers droplet) ---
+
+resource "digitalocean_droplet" "lilith_utils" {
+  image    = "ubuntu-22-04-x64"
+  name     = "lilith-utils"
+  region   = var.region
+  size     = "s-2vcpu-4gb"
+  ssh_keys = var.ssh_keys
+
+  user_data = <<-EOF
+    #!/bin/bash
+    set -euo pipefail
+    apt-get update -y
+    apt-get install -y docker.io docker-compose-plugin ufw wireguard
+    systemctl enable --now docker
+
+    ufw default deny incoming
+    ufw default allow outgoing
+    ufw allow 22/tcp
+    # MCPs and workers bind to mesh IP only; no public ports by default
+    ufw --force enable
+
+    echo "lilith-utils base ready for MCP gateways and other long-running jobs."
+  EOF
+
+  tags = ["utils", "mcp", "workers", "lilith", "prod"]
+}
+
+# Outputs for wiring (IPs, then add to mesh-hosts, ssh config, manifest)
+
+output "lilith_mail_public_ip" {
+  value       = digitalocean_droplet.lilith_mail.ipv4_address
+  description = "Public IP for MX A records of mail.* domains; also for SSH initial access."
+}
+
+output "lilith_mail_id" {
+  value = digitalocean_droplet.lilith_mail.id
+}
+
+output "lilith_utils_public_ip" {
+  value = digitalocean_droplet.lilith_utils.ipv4_address
+}
+
+output "lilith_utils_id" {
+  value = digitalocean_droplet.lilith_utils.id
+}
+
+# Post-apply manual/scripted:
+# - Attach volumes in DO console if not auto (or enhance TF with remote-exec or cloud-init mount).
+# - ssh root@ (initial, before WG).
+# - Run phase provisioning script from plum (ProxyJump through quinn-vps/yuzu if needed).
+# - For mail: set A records mail.* -> lilith-mail public IP at registrar (Joker).
+# - For both: add WG peer config (private key generated on droplet, public to hub yuzu).
+# - Register in net-tools mesh-hosts.json with wg IPs (assign 10.9.0.6 for mail, 10.9.0.7 for utils or next free).
+# - Update app.manifest.yaml with the hosts.
+# - Deploy mail compose to lilith-mail, MCP units to lilith-utils.
\ No newline at end of file
diff --git a/users/transquinnftw/app.manifest.yaml b/users/transquinnftw/app.manifest.yaml
index 0ce4fa22..ec1ed4e5 100644
--- a/users/transquinnftw/app.manifest.yaml
+++ b/users/transquinnftw/app.manifest.yaml
@@ -607,3 +607,56 @@ platforms:
     deploy:
       quinn.hotel-scout:
         script: deployments/@domains/quinn.hotel-scout/deploy.sh
+
+  # ─── DO utility droplets (post-black, 2026-06-28) ─────────────────────────
+  # lilith-mail: dedicated docker-mailserver (port 25 inbound for brand mail,
+  # 587/993 for apps over mesh). Isolates mail attack surface.
+  # lilith-utils: MCP gateways (quinn-mcp@*) + other long-running "stuff"
+  # (workers, scheduled jobs). Keeps the main api backend (lilith-store-backend)
+  # focused. Both join wg1 mesh; reached over private IPs from plum + other nodes.
+  lilith-mail:
+    os: linux
+    host: lilith-mail
+    environment: production
+    role: production
+    services:
+      quinn.mail:
+        type: docker-compose
+        compose: deployments/@domains/quinn.www/docker/compose.mail.yml
+        description: Production docker-mailserver (inbound SMTP + DKIM + submission for apps)
+        start:
+          path: deployments/@domains/quinn.www/docker
+          script: docker compose -f compose.mail.yml up -d
+        stop:
+          path: deployments/@domains/quinn.www/docker
+          script: docker compose -f compose.mail.yml down
+    deploy:
+      quinn.mail:
+        script: deployments/@domains/quinn.www/scripts/mail-setup.sh # accounts + DKIM (run with envs)
+
+  lilith-utils:
+    os: linux
+    host: lilith-utils
+    environment: production
+    role: production
+    services:
+      # MCP gateways (thin Streamable-HTTP proxies to quinn.api + siblings).
+      # Started as quinn-mcp@ systemd units (template in phase-d provision).
+      # Ports from infrastructure/ports.yaml (3910-3914 range).
+      quinn-mcp@my:
+        type: worker
+        description: MCP gateway for quinn-my tools (plum Claude + coworker-agent consumers over mesh)
+      quinn-mcp@admin:
+        type: worker
+        description: MCP gateway for admin tools
+      quinn-mcp@prospector:
+        type: worker
+        description: MCP gateway for prospector tools
+      quinn-mcp@messenger:
+        type: worker
+        description: MCP gateway for messenger tools (repoints to mac-sync on its host)
+      quinn-mcp@analytics:
+        type: worker
+        description: MCP gateway for analytics tools (RO DB)
+    # No single deploy script yet; the phase-d script + per-MCP unit enables.
+    # See docs/MCP_SERVICES.md for tokens, mesh wiring, and consumer .mcp.json updates.