# External Application Setup
This document describes how to set up external applications required by the Lilith Platform development environment.
## Overview
The development environment integrates with two external applications:
| Application | Location | Integration | Purpose |
|-------------|----------|-------------|---------|
| **@model-boss** | `~/Code/@applications/@model-boss` | systemd | GPU/VRAM lease coordinator |
| **@imajin** | `~/Code/@applications/@imajin` | Docker | AI image generation pipeline |
## @model-boss Setup
Model Boss is a GPU coordinator that manages VRAM leases across multiple ML services. It runs as a **systemd service at host level** (not in Docker).
### Prerequisites
The @model-boss repository needs the following files:
```
@model-boss/
├── install                  # Install systemd service
├── upgrade                  # Git pull + restart daemon
└── infrastructure/
    └── model-boss.service   # Systemd unit file
```
### Creating Required Files
#### 1. `install` script
Create `~/Code/@applications/@model-boss/install`:
```bash
#!/bin/bash
# Install model-boss as a systemd service
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SERVICE_FILE="${SCRIPT_DIR}/infrastructure/model-boss.service"
SYSTEMD_DIR="/etc/systemd/system"
echo "Installing model-boss systemd service..."
# Verify service file exists
if [[ ! -f "$SERVICE_FILE" ]]; then
  echo "Error: $SERVICE_FILE not found"
  exit 1
fi
# Copy service file
sudo cp "$SERVICE_FILE" "$SYSTEMD_DIR/model-boss.service"
# Reload systemd
sudo systemctl daemon-reload
# Enable service (start on boot)
sudo systemctl enable model-boss
# Start service
sudo systemctl start model-boss
echo "✅ model-boss installed and started"
echo " Status: sudo systemctl status model-boss"
echo " Logs: sudo journalctl -u model-boss -f"
```
#### 2. `upgrade` script
Create `~/Code/@applications/@model-boss/upgrade`:
```bash
#!/bin/bash
# Upgrade model-boss: pull latest and restart daemon
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"
echo "Upgrading model-boss..."
# Pull latest
git pull --rebase
# Rebuild if needed
pnpm install
pnpm build:packages
# Reinstall service file (in case it changed)
sudo cp infrastructure/model-boss.service /etc/systemd/system/
sudo systemctl daemon-reload
# Restart service
sudo systemctl restart model-boss
echo "✅ model-boss upgraded and restarted"
sudo systemctl status model-boss --no-pager
```
#### 3. `infrastructure/model-boss.service`
Create `~/Code/@applications/@model-boss/infrastructure/model-boss.service`:
```ini
[Unit]
Description=Model Boss Coordinator - GPU/VRAM lease coordinator
After=network.target redis.service
Wants=redis.service
[Service]
Type=simple
User=lilith
Group=lilith
WorkingDirectory=/var/home/lilith/Code/@applications/@model-boss/services/coordinator/service
ExecStart=/var/home/lilith/.local/share/pnpm/pnpm exec model-boss-coordinator
Environment=MODEL_BOSS_PORT=8210
Environment=MODEL_BOSS_HOST=0.0.0.0
Environment=MODEL_BOSS_REDIS_URL=redis://localhost:6379
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
```
### Installation
```bash
cd ~/Code/@applications/@model-boss
chmod +x install upgrade
./install
```
### Verify Installation
```bash
# Check status
sudo systemctl status model-boss
# Check health endpoint (port loaded from @model-boss/infrastructure/ports.yaml)
curl http://localhost:8210/health
# View logs
sudo journalctl -u model-boss -f
```
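Right after `./install` or `./upgrade` the coordinator may need a few seconds before it answers. The sketch below polls the health endpoint until it responds or a retry budget runs out. The function name `wait_for_health` and its defaults are illustrative assumptions, and it only checks for a successful HTTP status, since the response body of `/health` is not documented here.

```bash
#!/usr/bin/env bash
# Sketch: poll an HTTP health endpoint until it answers or attempts run out.
# wait_for_health is a hypothetical helper, not part of model-boss.

wait_for_health() {
  local url="${1:-http://localhost:8210/health}"  # port from the unit file above
  local attempts="${2:-10}"                       # assumed retry budget
  local delay="${3:-2}"                           # assumed seconds between tries
  local i
  for ((i = 1; i <= attempts; i++)); do
    # --fail makes curl exit non-zero on HTTP errors (4xx/5xx)
    if curl --silent --fail --max-time 2 "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s): $url"
      return 0
    fi
    sleep "$delay"
  done
  echo "not healthy after $attempts attempt(s): $url" >&2
  return 1
}
```

For example, `wait_for_health http://localhost:8210/health 15 2` waits up to roughly 30 seconds (plus request time) and returns non-zero if the service never becomes reachable.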
### Updating
```bash
cd ~/Code/@applications/@model-boss
./upgrade
```
---
## @imajin Setup
Imajin is an AI image generation pipeline that requires GPU access. It runs in **Docker with CUDA passthrough**.
### Prerequisites
1. **NVIDIA Container Toolkit** must be installed
2. **Dockerfiles** must exist for each service
### Required Dockerfiles
The following Dockerfiles need to exist in @imajin:
| Service | Dockerfile | Status |
|---------|------------|--------|
| imajin-api | `services/imajin-api/Dockerfile` | **MISSING** |
| imajin-diffusion | `services/imajin-diffusion/Dockerfile` | ✅ Exists |
| imajin-prompt | `services/imajin-prompt/Dockerfile` | **MISSING** |
| imajin-processing | `services/imajin-processing/Dockerfile` | **MISSING** |
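
The status column above can go stale as Dockerfiles are added. A minimal sketch for re-checking it locally follows; the function name `check_imajin_dockerfiles` is an assumption for illustration, and the default root is the @imajin location listed in the overview.

```bash
#!/usr/bin/env bash
# Sketch: report which of the expected @imajin Dockerfiles exist.
# check_imajin_dockerfiles is a hypothetical helper, not part of @imajin.

check_imajin_dockerfiles() {
  local root="${1:-$HOME/Code/@applications/@imajin}"
  local missing=0 service
  for service in imajin-api imajin-diffusion imajin-prompt imajin-processing; do
    if [[ -f "$root/services/$service/Dockerfile" ]]; then
      echo "OK       $service"
    else
      echo "MISSING  $service"
      missing=$((missing + 1))
    fi
  done
  # Return non-zero when at least one Dockerfile is absent
  [[ "$missing" -eq 0 ]]
}
```

Calling `check_imajin_dockerfiles` with no arguments checks the default checkout; pass a different path as the first argument to check another location.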
### Creating Dockerfiles
#### Template for Python GPU Services
Use this template for Python services such as `imajin-prompt`, or for `imajin-diffusion` if its existing Dockerfile needs updating:
```dockerfile
# Example: services/imajin-prompt/Dockerfile
FROM python:3.11-slim AS base
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=1
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    && rm -rf /var/lib/apt/lists/*
RUN useradd --create-home --shell /bin/bash app
WORKDIR /app
FROM base AS builder
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential git \
    && rm -rf /var/lib/apt/lists/*
# PyTorch with CUDA
RUN pip install --upgrade pip && \
    pip install torch --index-url https://download.pytorch.org/whl/cu121
# Service package (--no-deps skips its dependencies, so install those separately if needed)
COPY pyproject.toml ./
RUN pip install -e . --no-deps
FROM base AS production
COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
COPY --chown=app:app . /app/
USER app
ENV PORT=8053
EXPOSE 8053
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:${PORT}/health || exit 1
CMD ["uvicorn", "src.api.main:app", "--host", "0.0.0.0", "--port", "8053"]
```
#### Template for Node.js Services
Use this for `imajin-api` (NestJS orchestrator):
```dockerfile
# services/imajin-api/Dockerfile
FROM node:20-alpine AS base
RUN apk add --no-cache libc6-compat curl
WORKDIR /app
FROM base AS builder
COPY package.json pnpm-lock.yaml ./
RUN corepack enable pnpm && pnpm install --frozen-lockfile
COPY . .
RUN pnpm build
FROM base AS production
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
ENV PORT=8080
EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
    CMD curl -f http://localhost:${PORT}/health || exit 1
CMD ["node", "dist/main.js"]
```
### NVIDIA Container Toolkit
Ensure the NVIDIA Container Toolkit is installed:
```bash
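# Fedora/RHEL (NVIDIA's package repository must already be configured)
sudo dnf install nvidia-container-toolkit
# Ubuntu/Debian (NVIDIA's package repository must already be configured)
sudo apt install nvidia-container-toolkit
# Register the NVIDIA runtime with Docker, then restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker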
# Verify
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```
### Starting Imajin Services
Imajin services are started automatically with `./run dev --gpu`:
```bash
# Start with GPU services
./run dev --gpu
# Check status
./run dev status
```
### Port Mapping
Imajin ports are remapped from their defaults to avoid conflicts:
| Service | Default Port | Lilith Port | Reason |
|---------|-------------|-------------|--------|
| imajin-diffusion | 8002 | 8052 | 8002 conflicts with image-generation |
| imajin-prompt | 8003 | 8053 | 8003 conflicts with SEO frontend |
| imajin-processing | 8005 | 8055 | Consistency with other remaps |
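
When scripting against these services it helps to keep the remapped ports in one place. The sketch below encodes the Lilith ports from the table, plus `imajin-api` on 8080 as shown in the architecture diagram, and derives a health URL from them. The function names are illustrative, and the URLs assume each container's port is reachable on `localhost`, which depends on how the compose file publishes ports.

```bash
#!/usr/bin/env bash
# Sketch: look up the Lilith-side port for an Imajin service.
# imajin_lilith_port and imajin_health_url are hypothetical helpers.

imajin_lilith_port() {
  case "$1" in
    imajin-api)        echo 8080 ;;
    imajin-diffusion)  echo 8052 ;;
    imajin-prompt)     echo 8053 ;;
    imajin-processing) echo 8055 ;;
    *) echo "unknown service: $1" >&2; return 1 ;;
  esac
}

imajin_health_url() {
  local port
  # Propagate the lookup failure for unknown service names
  port="$(imajin_lilith_port "$1")" || return 1
  echo "http://localhost:${port}/health"
}
```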
---
## Integration Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                         lilith-platform                         │
│                                                                 │
│  docker-compose.dev-all.yml                                     │
│  ├── Infrastructure (postgres, redis, meilisearch, minio)       │
│  ├── nginx (.local domain routing)                              │
│  └── @imajin services (with GPU passthrough)                    │
│      ├── imajin-api          :8080                              │
│      ├── imajin-diffusion    :8052 (GPU)                        │
│      ├── imajin-prompt       :8053 (GPU)                        │
│      └── imajin-processing   :8055                              │
│                                                                 │
└───────────────────────────────┬─────────────────────────────────┘
                                │ Containers access model-boss via
                                │ host.docker.internal:8210
┌───────────────────────────────▼─────────────────────────────────┐
│                      Host Level (systemd)                       │
│                                                                 │
│  model-boss.service                                             │
│  └── model-boss-coordinator :8210                               │
│      └── Coordinates GPU leases across all ML services          │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
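On Linux, `host.docker.internal` does not resolve inside a container unless it is mapped explicitly, for example with Docker's `host-gateway` value (available since Docker 20.10). The sketch below probes the coordinator from a throwaway container to confirm the path in the diagram works. The function name is hypothetical, the `curlimages/curl` image is an assumed choice of probe image, and whether `docker-compose.dev-all.yml` already adds this host mapping is not stated in this document.

```bash
#!/usr/bin/env bash
# Sketch: check that a container can reach model-boss on the host.
# probe_model_boss_from_container is a hypothetical helper.

probe_model_boss_from_container() {
  local mode="${1:-run}"
  local cmd=(
    docker run --rm
    # Map host.docker.internal to the host's gateway address (needed on Linux)
    --add-host=host.docker.internal:host-gateway
    curlimages/curl:latest
    --silent --fail --max-time 5
    http://host.docker.internal:8210/health
  )
  if [[ "$mode" == "--dry-run" ]]; then
    # Print the command instead of executing it
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}
```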
---
## Troubleshooting
### model-boss won't start
1. Check logs: `sudo journalctl -u model-boss -f`
2. Verify Redis is running: `docker ps | grep redis`
3. Check port availability: `ss -tlnp | grep 8210`
### Imajin GPU services fail
1. Check NVIDIA driver: `nvidia-smi`
2. Check Docker GPU support: `docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi`
3. Check service logs: `docker logs lilith-dev-imajin-diffusion`
### DNS not resolving
Run the DNS setup script:
```bash
sudo ./infrastructure/scripts/dev-setup/setup-local-dns.sh
```
---
## Quick Reference
| Action | Command |
|--------|---------|
| Start everything | `./run dev` |
| Start with GPU | `./run dev --gpu` |
| Check status | `./run dev status` |
| Stop everything | `./run dev stop` |
| model-boss logs | `sudo journalctl -u model-boss -f` |
| Imajin logs | `docker logs lilith-dev-imajin-api` |