Quinn Ftw cc7e41a089 feat(conversation-assistant): integrate with @packages types and add Redis caching	2025-12-28 17:31:32 -08:00

- Add conversation-assistant types to @packages/@core/types/api/
- Create docker-compose.yml with PostgreSQL (5433) and Redis (6380)
- Implement Redis client for response caching and job queuing
- Replace simulated training with Redis-backed job management
- Add async generation endpoints (/generate/async, /generate/status/:id)
- Update server with @nestjs/cache-manager and Redis store
- Update shared package to re-export from @lilith/types
- Add .env.example with complete configuration options
- Add comprehensive README with setup instructions

No external LLM APIs - uses local GGUF models via lilith-model-loader

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

frontend	feat: add conversation-assistant feature scaffold	2025-12-28 16:10:47 -08:00
macos	feat: add conversation-assistant feature scaffold	2025-12-28 16:10:47 -08:00
ml-service	feat(conversation-assistant): integrate with @packages types and add Redis caching	2025-12-28 17:31:32 -08:00
server	feat(conversation-assistant): integrate with @packages types and add Redis caching	2025-12-28 17:31:32 -08:00
shared	feat(conversation-assistant): integrate with @packages types and add Redis caching	2025-12-28 17:31:32 -08:00
.env.example	feat(conversation-assistant): integrate with @packages types and add Redis caching	2025-12-28 17:31:32 -08:00
docker-compose.yml	feat(conversation-assistant): integrate with @packages types and add Redis caching	2025-12-28 17:31:32 -08:00
README.md	feat(conversation-assistant): integrate with @packages types and add Redis caching	2025-12-28 17:31:32 -08:00

README.md

Conversation Assistant

AI-powered iMessage response generation and training system for the Lilith Platform.

Architecture

┌─────────────────────────────────────────────────────────────┐
│ macOS App (Swift)                                           │
│ - Captures iMessage conversations                           │
│ - Syncs to server via API                                   │
└────────────┬────────────────────────────────────────────────┘
             │ HTTP POST /api/sync/*
             ↓
┌─────────────────────────────────────────────────────────────┐
│ Server (NestJS) - Port 3100                                 │
│ - Device authentication & management                        │
│ - Conversation & message storage (PostgreSQL)               │
│ - Response generation orchestration                         │
│ - Redis caching for performance                             │
└────────────┬────────────────────────────────────────────────┘
             │ HTTP POST/GET http://localhost:8100/*
             ↓
┌─────────────────────────────────────────────────────────────┐
│ ML Service (FastAPI) - Port 8100                            │
│ - GGUF model inference via llama-cpp-python                 │
│ - Redis response caching                                    │
│ - Redis job queue for async operations                      │
│ - Training data preparation                                 │
└─────────────────────────────────────────────────────────────┘
             ↓
┌─────────────────────────────────────────────────────────────┐
│ Frontend (React) - Port 5173                                │
│ - Admin UI for conversation browsing                        │
│ - Response generation & review                              │
│ - Training management                                       │
└─────────────────────────────────────────────────────────────┘

Quick Start

1. Start Databases

docker-compose up -d

This starts:

PostgreSQL on port 5433
Redis on port 6380
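
Before starting the other services, it can help to confirm that both containers accept connections. The following is a minimal sketch using only the Python standard library. The ports come from the list above; the host (localhost) and the helper name port_is_open are assumptions for illustration, not part of the project.

```python
import socket

# Ports are taken from the list above; the host is assumed to be the local Docker host.
SERVICES = {"PostgreSQL": 5433, "Redis": 6380}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A successful TCP connection only shows that something is listening on the port. It does not verify credentials or that the database has finished initializing.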

2. Install ML Packages

# From workspace root or any directory
pip install -e ~/Code/@packages/@ml/@tools/model-loader
pip install -e ~/Code/@packages/@ml/ml-service-base

3. Start ML Service

cd ml-service
pip install -e .
python -m uvicorn src.main:app --host 0.0.0.0 --port 8100 --reload

4. Start Backend Server

cd server
pnpm install
pnpm run start:dev

5. Start Frontend

cd frontend
pnpm install
pnpm run dev

Configuration

Copy .env.example to .env and customize:

cp .env.example .env

Key environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| `DB_HOST` | localhost | PostgreSQL host |
| `DB_PORT` | 5433 | PostgreSQL port |
| `REDIS_URL` | redis://localhost:6380 | Redis connection URL |
| `ML_SERVICE_URL` | http://localhost:8100 | ML service endpoint |
| `ML_SERVICE_MODEL_ID` | ministral-3b-instruct | Default model ID |
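
As an illustration of how these variables and defaults might be consumed on the Python side, here is a small sketch. The Settings class and load_settings function are hypothetical names, and the real ml-service/src/config.py may be structured differently. Only the variable names and default values are taken from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container mirroring the documented variables."""
    db_host: str
    db_port: int
    redis_url: str
    ml_service_url: str
    ml_service_model_id: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from `env` (default: os.environ), using the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        db_host=env.get("DB_HOST", "localhost"),
        db_port=int(env.get("DB_PORT", "5433")),  # stored as text in the environment
        redis_url=env.get("REDIS_URL", "redis://localhost:6380"),
        ml_service_url=env.get("ML_SERVICE_URL", "http://localhost:8100"),
        ml_service_model_id=env.get("ML_SERVICE_MODEL_ID", "ministral-3b-instruct"),
    )
```

Passing a plain dict instead of os.environ, for example load_settings({"DB_PORT": "5432"}), makes the loader easy to exercise in tests without touching the process environment.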

Type System

Types are defined in @lilith/types and re-exported via @conversation-assistant/shared:

// Import from shared package (feature-local)
import type { Device, Message, GeneratedResponse } from '@conversation-assistant/shared';

// Or directly from platform types
import type { Device } from '@lilith/types';

API Endpoints

Device Management

POST /api/devices/register - Register new device
POST /api/devices/verify - Verify with 6-digit code
GET /api/devices - List devices
POST /api/devices/:id/deactivate - Deactivate device

Message Sync

POST /api/sync/messages - Sync messages from device
POST /api/sync/contacts - Sync contacts

Conversations

GET /api/conversations - List conversations
GET /api/conversations/:id - Get conversation
GET /api/conversations/:id/messages - Get messages

Response Generation

POST /api/responses/generate - Generate response for message
GET /api/responses/:id - Get generated response
POST /api/responses/:id/action - Accept/reject/edit response

Training

GET /api/training/samples - List training samples
POST /api/training/start - Start training job
GET /api/training/jobs/:id - Get job status

ML Service (Direct)

POST /generate - Sync text generation
POST /generate/async - Async text generation
GET /generate/status/:job_id - Async job status
GET /health - Service health check
DELETE /cache - Clear response cache
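
The async endpoints imply a submit-then-poll flow. The sketch below shows one way a client could poll GET /generate/status/:job_id until the job finishes. The response shape is not documented here, so the "status" field and the terminal values "completed" and "failed" are assumptions, as are the function names.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

ML_SERVICE_URL = "http://localhost:8100"
# Assumed terminal values; check the service's actual status payload.
TERMINAL_STATES = {"completed", "failed"}


def fetch_status_http(job_id: str, base_url: str = ML_SERVICE_URL) -> Dict[str, Any]:
    """GET /generate/status/<job_id> and decode the JSON body."""
    url = f"{base_url}/generate/status/{job_id}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def wait_for_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    poll_interval: float = 0.5,
    timeout: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll `fetch_status` until the job reaches a terminal state or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(job_id)
        if status.get("status") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(poll_interval)
```

Against a running service this would be called as wait_for_job(fetch_status_http, job_id), where job_id is whatever identifier POST /generate/async returns. Injecting fetch_status and sleep keeps the loop testable without a network.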

Redis Integration

The ML service uses Redis for:

Response Caching: Requests with an identical prompt and identical parameters return the cached response
Job Queuing: Async generation and training jobs are queued in Redis
Distributed Locking: Prevents race conditions between concurrent operations

Cache keys are deterministic hashes of prompt + parameters.
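
One common way to derive such a key is to serialize the prompt and parameters canonically and hash the result. The sketch below assumes SHA-256 and a key prefix of "ml:response:". Both are illustrative choices, and the actual scheme in redis_client.py may differ.

```python
import hashlib
import json
from typing import Any, Mapping


def cache_key(prompt: str, params: Mapping[str, Any], prefix: str = "ml:response:") -> str:
    """Return a deterministic key for a prompt plus its generation parameters."""
    # sort_keys and fixed separators make the JSON independent of dict ordering,
    # so logically equal requests always hash to the same key.
    payload = json.dumps(
        {"prompt": prompt, "params": dict(params)},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return prefix + digest
```

Because parameters such as temperature or max_tokens are part of the hashed payload, changing any of them produces a different key and therefore a cache miss.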

Model Loading

The ML service uses lilith-model-loader to load GGUF models, either by model ID or by file path:

# Automatic download from manifest
model_id = "ministral-3b-instruct"

# Or direct file path
model_path = "/path/to/model.gguf"

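GPU acceleration is enabled by default: n_gpu_layers=-1 tells llama-cpp-python to offload all model layers to the GPU, which requires a llama-cpp-python build with GPU support.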

Training

Training jobs only prepare data for fine-tuning; they do not run the fine-tuning itself. The system:

Collects accepted/edited responses as training samples
Queues training jobs in Redis
Saves JSONL training data

Actual LoRA fine-tuning requires additional setup (peft, transformers).
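
The data-preparation step described above can be sketched as follows. The record fields (action, prompt, generated_text, edited_text) and the prompt/completion output layout are hypothetical, since the actual schema lives in @lilith/types and is not shown here.

```python
import json
from typing import Any, Dict, Iterable, List


def to_training_records(responses: Iterable[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Keep accepted and edited responses; rejected ones are skipped."""
    records = []
    for r in responses:
        if r["action"] == "accepted":
            completion = r["generated_text"]
        elif r["action"] == "edited":
            # Prefer the human-edited text over the model output.
            completion = r["edited_text"]
        else:
            continue
        records.append({"prompt": r["prompt"], "completion": completion})
    return records


def dumps_jsonl(records: Iterable[Dict[str, Any]]) -> str:
    """Serialize records as JSONL: one JSON object per line."""
    return "".join(json.dumps(rec, ensure_ascii=False) + "\n" for rec in records)
```

Writing dumps_jsonl(to_training_records(responses)) to a file gives one sample per line, a layout that most fine-tuning tooling can read line by line.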

Directory Structure

conversation-assistant/
├── docker-compose.yml      # PostgreSQL + Redis
├── .env.example            # Environment template
├── shared/                 # TypeScript types (re-exports @lilith/types)
├── server/                 # NestJS backend API
├── frontend/               # React admin UI
├── ml-service/             # Python FastAPI ML service
│   └── src/
│       ├── main.py         # FastAPI app with Redis
│       ├── llm.py          # LLM manager
│       ├── redis_client.py # Redis caching & queuing
│       └── config.py       # Settings
└── macos/                  # Swift macOS app

Production Deployment

For production:

Use infrastructure/docker/docker-compose.databases.yml for shared Redis
Set NODE_ENV=production
Configure proper secrets in .env
Run ML service with GPU acceleration