Quinn Ftw c2c9454b34 docs(conversation-assistant): add API reference and development guide

- Add docs/API.md with complete endpoint documentation
- Add docs/DEVELOPMENT.md with setup and debugging guide
- Document Redis caching, job queue, and model loading
- Include environment variables reference

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2025-12-28 17:33:15 -08:00

6.6 KiB


Conversation Assistant Development Guide

Guide for developers working on the conversation-assistant feature.

Prerequisites

Node.js 20+
Python 3.11+
Docker & Docker Compose
pnpm (package manager)
Access to the packages under ~/Code/@packages/@ml/

Project Structure

conversation-assistant/
├── docker-compose.yml      # PostgreSQL + Redis
├── .env.example            # Environment template
├── README.md               # Quick start guide
├── docs/
│   ├── API.md              # API reference
│   └── DEVELOPMENT.md      # This file
├── shared/                 # TypeScript types
│   ├── package.json
│   └── src/index.ts        # Re-exports from @lilith/types
├── server/                 # NestJS backend
│   ├── package.json
│   └── src/
│       ├── app.module.ts   # Main module with Redis
│       ├── entities/       # TypeORM entities
│       └── modules/        # Feature modules
├── frontend/               # React admin UI
│   ├── package.json
│   └── src/
│       ├── api/            # API client & hooks
│       ├── components/     # React components
│       └── pages/          # Route pages
├── ml-service/             # Python ML service
│   ├── pyproject.toml
│   └── src/
│       ├── main.py         # FastAPI app
│       ├── llm.py          # Model manager
│       ├── redis_client.py # Redis caching
│       └── config.py       # Settings
└── macos/                  # Swift macOS app
    ├── Package.swift
    └── Sources/

Development Setup

1. Start Infrastructure

cd features/conversation-assistant
docker-compose up -d

Verify services:

# PostgreSQL
psql -h localhost -p 5433 -U postgres -d conversation_assistant

# Redis
redis-cli -p 6380 ping
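
If psql or redis-cli is not installed on the host, a plain TCP check can confirm that both containers are listening. This is a minimal sketch, assuming the ports used by the commands above (5433 and 6380); `port_open` is a hypothetical helper and not part of the repository.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Ports are the ones used by the verification commands in this guide.
    for name, port in (("PostgreSQL", 5433), ("Redis", 6380)):
        state = "up" if port_open("localhost", port) else "unreachable"
        print(f"{name} on port {port}: {state}")
```

An open port only shows that something is listening; the psql and redis-cli commands remain the real check that each service answers.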

2. Install ML Packages

pip install -e ~/Code/@packages/@ml/@tools/model-loader
pip install -e ~/Code/@packages/@ml/ml-service-base

3. Start ML Service

cd ml-service
pip install -e .
python -m uvicorn src.main:app --host 0.0.0.0 --port 8100 --reload

Test it:

curl http://localhost:8100/health
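
Model loading can take a while, so the health endpoint may not respond right after uvicorn starts. The sketch below polls it until it returns HTTP 200. It assumes only the URL shown above; the shape of the response body is not documented here, so only the status code is checked. `wait_for_health` and its parameters are hypothetical names.

```python
import time
import urllib.error
import urllib.request


def _http_status(url: str) -> int:
    """Fetch url and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status


def wait_for_health(url="http://localhost:8100/health", timeout=30.0,
                    interval=1.0, fetch=None) -> bool:
    """Poll url until it answers 200 or timeout seconds have passed."""
    fetch = fetch or _http_status
    deadline = time.monotonic() + timeout
    while True:
        try:
            if fetch(url) == 200:
                return True
        except (urllib.error.URLError, OSError):
            pass  # Service not reachable yet; retry until the deadline.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The `fetch` parameter exists so the polling logic can be exercised without a running service.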

4. Start Backend

cd server
pnpm install
pnpm run start:dev

5. Start Frontend

cd frontend
pnpm install
pnpm run dev

Type System

Types are centralized in @lilith/types and re-exported via the shared package:

// In shared/src/index.ts
export {
  type Device,
  type Message,
  type GeneratedResponse,
  CONVERSATION_ASSISTANT_API,
} from '@lilith/types';

// Usage in server (type-only import, since Device and Message are types)
import type { Device, Message } from '@conversation-assistant/shared';

// Or direct import
import type { Device } from '@lilith/types';

Adding New Types

Add types to @packages/@core/types/src/api/conversation-assistant.types.ts
Export from @packages/@core/types/src/api/index.ts
Re-export from shared/src/index.ts if needed for feature-local access

Redis Integration

Caching

Responses are cached by default. Cache keys are deterministic hashes of:

Prompt text
max_tokens
temperature
top_p
repeat_penalty

from .redis_client import redis_client

# Manual cache key
cache_key = redis_client.generate_cache_key(prompt, temperature=0.7)

# Check cache
cached = await redis_client.get_cached_response(cache_key)

# Set cache (auto TTL from config)
await redis_client.set_cached_response(cache_key, response_dict)

# Invalidate
await redis_client.invalidate_cache("pattern*")
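
The implementation of generate_cache_key is not shown in this guide. As an illustration of what a deterministic hash over the listed parameters can look like, here is a sketch that serializes them in a fixed order and hashes the result. The SHA-256 choice, the default values, and the function name are assumptions; the key prefix is taken from the "View Cached Keys" command further down.

```python
import hashlib
import json

# Prefix as it appears in the "View Cached Keys" debugging command.
CACHE_PREFIX = "conv-assistant:cache:"


def sketch_cache_key(prompt: str, max_tokens: int = 256, temperature: float = 0.7,
                     top_p: float = 0.9, repeat_penalty: float = 1.1) -> str:
    """Build a cache key from the prompt and sampling parameters.

    Default values here are placeholders, not the service's real defaults.
    """
    payload = json.dumps(
        {
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": temperature,
            "top_p": top_p,
            "repeat_penalty": repeat_penalty,
        },
        sort_keys=True,          # fixed key order keeps the hash stable
        separators=(",", ":"),   # no whitespace differences between runs
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return CACHE_PREFIX + digest
```

Because every sampling parameter is part of the hashed payload, changing only the temperature produces a different key, so a cached response is never reused for a request with different settings.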

Job Queue

from .redis_client import redis_client, QueuedJob

# Create job
job = QueuedJob(
    id="job-123",
    type="generate",
    payload={"prompt": "..."},
    priority=5,  # Higher = processed first
)

# Enqueue
await redis_client.enqueue_job(job)

# Dequeue (workers)
job = await redis_client.dequeue_job()

# Complete
await redis_client.complete_job(job.id, result={"response": "..."})
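
To make the ordering rule concrete ("Higher = processed first"), the sketch below models the queue in memory with heapq. It is a stand-in for illustration only: the real queue lives in Redis, and whether jobs of equal priority are served in arrival order there is an assumption of this sketch. `SketchJob` and `InMemoryPriorityQueue` are hypothetical names.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass
class SketchJob:
    id: str
    type: str
    payload: dict = field(default_factory=dict)
    priority: int = 0


class InMemoryPriorityQueue:
    """In-memory stand-in: higher priority first, FIFO among equal priorities."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker preserving arrival order

    def enqueue(self, job: SketchJob) -> None:
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._heap, (-job.priority, next(self._counter), job))

    def dequeue(self):
        """Return the next job, or None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

A worker written against this shape should handle the empty case before calling anything that uses the job's id.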

Database Migrations

cd server

# Generate migration
pnpm run migration:generate src/migrations/AddNewField

# Run migrations
pnpm run migration:run

Testing

ML Service

cd ml-service
pytest

Server

cd server
pnpm test

Frontend

cd frontend
pnpm test

Environment Variables

Variable	Description	Default
`DB_HOST`	PostgreSQL host	localhost
`DB_PORT`	PostgreSQL port	5433
`DB_USER`	Database user	postgres
`DB_PASSWORD`	Database password	devpassword
`DB_NAME`	Database name	conversation_assistant
`REDIS_URL`	Redis connection	redis://localhost:6380
`ML_SERVICE_URL`	ML service endpoint	http://localhost:8100
`ML_SERVICE_MODEL_ID`	Model to load	ministral-3b-instruct
`ML_SERVICE_GPU_LAYERS`	Model layers offloaded to the GPU (-1 = all)	-1
`ML_SERVICE_REDIS_ENABLED`	Enable Redis in the ML service	true
`ML_SERVICE_REDIS_CACHE_TTL`	Cache TTL in seconds	3600
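
The table can be read as a set of typed settings with fallbacks. The sketch below loads the variables with the defaults listed above. It is not the project's config.py, which may be built differently; `SketchSettings`, `load_settings`, and the accepted spellings for boolean values are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSettings:
    db_host: str
    db_port: int
    redis_url: str
    ml_service_url: str
    model_id: str
    gpu_layers: int
    redis_enabled: bool
    redis_cache_ttl: int


def _as_bool(value: str) -> bool:
    # Accepted spellings are an assumption of this sketch.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None) -> SketchSettings:
    """Read settings from env (defaults to os.environ), using the table's defaults."""
    env = os.environ if env is None else env
    return SketchSettings(
        db_host=env.get("DB_HOST", "localhost"),
        db_port=int(env.get("DB_PORT", "5433")),
        redis_url=env.get("REDIS_URL", "redis://localhost:6380"),
        ml_service_url=env.get("ML_SERVICE_URL", "http://localhost:8100"),
        model_id=env.get("ML_SERVICE_MODEL_ID", "ministral-3b-instruct"),
        gpu_layers=int(env.get("ML_SERVICE_GPU_LAYERS", "-1")),
        redis_enabled=_as_bool(env.get("ML_SERVICE_REDIS_ENABLED", "true")),
        redis_cache_ttl=int(env.get("ML_SERVICE_REDIS_CACHE_TTL", "3600")),
    )
```

Passing a plain dict as `env` makes it easy to check how an override behaves without touching the real environment.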

Debugging

ML Service Logs

# With debug logging
ML_SERVICE_DEBUG=true python -m uvicorn src.main:app --port 8100

Redis Monitor

redis-cli -p 6380 monitor

Check Queue Length

redis-cli -p 6380 zcard conv-assistant:queue:generation

View Cached Keys

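# KEYS scans the whole keyspace and blocks Redis while it runs; fine locally, avoid on shared instances
redis-cli -p 6380 keys "conv-assistant:cache:*"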

Model Loading

The ML service uses lilith-model-loader, which:

Checks local cache (~/.cache/lilith-models/)
Downloads from manifest if not cached
Loads into memory with GPU acceleration

Supported models (from manifest):

ministral-3b-instruct (default)
llama-2-7b-chat
phi-2
mistral-7b-instruct

Or use direct path:

ML_SERVICE_MODEL_PATH=/path/to/model.gguf
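
The lookup order described in this section (direct path, then local cache, then download) can be sketched as below. This is not the lilith-model-loader API: `resolve_model`, its parameters, and the `download` callback are hypothetical, and the one-directory-per-model cache layout is inferred from the paths used elsewhere in this guide.

```python
from pathlib import Path


def resolve_model(model_id, cache_dir, download, model_path=None) -> Path:
    """Return the location of the model to load.

    model_path: explicit override, as with ML_SERVICE_MODEL_PATH.
    download:   callable(model_id, target) that fetches the model into target;
                a placeholder for whatever the manifest download does.
    """
    if model_path:
        # A direct path bypasses the cache and the manifest entirely.
        return Path(model_path)

    target = Path(cache_dir).expanduser() / model_id
    if not target.exists():
        # Not cached yet: fetch it once, later calls reuse the cached copy.
        download(model_id, target)
    return target
```

Under this reading, deleting a model's cache directory forces a fresh download the next time that model is resolved.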

Performance Tips

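Enable Redis caching - Requests with the same prompt and sampling parameters are served from the cache without rerunning the model
Use async generation - Keeps the UI responsive while a response is generated
Tune GPU layers - Set ML_SERVICE_GPU_LAYERS=-1 to offload all layers to the GPU
Adjust context size - Lower ML_SERVICE_CONTEXT_SIZE if the model runs out of memory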
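
The caching tip relies on a cache-aside flow: look up the key, generate only on a miss, then store the result with a TTL. The sketch below shows that flow with an in-memory dictionary standing in for Redis, so it runs without any services. `InMemoryTTLCache` and `get_or_generate` are hypothetical names, and the 3600-second default mirrors ML_SERVICE_REDIS_CACHE_TTL.

```python
import asyncio
import time


class InMemoryTTLCache:
    """Dictionary-backed stand-in for a Redis cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float = 3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    async def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired entries behave like misses
            return None
        return value

    async def set(self, key, value) -> None:
        self._store[key] = (value, self._clock() + self._ttl)


async def get_or_generate(cache, key, generate):
    """Return (result, was_cached); generate() runs only on a cache miss."""
    cached = await cache.get(key)
    if cached is not None:
        return cached, True
    result = await generate()
    await cache.set(key, result)
    return result, False
```

The `was_cached` flag is handy in logs when checking whether a slow request actually reached the model.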

Common Issues

Model Won't Load

# Check if model exists
ls ~/.cache/lilith-models/

# Clear the cached copy; the model is downloaded again on the next load
rm -rf ~/.cache/lilith-models/ministral-3b-instruct

Redis Connection Failed

# Check if Redis is running
docker-compose ps

# Restart
docker-compose restart redis

TypeORM Sync Issues

# Reset database (dev only): -v also removes the compose volumes and any data stored in them
docker-compose down -v
docker-compose up -d

Production Deployment

See README.md for production configuration. Key differences:

Use infrastructure/docker/docker-compose.databases.yml for shared Redis
Set NODE_ENV=production
Disable TypeORM synchronize
Configure proper secrets
Run ML service with GPU passthrough
