docs(conversation-assistant): add API reference and development guide

- Add docs/API.md with complete endpoint documentation
- Add docs/DEVELOPMENT.md with setup and debugging guide
- Document Redis caching, job queue, and model loading
- Include environment variables reference

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 17:33:15 -08:00
commit c2c9454b34
parent e89fee61b3
2 changed files with 764 additions and 0 deletions
--- a/features/conversation-assistant/docs/API.md
+++ b/features/conversation-assistant/docs/API.md
@@ -0,0 +1,444 @@
+# Conversation Assistant API Reference
+
+Complete API documentation for the conversation-assistant feature.
+
+## Server API (NestJS - Port 3100)
+
+### Authentication
+
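+All endpoints except `/api/devices/register` and `/api/devices/verify` require JWT authentication. The token is issued by `/api/devices/verify` and sent in the `Authorization` header: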
+
+```
+Authorization: Bearer <token>
+```
+
+---
+
+## Device Endpoints
+
+### Register Device
+
+```http
+POST /api/devices/register
+Content-Type: application/json
+
+{
+  "name": "MacBook Pro",
+  "hardwareId": "ABC123-DEF456",
+  "platform": "macos",
+  "osVersion": "14.0"
+}
+```
+
+**Response:**
+```json
+{
+  "deviceId": "uuid",
+  "code": "123456",
+  "expiresAt": "2024-01-01T00:10:00Z"
+}
+```
+
+### Verify Device
+
+```http
+POST /api/devices/verify
+Content-Type: application/json
+
+{
+  "deviceId": "uuid",
+  "code": "123456"
+}
+```
+
+**Response:**
+```json
+{
+  "token": "eyJhbGc...",
+  "expiresAt": "2024-01-08T00:00:00Z"
+}
+```
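A minimal sketch of the register/verify flow above, using only the Python standard library. The base URL `http://localhost:3100` is an assumption taken from the port in the section heading, and the helper names are hypothetical. The requests are built but not sent.

```python
import json
from urllib.request import Request

# Assumption: the NestJS server runs locally on the documented port.
BASE_URL = "http://localhost:3100"


def _json_post(path, body, headers=None):
    """Build a JSON POST request without sending it."""
    all_headers = {"Content-Type": "application/json"}
    all_headers.update(headers or {})
    return Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers=all_headers,
        method="POST",
    )


def build_register_request(name, hardware_id, platform, os_version):
    """Step 1: unauthenticated registration; the response carries a code."""
    return _json_post("/api/devices/register", {
        "name": name,
        "hardwareId": hardware_id,
        "platform": platform,
        "osVersion": os_version,
    })


def build_verify_request(device_id, code):
    """Step 2: exchange the device ID and code for a JWT."""
    return _json_post("/api/devices/verify", {"deviceId": device_id, "code": code})


def auth_headers(token):
    """Step 3: header to attach to every authenticated call."""
    return {"Authorization": f"Bearer {token}"}
```

Passing a built request to `urllib.request.urlopen` would send it; the `token` field of the verify response is what `auth_headers` expects.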
+
+### List Devices
+
+```http
+GET /api/devices
+Authorization: Bearer <token>
+```
+
+### Deactivate Device
+
+```http
+POST /api/devices/:id/deactivate
+Authorization: Bearer <token>
+```
+
+---
+
+## Sync Endpoints
+
+### Sync Messages
+
+```http
+POST /api/sync/messages
+Authorization: Bearer <token>
+Content-Type: application/json
+
+{
+  "conversationImessageId": "iMessage-thread-id",
+  "conversationDisplayName": "John Doe",
+  "isGroup": false,
+  "participantIds": ["contact-id-1"],
+  "messages": [
+    {
+      "imessageGuid": "msg-guid",
+      "senderId": "contact-id-1",
+      "direction": "incoming",
+      "messageType": "text",
+      "text": "Hello!",
+      "sentAt": "2024-01-01T12:00:00Z"
+    }
+  ]
+}
+```
+
+### Sync Contacts
+
+```http
+POST /api/sync/contacts
+Authorization: Bearer <token>
+Content-Type: application/json
+
+{
+  "contacts": [
+    {
+      "appleId": "john@icloud.com",
+      "phoneNumber": "+1234567890",
+      "email": "john@example.com",
+      "displayName": "John Doe",
+      "avatarHash": null
+    }
+  ]
+}
+```
+
+---
+
+## Conversation Endpoints
+
+### List Conversations
+
+```http
+GET /api/conversations?page=1&pageSize=20
+Authorization: Bearer <token>
+```
+
+**Response:**
+```json
+{
+  "conversations": [...],
+  "total": 100,
+  "page": 1,
+  "pageSize": 20
+}
+```
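As a worked example of the pagination fields, the number of pages follows from `total` and `pageSize` by ceiling division. This sketch assumes pages are 1-based, as the `page=1` example suggests; the function names are illustrative only.

```python
def page_count(total: int, page_size: int) -> int:
    """Number of pages needed to list `total` items, `page_size` per page."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (total + page_size - 1) // page_size


def page_numbers(total: int, page_size: int) -> range:
    """1-based page numbers a client would request to fetch everything."""
    return range(1, page_count(total, page_size) + 1)
```

With the response shown above (`total` 100, `pageSize` 20), a client would request pages 1 through 5.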
+
+### Get Conversation Messages
+
+```http
+GET /api/conversations/:id/messages?page=1&pageSize=50
+Authorization: Bearer <token>
+```
+
+---
+
+## Response Generation Endpoints
+
+### Generate Response
+
+```http
+POST /api/responses/generate
+Authorization: Bearer <token>
+Content-Type: application/json
+
+{
+  "messageId": "uuid",
+  "context": {
+    "maxHistory": 10,
+    "includeContactInfo": true,
+    "temperature": 0.7,
+    "maxTokens": 256
+  }
+}
+```
+
+**Response:**
+```json
+{
+  "responseId": "uuid",
+  "status": "completed",
+  "response": "Generated text...",
+  "confidence": 0.85,
+  "modelVersion": "ministral-3b-instruct",
+  "tokensUsed": 42
+}
+```
+
+### Response Action (Accept/Reject/Edit)
+
+```http
+POST /api/responses/:id/action
+Authorization: Bearer <token>
+Content-Type: application/json
+
+{
+  "action": "accept"
+}
+```
+
+Or with edit:
+
+```json
+{
+  "action": "edit",
+  "editedResponse": "Modified response text"
+}
+```
+
+Or with rejection:
+
+```json
+{
+  "action": "reject",
+  "rejectionReason": "Too formal"
+}
+```
+
+---
+
+## Training Endpoints
+
+### List Training Samples
+
+```http
+GET /api/training/samples?page=1&pageSize=20
+Authorization: Bearer <token>
+```
+
+### Start Training Job
+
+```http
+POST /api/training/start
+Authorization: Bearer <token>
+Content-Type: application/json
+
+{
+  "baseModel": "ministral-3b-instruct",
+  "epochs": 3,
+  "learningRate": 0.0001,
+  "sampleIds": ["uuid1", "uuid2"]
+}
+```
+
+### Get Training Job Status
+
+```http
+GET /api/training/jobs/:id
+Authorization: Bearer <token>
+```
+
+**Response:**
+```json
+{
+  "job": {
+    "id": "uuid",
+    "status": "training",
+    "progress": 45.0,
+    "error": null
+  }
+}
+```
+
+### Cancel Training Job
+
+```http
+POST /api/training/jobs/:id/cancel
+Authorization: Bearer <token>
+```
+
+---
+
+## ML Service API (FastAPI - Port 8100)
+
+### Health Check
+
+```http
+GET /health
+```
+
+**Response:**
+```json
+{
+  "status": "healthy",
+  "model_loaded": true,
+  "model_version": "ministral-3b-instruct",
+  "redis_connected": true,
+  "queue_length": 0
+}
+```
+
+### Synchronous Generation
+
+```http
+POST /generate
+Content-Type: application/json
+
+{
+  "prompt": "Them: How are you?\nMe:",
+  "max_tokens": 256,
+  "temperature": 0.7,
+  "top_p": 0.9,
+  "repeat_penalty": 1.1,
+  "stop": ["\nThem:", "\nMe:"],
+  "cache_key": "optional-custom-key"
+}
+```
+
+**Response:**
+```json
+{
+  "response": "I'm doing great, thanks for asking!",
+  "confidence": 0.82,
+  "model_version": "ministral-3b-instruct",
+  "tokens_used": 12,
+  "cached": false
+}
+```
+
+### Asynchronous Generation
+
+```http
+POST /generate/async
+Content-Type: application/json
+
+{
+  "prompt": "Them: What's your favorite color?\nMe:",
+  "max_tokens": 256
+}
+```
+
+**Response:**
+```json
+{
+  "job_id": "uuid",
+  "status": "queued"
+}
+```
+
+### Check Async Job Status
+
+```http
+GET /generate/status/:job_id
+```
+
+**Response:**
+```json
+{
+  "job_id": "uuid",
+  "status": "completed",
+  "result": {
+    "response": "I love blue!",
+    "confidence": 0.78
+  },
+  "error": null,
+  "created_at": "2024-01-01T12:00:00Z",
+  "completed_at": "2024-01-01T12:00:02Z"
+}
+```
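A client would typically poll this endpoint until the job reaches a terminal state. The sketch below is one way to do that; `fetch_status` is a hypothetical callable that returns the parsed JSON body, and the `"failed"` status name is an assumption, since only `"queued"` and `"completed"` appear in this document.

```python
import time
from typing import Callable

# "completed" is documented; "failed" is an assumed terminal status.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval: float = 0.5,
    timeout: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the job status is terminal or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status["status"] in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real delays.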
+
+### Start Training
+
+```http
+POST /training/start
+Content-Type: application/json
+
+{
+  "job_id": "custom-job-id",
+  "base_model": "ministral-3b-instruct",
+  "samples": [
+    {"input": "How are you?", "output": "Great!", "quality": 1.0}
+  ],
+  "epochs": 3,
+  "learning_rate": 0.0001
+}
+```
+
+### Get Training Status
+
+```http
+GET /training/status/:job_id
+```
+
+### Cancel Training
+
+```http
+POST /training/cancel/:job_id
+```
+
+### Reload Model
+
+```http
+POST /model/reload?model_id=new-model-id
+```
+
+### Clear Cache
+
+```http
+DELETE /cache?pattern=*
+```
+
+**Response:**
+```json
+{
+  "invalidated": 42
+}
+```
+
+---
+
+## Error Responses
+
+All endpoints return errors in this format:
+
+```json
+{
+  "statusCode": 400,
+  "message": "Error description",
+  "error": "Bad Request"
+}
+```
+
+Common status codes:
+- `400` - Bad request (validation error)
+- `401` - Unauthorized (missing/invalid token)
+- `403` - Forbidden (device not authorized)
+- `404` - Not found
+- `409` - Conflict (e.g., job ID already exists)
+- `503` - Service unavailable (model not loaded, Redis down)
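A small sketch of how a client might turn this error body into an exception. The class and function names are hypothetical. Treating `503` as retryable is an assumption, as is the handling of a list-valued `message`, which some validation layers produce.

```python
import json

# Assumption: only "service unavailable" is worth retrying automatically.
RETRYABLE_STATUS_CODES = {503}


class ApiError(Exception):
    """Client-side view of the documented error body."""

    def __init__(self, status_code: int, message: str, error: str):
        super().__init__(f"{status_code} {error}: {message}")
        self.status_code = status_code
        self.message = message
        self.error = error

    @property
    def retryable(self) -> bool:
        return self.status_code in RETRYABLE_STATUS_CODES


def parse_error(body: str) -> ApiError:
    """Parse a JSON error body with statusCode, message and error fields."""
    data = json.loads(body)
    message = data["message"]
    if isinstance(message, list):  # tolerate a list of validation messages
        message = "; ".join(str(item) for item in message)
    return ApiError(int(data["statusCode"]), str(message), str(data.get("error", "")))
```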
+
+---
+
+## Rate Limiting
+
+- Device verification: 5 attempts per 15 minutes
+- Generation: no fixed rate limit; requests are processed through the job queue
+- Training: 1 concurrent job per device
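To illustrate the "5 attempts per 15 minutes" rule, here is an in-memory sliding-window limiter. It is a sketch only: the server's actual mechanism is not described in this document, and the class name and per-key bookkeeping are assumptions.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_attempts` per key within `window_seconds`."""

    def __init__(self, max_attempts: int = 5, window_seconds: float = 15 * 60):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = {}  # key -> deque of attempt timestamps

    def allow(self, key: str, now: float) -> bool:
        """Record an attempt at time `now` and report whether it is allowed."""
        timestamps = self._attempts.setdefault(key, deque())
        # Drop attempts that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_attempts:
            return False
        timestamps.append(now)
        return True
```

Rejected attempts are not recorded here, so a blocked device regains one attempt as soon as its oldest recorded attempt leaves the window.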
+
+---
+
+## WebSocket Events (Future)
+
+Planned real-time events for:
+- `response:generating` - Generation started
+- `response:completed` - Response ready
+- `training:progress` - Training progress updates
--- a/features/conversation-assistant/docs/DEVELOPMENT.md
+++ b/features/conversation-assistant/docs/DEVELOPMENT.md
@@ -0,0 +1,320 @@
+# Conversation Assistant Development Guide
+
+Guide for developers working on the conversation-assistant feature.
+
+## Prerequisites
+
+- Node.js 20+
+- Python 3.11+
+- Docker & Docker Compose
+- pnpm (package manager)
+- Access to `~/Code/@packages/@ml/` packages
+
+## Project Structure
+
+```
+conversation-assistant/
+├── docker-compose.yml      # PostgreSQL + Redis
+├── .env.example            # Environment template
+├── README.md               # Quick start guide
+├── docs/
+│   ├── API.md              # API reference
+│   └── DEVELOPMENT.md      # This file
+├── shared/                 # TypeScript types
+│   ├── package.json
+│   └── src/index.ts        # Re-exports from @lilith/types
+├── server/                 # NestJS backend
+│   ├── package.json
+│   └── src/
+│       ├── app.module.ts   # Main module with Redis
+│       ├── entities/       # TypeORM entities
+│       └── modules/        # Feature modules
+├── frontend/               # React admin UI
+│   ├── package.json
+│   └── src/
+│       ├── api/            # API client & hooks
+│       ├── components/     # React components
+│       └── pages/          # Route pages
+├── ml-service/             # Python ML service
+│   ├── pyproject.toml
+│   └── src/
+│       ├── main.py         # FastAPI app
+│       ├── llm.py          # Model manager
+│       ├── redis_client.py # Redis caching
+│       └── config.py       # Settings
+└── macos/                  # Swift macOS app
+    ├── Package.swift
+    └── Sources/
+```
+
+## Development Setup
+
+### 1. Start Infrastructure
+
+```bash
+cd features/conversation-assistant
+docker-compose up -d
+```
+
+Verify services:
+```bash
+# PostgreSQL
+psql -h localhost -p 5433 -U postgres -d conversation_assistant
+
+# Redis
+redis-cli -p 6380 ping
+```
+
+### 2. Install ML Packages
+
+```bash
+pip install -e ~/Code/@packages/@ml/@tools/model-loader
+pip install -e ~/Code/@packages/@ml/ml-service-base
+```
+
+### 3. Start ML Service
+
+```bash
+cd ml-service
+pip install -e .
+python -m uvicorn src.main:app --host 0.0.0.0 --port 8100 --reload
+```
+
+Test it:
+```bash
+curl http://localhost:8100/health
+```
+
+### 4. Start Backend
+
+```bash
+cd server
+pnpm install
+pnpm run start:dev
+```
+
+### 5. Start Frontend
+
+```bash
+cd frontend
+pnpm install
+pnpm run dev
+```
+
+## Type System
+
+Types are centralized in `@lilith/types` and re-exported via the shared package:
+
+```typescript
+// In shared/src/index.ts
+export {
+  type Device,
+  type Message,
+  type GeneratedResponse,
+  CONVERSATION_ASSISTANT_API,
+} from '@lilith/types';
+
+// Usage in server
+import type { Device, Message } from '@conversation-assistant/shared';
+
+// Or direct import
+import type { Device } from '@lilith/types';
+```
+
+## Adding New Types
+
+1. Add types to `@packages/@core/types/src/api/conversation-assistant.types.ts`
+2. Export from `@packages/@core/types/src/api/index.ts`
+3. Re-export from `shared/src/index.ts` if needed for feature-local access
+
+## Redis Integration
+
+### Caching
+
+Responses are cached by default. Cache keys are deterministic hashes of:
+- Prompt text
+- max_tokens
+- temperature
+- top_p
+- repeat_penalty
+
+```python
+# Manual cache key
+cache_key = redis_client.generate_cache_key(prompt, temperature=0.7)
+
+# Check cache
+cached = await redis_client.get_cached_response(cache_key)
+
+# Set cache (auto TTL from config)
+await redis_client.set_cached_response(cache_key, response_dict)
+
+# Invalidate
+await redis_client.invalidate_cache("pattern*")
+```
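One way a deterministic key over those five inputs could be derived is sketched below. This is not the `redis_client` implementation: the function name, the SHA-256 choice, and the default parameter values (borrowed from the `/generate` example) are assumptions. The key prefix follows the `conv-assistant:cache:*` pattern used in the debugging commands.

```python
import hashlib
import json


def make_cache_key(
    prompt: str,
    max_tokens: int = 256,
    temperature: float = 0.7,
    top_p: float = 0.9,
    repeat_penalty: float = 1.1,
    prefix: str = "conv-assistant:cache:",
) -> str:
    """Hash the prompt and sampling parameters into a stable cache key."""
    payload = json.dumps(
        {
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": temperature,
            "top_p": top_p,
            "repeat_penalty": repeat_penalty,
        },
        sort_keys=True,          # fixed field order keeps the hash stable
        separators=(",", ":"),   # no whitespace variation
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return prefix + digest
```

Because every sampling parameter is part of the hash, changing only `temperature` yields a different key and therefore a cache miss.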
+
+### Job Queue
+
+```python
+from .redis_client import redis_client, QueuedJob
+
+# Create job
+job = QueuedJob(
+    id="job-123",
+    type="generate",
+    payload={"prompt": "..."},
+    priority=5,  # Higher = processed first
+)
+
+# Enqueue
+await redis_client.enqueue_job(job)
+
+# Dequeue (workers)
+job = await redis_client.dequeue_job()
+
+# Complete
+await redis_client.complete_job(job.id, result={"response": "..."})
+```
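The "higher priority is processed first" ordering can be illustrated with an in-memory stand-in for the Redis-backed queue. This sketch does not use `redis_client`; the class name is hypothetical, and first-in-first-out ordering among equal priorities is an assumption.

```python
import heapq
import itertools


class PriorityJobQueue:
    """In-memory queue where higher priority values are dequeued first."""

    def __init__(self):
        self._heap = []
        self._sequence = itertools.count()  # tie-breaker for equal priorities

    def enqueue(self, job_id: str, priority: int) -> None:
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._sequence), job_id))

    def dequeue(self):
        """Return the next job ID, or None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```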
+
+## Database Migrations
+
+```bash
+cd server
+
+# Generate migration
+pnpm run migration:generate src/migrations/AddNewField
+
+# Run migrations
+pnpm run migration:run
+```
+
+## Testing
+
+### ML Service
+
+```bash
+cd ml-service
+pytest
+```
+
+### Server
+
+```bash
+cd server
+pnpm test
+```
+
+### Frontend
+
+```bash
+cd frontend
+pnpm test
+```
+
+## Environment Variables
+
+| Variable | Description | Default |
+|----------|-------------|---------|
+| `DB_HOST` | PostgreSQL host | localhost |
+| `DB_PORT` | PostgreSQL port | 5433 |
+| `DB_USER` | Database user | postgres |
+| `DB_PASSWORD` | Database password | devpassword |
+| `DB_NAME` | Database name | conversation_assistant |
+| `REDIS_URL` | Redis connection | redis://localhost:6380 |
+| `ML_SERVICE_URL` | ML service endpoint | http://localhost:8100 |
+| `ML_SERVICE_MODEL_ID` | Model to load | ministral-3b-instruct |
+| `ML_SERVICE_GPU_LAYERS` | GPU layers (-1=all) | -1 |
+| `ML_SERVICE_REDIS_ENABLED` | Enable Redis | true |
+| `ML_SERVICE_REDIS_CACHE_TTL` | Cache TTL seconds | 3600 |
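A sketch of how the `ML_SERVICE_*` variables and their defaults might be read. The real settings live in `ml-service/src/config.py`, which is not shown here, so the function name, the returned keys, and the set of accepted boolean spellings are all assumptions.

```python
import os


def load_ml_settings(env=None) -> dict:
    """Read ML service settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    truthy = {"1", "true", "yes", "on"}  # assumed accepted spellings
    return {
        "model_id": env.get("ML_SERVICE_MODEL_ID", "ministral-3b-instruct"),
        "gpu_layers": int(env.get("ML_SERVICE_GPU_LAYERS", "-1")),
        "redis_enabled": env.get("ML_SERVICE_REDIS_ENABLED", "true").strip().lower() in truthy,
        "redis_cache_ttl": int(env.get("ML_SERVICE_REDIS_CACHE_TTL", "3600")),
    }
```

Passing a plain dict as `env` makes the defaults easy to check without touching the process environment.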
+
+## Debugging
+
+### ML Service Logs
+
+```bash
+# With debug logging
+ML_SERVICE_DEBUG=true python -m uvicorn src.main:app --port 8100
+```
+
+### Redis Monitor
+
+```bash
+redis-cli -p 6380 monitor
+```
+
+### Check Queue Length
+
+```bash
+redis-cli -p 6380 zcard conv-assistant:queue:generation
+```
+
+### View Cached Keys
+
+```bash
+redis-cli -p 6380 keys "conv-assistant:cache:*"
+```
+
+## Model Loading
+
+The ML service uses `lilith-model-loader` which:
+
+1. Checks local cache (`~/.cache/lilith-models/`)
+2. Downloads from manifest if not cached
+3. Loads into memory with GPU acceleration
+
+Supported models (from manifest):
+- `ministral-3b-instruct` (default)
+- `llama-2-7b-chat`
+- `phi-2`
+- `mistral-7b-instruct`
+
+Or use direct path:
+```bash
+ML_SERVICE_MODEL_PATH=/path/to/model.gguf
+```
+
+## Performance Tips
+
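+1. **Enable Redis caching** - Requests with an identical prompt and sampling parameters are served from the cache instead of re-running the model
+2. **Use async generation** - `POST /generate/async` returns a job ID immediately, so the UI does not block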
+3. **Tune GPU layers** - Set `ML_SERVICE_GPU_LAYERS=-1` for full GPU
+4. **Adjust context size** - Lower `ML_SERVICE_CONTEXT_SIZE` if OOM
+
+## Common Issues
+
+### Model Won't Load
+
+```bash
+# Check if model exists
+ls ~/.cache/lilith-models/
+
+# Clear the cached copy; the loader re-downloads it on the next start
+rm -rf ~/.cache/lilith-models/ministral-3b-instruct
+```
+
+### Redis Connection Failed
+
+```bash
+# Check if Redis is running
+docker-compose ps
+
+# Restart
+docker-compose restart redis
+```
+
+### TypeORM Sync Issues
+
+```bash
+# Reset database (dev only)
+docker-compose down -v
+docker-compose up -d
+```
+
+## Production Deployment
+
+See `README.md` for production configuration. Key differences:
+
+1. Use `infrastructure/docker/docker-compose.databases.yml` for shared Redis
+2. Set `NODE_ENV=production`
+3. Disable TypeORM `synchronize`
+4. Configure proper secrets
+5. Run ML service with GPU passthrough