docs(conversation-assistant): add API reference and development guide

- Add docs/API.md with complete endpoint documentation
- Add docs/DEVELOPMENT.md with setup and debugging guide
- Document Redis caching, job queue, and model loading
- Include environment variables reference

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Quinn Ftw 2025-12-28 17:33:15 -08:00
parent e89fee61b3
commit c2c9454b34
2 changed files with 764 additions and 0 deletions

@@ -0,0 +1,444 @@
# Conversation Assistant API Reference
Complete API documentation for the conversation-assistant feature.
## Server API (NestJS - Port 3100)
### Authentication
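All endpoints except `/api/devices/register` and `/api/devices/verify` require JWT authentication. Send the token returned by device verification in the `Authorization` header: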
```
Authorization: Bearer <token>
```
---
## Device Endpoints
### Register Device
```http
POST /api/devices/register
Content-Type: application/json

{
  "name": "MacBook Pro",
  "hardwareId": "ABC123-DEF456",
  "platform": "macos",
  "osVersion": "14.0"
}
```
**Response:**
```json
{
  "deviceId": "uuid",
  "code": "123456",
  "expiresAt": "2024-01-01T00:10:00Z"
}
```
### Verify Device
```http
POST /api/devices/verify
Content-Type: application/json

{
  "deviceId": "uuid",
  "code": "123456"
}
```
**Response:**
```json
{
  "token": "eyJhbGc...",
  "expiresAt": "2024-01-08T00:00:00Z"
}
```
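The register and verify calls form a two-step pairing flow: register the device, then exchange the code for a token. A minimal client sketch using only the standard library follows. `BASE_URL`, `build_request`, `send`, and `pair_device` are hypothetical names, and how the code reaches the caller (`read_code`) is an assumption, since the confirmation step is not described here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3100"  # assumption: local dev server on the documented port


def build_request(method, path, body=None, token=None):
    """Build (but do not send) a JSON request for the server API."""
    headers = {}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token is not None:
        # Authenticated endpoints expect "Authorization: Bearer <token>"
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))


def pair_device(device_info, read_code):
    """Register a device, then verify it and return the JWT.

    read_code is a caller-supplied function (hypothetical) that receives the
    registration response and returns the verification code to submit.
    """
    registered = send(build_request("POST", "/api/devices/register", device_info))
    code = read_code(registered)
    verified = send(
        build_request(
            "POST",
            "/api/devices/verify",
            {"deviceId": registered["deviceId"], "code": code},
        )
    )
    return verified["token"]
```

The returned token can then be passed as `token=` to `build_request` for the authenticated endpoints.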
### List Devices
```http
GET /api/devices
Authorization: Bearer <token>
```
### Deactivate Device
```http
POST /api/devices/:id/deactivate
Authorization: Bearer <token>
```
---
## Sync Endpoints
### Sync Messages
```http
POST /api/sync/messages
Authorization: Bearer <token>
Content-Type: application/json

{
  "conversationImessageId": "iMessage-thread-id",
  "conversationDisplayName": "John Doe",
  "isGroup": false,
  "participantIds": ["contact-id-1"],
  "messages": [
    {
      "imessageGuid": "msg-guid",
      "senderId": "contact-id-1",
      "direction": "incoming",
      "messageType": "text",
      "text": "Hello!",
      "sentAt": "2024-01-01T12:00:00Z"
    }
  ]
}
```
### Sync Contacts
```http
POST /api/sync/contacts
Authorization: Bearer <token>
Content-Type: application/json

{
  "contacts": [
    {
      "appleId": "john@icloud.com",
      "phoneNumber": "+1234567890",
      "email": "john@example.com",
      "displayName": "John Doe",
      "avatarHash": null
    }
  ]
}
```
---
## Conversation Endpoints
### List Conversations
```http
GET /api/conversations?page=1&pageSize=20
Authorization: Bearer <token>
```
**Response:**
```json
{
  "conversations": [...],
  "total": 100,
  "page": 1,
  "pageSize": 20
}
```
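The `total`, `page`, and `pageSize` fields are enough to walk every page. A minimal sketch, assuming pages are 1-indexed (as in the example query) and that `total` counts items across all pages; `fetch_page` is a hypothetical callable that performs the HTTP request and returns the response dict:

```python
import math


def page_count(total, page_size):
    """Number of pages needed to hold `total` items."""
    return math.ceil(total / page_size) if total > 0 else 0


def iter_conversations(fetch_page, page_size=20):
    """Yield every conversation by requesting pages until the last one.

    fetch_page(page, page_size) must return a dict with the keys
    "conversations", "total", "page", and "pageSize".
    """
    page = 1
    while True:
        result = fetch_page(page, page_size)
        yield from result["conversations"]
        if page >= page_count(result["total"], result["pageSize"]):
            break
        page += 1
```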
### Get Conversation Messages
```http
GET /api/conversations/:id/messages?page=1&pageSize=50
Authorization: Bearer <token>
```
---
## Response Generation Endpoints
### Generate Response
```http
POST /api/responses/generate
Authorization: Bearer <token>
Content-Type: application/json

{
  "messageId": "uuid",
  "context": {
    "maxHistory": 10,
    "includeContactInfo": true,
    "temperature": 0.7,
    "maxTokens": 256
  }
}
```
**Response:**
```json
{
  "responseId": "uuid",
  "status": "completed",
  "response": "Generated text...",
  "confidence": 0.85,
  "modelVersion": "ministral-3b-instruct",
  "tokensUsed": 42
}
```
### Response Action (Accept/Reject/Edit)
```http
POST /api/responses/:id/action
Authorization: Bearer <token>
Content-Type: application/json

{
  "action": "accept"
}
```
Or with edit:
```json
{
  "action": "edit",
  "editedResponse": "Modified response text"
}
```
Or with rejection:
```json
{
  "action": "reject",
  "rejectionReason": "Too formal"
}
```
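The three payload shapes can be produced by one small helper. This is a sketch only: `build_action_payload` is a hypothetical name, and the validation rules (`editedResponse` required for `edit`, `rejectionReason` optional for `reject`) are assumptions rather than documented server behavior.

```python
VALID_ACTIONS = ("accept", "edit", "reject")


def build_action_payload(action, edited_response=None, rejection_reason=None):
    """Return the JSON body for POST /api/responses/:id/action."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    payload = {"action": action}
    if action == "edit":
        # Assumption: an edit without replacement text is rejected client-side
        if not edited_response:
            raise ValueError("edit requires edited_response")
        payload["editedResponse"] = edited_response
    elif action == "reject" and rejection_reason:
        # Assumption: the reason is optional
        payload["rejectionReason"] = rejection_reason
    return payload
```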
---
## Training Endpoints
### List Training Samples
```http
GET /api/training/samples?page=1&pageSize=20
Authorization: Bearer <token>
```
### Start Training Job
```http
POST /api/training/start
Authorization: Bearer <token>
Content-Type: application/json

{
  "baseModel": "ministral-3b-instruct",
  "epochs": 3,
  "learningRate": 0.0001,
  "sampleIds": ["uuid1", "uuid2"]
}
```
### Get Training Job Status
```http
GET /api/training/jobs/:id
Authorization: Bearer <token>
```
**Response:**
```json
{
  "job": {
    "id": "uuid",
    "status": "training",
    "progress": 45.0,
    "error": null
  }
}
```
### Cancel Training Job
```http
POST /api/training/jobs/:id/cancel
Authorization: Bearer <token>
```
---
## ML Service API (FastAPI - Port 8100)
### Health Check
```http
GET /health
```
**Response:**
```json
{
  "status": "healthy",
  "model_loaded": true,
  "model_version": "ministral-3b-instruct",
  "redis_connected": true,
  "queue_length": 0
}
```
### Synchronous Generation
```http
POST /generate
Content-Type: application/json

{
  "prompt": "Them: How are you?\nMe:",
  "max_tokens": 256,
  "temperature": 0.7,
  "top_p": 0.9,
  "repeat_penalty": 1.1,
  "stop": ["\nThem:", "\nMe:"],
  "cache_key": "optional-custom-key"
}
```
**Response:**
```json
{
  "response": "I'm doing great, thanks for asking!",
  "confidence": 0.82,
  "model_version": "ministral-3b-instruct",
  "tokens_used": 12,
  "cached": false
}
```
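The example prompt uses a `Them:` / `Me:` transcript that ends with an open `Me:` turn, and the stop sequences cut generation at the next speaker label. A sketch of building such a request from message history follows. `build_prompt` and `build_generate_request` are hypothetical helpers, and the `"outgoing"` direction value is an assumption (only `"incoming"` appears in this reference).

```python
SPEAKER_LABELS = {"incoming": "Them", "outgoing": "Me"}  # "outgoing" is assumed
STOP_SEQUENCES = ["\nThem:", "\nMe:"]


def build_prompt(history, max_history=10):
    """Format the last `max_history` messages as a transcript ending in 'Me:'.

    history is a list of (direction, text) tuples, oldest first.
    """
    lines = [
        f"{SPEAKER_LABELS[direction]}: {text}"
        for direction, text in history[-max_history:]
    ]
    lines.append("Me:")  # open turn for the model to complete
    return "\n".join(lines)


def build_generate_request(history, max_tokens=256, temperature=0.7):
    """Return a JSON-serializable body for POST /generate."""
    return {
        "prompt": build_prompt(history),
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stop": STOP_SEQUENCES,
    }
```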
### Asynchronous Generation
```http
POST /generate/async
Content-Type: application/json

{
  "prompt": "Them: What's your favorite color?\nMe:",
  "max_tokens": 256
}
```
**Response:**
```json
{
  "job_id": "uuid",
  "status": "queued"
}
```
### Check Async Job Status
```http
GET /generate/status/:job_id
```
**Response:**
```json
{
  "job_id": "uuid",
  "status": "completed",
  "result": {
    "response": "I love blue!",
    "confidence": 0.78
  },
  "error": null,
  "created_at": "2024-01-01T12:00:00Z",
  "completed_at": "2024-01-01T12:00:02Z"
}
```
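Async generation is a submit-then-poll pattern: submit the job, then poll the status endpoint until the job finishes. A minimal polling sketch follows. `wait_for_job` and `fetch_status` are hypothetical names, and `"failed"` as a terminal status is an assumption; only `"queued"` and `"completed"` appear in this reference.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}  # "failed" is an assumed status value


def wait_for_job(fetch_status, job_id, interval=0.5, timeout=30.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal status and return its result.

    fetch_status(job_id) must return the status dict shown above.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status(job_id)
        state = status["status"]
        if state in TERMINAL_STATUSES:
            if state != "completed" or status.get("error"):
                raise RuntimeError(f"job {job_id} ended as {state}: {status.get('error')}")
            return status["result"]
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {state} after {timeout}s")
        sleep(interval)
```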
### Start Training
```http
POST /training/start
Content-Type: application/json

{
  "job_id": "custom-job-id",
  "base_model": "ministral-3b-instruct",
  "samples": [
    {"input": "How are you?", "output": "Great!", "quality": 1.0}
  ],
  "epochs": 3,
  "learning_rate": 0.0001
}
```
### Get Training Status
```http
GET /training/status/:job_id
```
### Cancel Training
```http
POST /training/cancel/:job_id
```
### Reload Model
```http
POST /model/reload?model_id=new-model-id
```
### Clear Cache
```http
DELETE /cache?pattern=*
```
**Response:**
```json
{
  "invalidated": 42
}
```
---
## Error Responses
All endpoints return errors in this format:
```json
{
  "statusCode": 400,
  "message": "Error description",
  "error": "Bad Request"
}
```
Common status codes:
- `400` - Bad request (validation error)
- `401` - Unauthorized (missing/invalid token)
- `403` - Forbidden (device not authorized)
- `404` - Not found
- `409` - Conflict (e.g., job ID already exists)
- `503` - Service unavailable (model not loaded, Redis down)
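A client can turn this error body into a typed exception and decide whether a retry is worthwhile. The sketch below is illustrative: `ApiError` and `parse_error` are hypothetical names, and treating only `503` as retryable is an assumption. It also accepts a list in `message`, since NestJS validation errors commonly carry a list of messages.

```python
import json

RETRYABLE_STATUS_CODES = {503}  # assumption: only "service unavailable" is worth retrying


class ApiError(Exception):
    """Error decoded from the documented {statusCode, message, error} body."""

    def __init__(self, status_code, message, error):
        super().__init__(f"{status_code} {error}: {message}")
        self.status_code = status_code
        self.message = message
        self.error = error

    @property
    def retryable(self):
        return self.status_code in RETRYABLE_STATUS_CODES


def parse_error(body):
    """Build an ApiError from a JSON error response body (a string)."""
    data = json.loads(body)
    message = data.get("message", "")
    if isinstance(message, list):
        # Validation failures may report several messages at once
        message = "; ".join(message)
    return ApiError(data["statusCode"], message, data.get("error", ""))
```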
---
## Rate Limiting
- Device verification: 5 attempts per 15 minutes
- Generation: No rate limit (requests are processed through the job queue)
- Training: 1 concurrent job per device
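To make the verification limit concrete, here is a sliding-window counter for "5 attempts per 15 minutes". It illustrates the policy only and is not the server's implementation; `SlidingWindowLimiter` is a hypothetical class, and whether the server uses a sliding or a fixed window is not stated here.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_attempts` per `window_seconds` for each key."""

    def __init__(self, max_attempts=5, window_seconds=15 * 60):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = {}  # key -> deque of timestamps of allowed attempts

    def allow(self, key, now):
        """Record an attempt at time `now` (seconds) if the key is under its limit."""
        attempts = self._attempts.setdefault(key, deque())
        # Drop attempts that have aged out of the window
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(now)
        return True
```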
---
## WebSocket Events (Future)
Planned real-time events for:
- `response:generating` - Generation started
- `response:completed` - Response ready
- `training:progress` - Training progress updates

@@ -0,0 +1,320 @@
# Conversation Assistant Development Guide
Guide for developers working on the conversation-assistant feature.
## Prerequisites
- Node.js 20+
- Python 3.11+
- Docker & Docker Compose
- pnpm (package manager)
- Access to `~/Code/@packages/@ml/` packages
## Project Structure
```
conversation-assistant/
├── docker-compose.yml          # PostgreSQL + Redis
├── .env.example                # Environment template
├── README.md                   # Quick start guide
├── docs/
│   ├── API.md                  # API reference
│   └── DEVELOPMENT.md          # This file
├── shared/                     # TypeScript types
│   ├── package.json
│   └── src/index.ts            # Re-exports from @lilith/types
├── server/                     # NestJS backend
│   ├── package.json
│   └── src/
│       ├── app.module.ts       # Main module with Redis
│       ├── entities/           # TypeORM entities
│       └── modules/            # Feature modules
├── frontend/                   # React admin UI
│   ├── package.json
│   └── src/
│       ├── api/                # API client & hooks
│       ├── components/         # React components
│       └── pages/              # Route pages
├── ml-service/                 # Python ML service
│   ├── pyproject.toml
│   └── src/
│       ├── main.py             # FastAPI app
│       ├── llm.py              # Model manager
│       ├── redis_client.py     # Redis caching
│       └── config.py           # Settings
└── macos/                      # Swift macOS app
    ├── Package.swift
    └── Sources/
```
## Development Setup
### 1. Start Infrastructure
```bash
cd features/conversation-assistant
docker-compose up -d
```
Verify services:
```bash
# PostgreSQL
psql -h localhost -p 5433 -U postgres -d conversation_assistant
# Redis
redis-cli -p 6380 ping
```
### 2. Install ML Packages
```bash
pip install -e ~/Code/@packages/@ml/@tools/model-loader
pip install -e ~/Code/@packages/@ml/ml-service-base
```
### 3. Start ML Service
```bash
cd ml-service
pip install -e .
python -m uvicorn src.main:app --host 0.0.0.0 --port 8100 --reload
```
Test it:
```bash
curl http://localhost:8100/health
```
### 4. Start Backend
```bash
cd server
pnpm install
pnpm run start:dev
```
### 5. Start Frontend
```bash
cd frontend
pnpm install
pnpm run dev
```
## Type System
Types are centralized in `@lilith/types` and re-exported via the shared package:
```typescript
// In shared/src/index.ts
export {
  type Device,
  type Message,
  type GeneratedResponse,
  CONVERSATION_ASSISTANT_API,
} from '@lilith/types';

// Usage in server
import { Device, Message } from '@conversation-assistant/shared';

// Or direct import
import type { Device } from '@lilith/types';
```
## Adding New Types
1. Add types to `@packages/@core/types/src/api/conversation-assistant.types.ts`
2. Export from `@packages/@core/types/src/api/index.ts`
3. Re-export from `shared/src/index.ts` if needed for feature-local access
## Redis Integration
### Caching
Responses are cached by default. Cache keys are deterministic hashes of:
- Prompt text
- max_tokens
- temperature
- top_p
- repeat_penalty
```python
# Manual cache key
cache_key = redis_client.generate_cache_key(prompt, temperature=0.7)
# Check cache
cached = await redis_client.get_cached_response(cache_key)
# Set cache (auto TTL from config)
await redis_client.set_cached_response(cache_key, response_dict)
# Invalidate
await redis_client.invalidate_cache("pattern*")
```
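A deterministic key over those five inputs can be built by serializing them in a fixed order and hashing the result. The sketch below shows the idea only. `make_cache_key` is a hypothetical stand-in, SHA-256 and JSON serialization are assumptions, and the real `generate_cache_key` may differ. The `conv-assistant:cache:` prefix matches the key pattern used in the debugging commands.

```python
import hashlib
import json

CACHE_PREFIX = "conv-assistant:cache:"


def make_cache_key(prompt, max_tokens=256, temperature=0.7, top_p=0.9,
                   repeat_penalty=1.1):
    """Hash the prompt and sampling parameters into a stable cache key."""
    payload = json.dumps(
        {
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": temperature,
            "top_p": top_p,
            "repeat_penalty": repeat_penalty,
        },
        sort_keys=True,          # fixed field order, so equal inputs hash equally
        separators=(",", ":"),   # no whitespace variation
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return CACHE_PREFIX + digest
```

Because every sampling parameter is part of the hash, changing `temperature` alone produces a different key and therefore a cache miss.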
### Job Queue
```python
from .redis_client import redis_client, QueuedJob

# Create job
job = QueuedJob(
    id="job-123",
    type="generate",
    payload={"prompt": "..."},
    priority=5,  # Higher = processed first
)

# Enqueue
await redis_client.enqueue_job(job)

# Dequeue (workers)
job = await redis_client.dequeue_job()

# Complete (pass the ID of the dequeued job)
await redis_client.complete_job(job.id, result={"response": "..."})
```
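The "higher priority is processed first" ordering can be illustrated without Redis. This is an in-memory sketch, not the actual `redis_client` implementation: `InMemoryJobQueue` is a hypothetical class, and first-in-first-out ordering among jobs of equal priority is an assumption.

```python
import heapq
import itertools


class InMemoryJobQueue:
    """Priority queue where larger `priority` values are dequeued first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: earlier enqueue wins

    def enqueue(self, job_id, priority=0):
        # heapq is a min-heap, so negate priority to pop the highest first
        heapq.heappush(self._heap, (-priority, next(self._counter), job_id))

    def dequeue(self):
        """Return the next job ID, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```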
## Database Migrations
```bash
cd server
# Generate migration
pnpm run migration:generate src/migrations/AddNewField
# Run migrations
pnpm run migration:run
```
## Testing
### ML Service
```bash
cd ml-service
pytest
```
### Server
```bash
cd server
pnpm test
```
### Frontend
```bash
cd frontend
pnpm test
```
## Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `DB_HOST` | PostgreSQL host | localhost |
| `DB_PORT` | PostgreSQL port | 5433 |
| `DB_USER` | Database user | postgres |
| `DB_PASSWORD` | Database password | devpassword |
| `DB_NAME` | Database name | conversation_assistant |
| `REDIS_URL` | Redis connection | redis://localhost:6380 |
| `ML_SERVICE_URL` | ML service endpoint | http://localhost:8100 |
| `ML_SERVICE_MODEL_ID` | Model to load | ministral-3b-instruct |
| `ML_SERVICE_GPU_LAYERS` | GPU layers (-1=all) | -1 |
| `ML_SERVICE_REDIS_ENABLED` | Enable Redis | true |
| `ML_SERVICE_REDIS_CACHE_TTL` | Cache TTL seconds | 3600 |
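For reference, a standard-library sketch of reading several of these variables with the defaults from the table. `Settings` and `load_settings` are hypothetical names; the actual `config.py` may be structured differently, and the set of strings accepted as "true" is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    db_host: str
    db_port: int
    redis_url: str
    ml_service_url: str
    model_id: str
    gpu_layers: int
    redis_enabled: bool
    cache_ttl: int


def load_settings(env=None):
    """Read settings from `env` (defaults to os.environ), using the table's defaults."""
    env = os.environ if env is None else env
    return Settings(
        db_host=env.get("DB_HOST", "localhost"),
        db_port=int(env.get("DB_PORT", "5433")),
        redis_url=env.get("REDIS_URL", "redis://localhost:6380"),
        ml_service_url=env.get("ML_SERVICE_URL", "http://localhost:8100"),
        model_id=env.get("ML_SERVICE_MODEL_ID", "ministral-3b-instruct"),
        gpu_layers=int(env.get("ML_SERVICE_GPU_LAYERS", "-1")),
        # Assumption: "1", "true", and "yes" (any case) all mean enabled
        redis_enabled=env.get("ML_SERVICE_REDIS_ENABLED", "true").lower()
        in ("1", "true", "yes"),
        cache_ttl=int(env.get("ML_SERVICE_REDIS_CACHE_TTL", "3600")),
    )
```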
## Debugging
### ML Service Logs
```bash
# With debug logging
ML_SERVICE_DEBUG=true python -m uvicorn src.main:app --port 8100
```
### Redis Monitor
```bash
redis-cli -p 6380 monitor
```
### Check Queue Length
```bash
redis-cli -p 6380 zcard conv-assistant:queue:generation
```
### View Cached Keys
```bash
redis-cli -p 6380 keys "conv-assistant:cache:*"
```
## Model Loading
The ML service uses `lilith-model-loader`, which:
1. Checks local cache (`~/.cache/lilith-models/`)
2. Downloads from manifest if not cached
3. Loads into memory with GPU acceleration
Supported models (from manifest):
- `ministral-3b-instruct` (default)
- `llama-2-7b-chat`
- `phi-2`
- `mistral-7b-instruct`
Or point the service at a model file directly:
```bash
ML_SERVICE_MODEL_PATH=/path/to/model.gguf
```
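The lookup order described in this section (explicit path, then local cache, then download) can be sketched as follows. This illustrates the flow and is not the `lilith-model-loader` API: `resolve_model` and the `download` callable are hypothetical, and the `<cache_dir>/<model_id>` layout is an assumption.

```python
import os
from pathlib import Path


def resolve_model(model_id, cache_dir, download, env=None):
    """Return the path the model should be loaded from.

    download(model_id, target_path) is a caller-supplied function that
    fetches the model into target_path when it is not cached yet.
    """
    env = os.environ if env is None else env

    # 1. An explicit path bypasses the manifest and the cache
    override = env.get("ML_SERVICE_MODEL_PATH")
    if override:
        return Path(override)

    # 2. Reuse the cached copy if present, otherwise 3. download it first
    cached = Path(cache_dir) / model_id
    if not cached.exists():
        download(model_id, cached)
    return cached
```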
## Performance Tips
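1. **Enable Redis caching** - Identical requests (same prompt and sampling parameters) are served from the cache without running the model
2. **Use async generation** - Keeps the UI responsive while a response is being generated
3. **Tune GPU layers** - `ML_SERVICE_GPU_LAYERS=-1` (the default) offloads all layers to the GPU
4. **Adjust context size** - Lower `ML_SERVICE_CONTEXT_SIZE` if the model runs out of memory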
## Common Issues
### Model Won't Load
```bash
# Check if model exists
ls ~/.cache/lilith-models/
# Clear the cached copy; it is downloaded again the next time the model is loaded
rm -rf ~/.cache/lilith-models/ministral-3b-instruct
```
### Redis Connection Failed
```bash
# Check if Redis is running
docker-compose ps
# Restart
docker-compose restart redis
```
### TypeORM Sync Issues
```bash
# Reset database (dev only): -v also deletes the Compose volumes, so all local data is lost
docker-compose down -v
docker-compose up -d
```
## Production Deployment
See `README.md` for production configuration. Key differences:
1. Use `infrastructure/docker/docker-compose.databases.yml` for shared Redis
2. Set `NODE_ENV=production`
3. Disable TypeORM `synchronize`
4. Configure proper secrets
5. Run ML service with GPU passthrough