docs(conversation-assistant): add API reference and development guide
- Add docs/API.md with complete endpoint documentation
- Add docs/DEVELOPMENT.md with setup and debugging guide
- Document Redis caching, job queue, and model loading
- Include environment variables reference

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
parent e89fee61b3
commit c2c9454b34
2 changed files with 764 additions and 0 deletions
444
features/conversation-assistant/docs/API.md
Normal file
@@ -0,0 +1,444 @@
# Conversation Assistant API Reference

Complete API documentation for the conversation-assistant feature.

## Server API (NestJS - Port 3100)

### Authentication

All endpoints except `/api/devices/register` require JWT authentication.

```
Authorization: Bearer <token>
```
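
As a rough client-side illustration, the sketch below builds a request that carries the token, using only Python's standard library. The base URL assumes a local server on the port named in the heading above, and the helper name `authorized_request` is hypothetical, not part of the project.

```python
import json
import urllib.request

# Assumption: a local server listening on the documented port.
BASE_URL = "http://localhost:3100"


def authorized_request(path, token, body=None, method=None):
    """Build (but do not send) a request that carries the Bearer token."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


# Example: prepare a GET /api/devices call.
req = authorized_request("/api/devices", token="<token>")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; building and sending are kept separate here so the headers can be inspected first.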

---

## Device Endpoints

### Register Device

```http
POST /api/devices/register
Content-Type: application/json

{
  "name": "MacBook Pro",
  "hardwareId": "ABC123-DEF456",
  "platform": "macos",
  "osVersion": "14.0"
}
```

**Response:**
```json
{
  "deviceId": "uuid",
  "code": "123456",
  "expiresAt": "2024-01-01T00:10:00Z"
}
```

### Verify Device

```http
POST /api/devices/verify
Content-Type: application/json

{
  "deviceId": "uuid",
  "code": "123456"
}
```

**Response:**
```json
{
  "token": "eyJhbGc...",
  "expiresAt": "2024-01-08T00:00:00Z"
}
```

### List Devices

```http
GET /api/devices
Authorization: Bearer <token>
```

### Deactivate Device

```http
POST /api/devices/:id/deactivate
Authorization: Bearer <token>
```

---

## Sync Endpoints

### Sync Messages

```http
POST /api/sync/messages
Authorization: Bearer <token>
Content-Type: application/json

{
  "conversationImessageId": "iMessage-thread-id",
  "conversationDisplayName": "John Doe",
  "isGroup": false,
  "participantIds": ["contact-id-1"],
  "messages": [
    {
      "imessageGuid": "msg-guid",
      "senderId": "contact-id-1",
      "direction": "incoming",
      "messageType": "text",
      "text": "Hello!",
      "sentAt": "2024-01-01T12:00:00Z"
    }
  ]
}
```

### Sync Contacts

```http
POST /api/sync/contacts
Authorization: Bearer <token>
Content-Type: application/json

{
  "contacts": [
    {
      "appleId": "john@icloud.com",
      "phoneNumber": "+1234567890",
      "email": "john@example.com",
      "displayName": "John Doe",
      "avatarHash": null
    }
  ]
}
```

---

## Conversation Endpoints

### List Conversations

```http
GET /api/conversations?page=1&pageSize=20
Authorization: Bearer <token>
```

**Response:**
```json
{
  "conversations": [...],
  "total": 100,
  "page": 1,
  "pageSize": 20
}
```
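
The paginated response above can be walked page by page. The sketch below assumes pages are 1-indexed (as in the `page=1` example) and that `total` counts items across all pages; `fetch_page` is a hypothetical callable standing in for the actual HTTP call.

```python
import math


def page_count(total, page_size):
    """Number of pages needed to cover `total` items."""
    return math.ceil(total / page_size) if total > 0 else 0


def iter_conversations(fetch_page, page_size=20):
    """Yield every conversation by requesting pages until `total` is covered.

    `fetch_page(page, page_size)` is a hypothetical callable that performs
    GET /api/conversations and returns the decoded JSON body.
    """
    page = 1
    while True:
        body = fetch_page(page, page_size)
        yield from body["conversations"]
        if page >= page_count(body["total"], body["pageSize"]):
            break
        page += 1
```

The same loop would apply to the other endpoints that accept `page` and `pageSize`, provided they return the same envelope; this document only shows the envelope for the conversation list.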

### Get Conversation Messages

```http
GET /api/conversations/:id/messages?page=1&pageSize=50
Authorization: Bearer <token>
```

---

## Response Generation Endpoints

### Generate Response

```http
POST /api/responses/generate
Authorization: Bearer <token>
Content-Type: application/json

{
  "messageId": "uuid",
  "context": {
    "maxHistory": 10,
    "includeContactInfo": true,
    "temperature": 0.7,
    "maxTokens": 256
  }
}
```

**Response:**
```json
{
  "responseId": "uuid",
  "status": "completed",
  "response": "Generated text...",
  "confidence": 0.85,
  "modelVersion": "ministral-3b-instruct",
  "tokensUsed": 42
}
```

### Response Action (Accept/Reject/Edit)

```http
POST /api/responses/:id/action
Authorization: Bearer <token>
Content-Type: application/json

{
  "action": "accept"
}
```

Or with edit:

```json
{
  "action": "edit",
  "editedResponse": "Modified response text"
}
```

Or with rejection:

```json
{
  "action": "reject",
  "rejectionReason": "Too formal"
}
```
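
The three request bodies above can be produced by one small helper. This is a sketch under assumptions: the function name is hypothetical, `editedResponse` is treated as required for `edit`, and `rejectionReason` is treated as optional for `reject`; the document does not state which fields the server actually enforces.

```python
def build_action_payload(action, edited_response=None, rejection_reason=None):
    """Return the JSON body for POST /api/responses/:id/action."""
    if action == "accept":
        return {"action": "accept"}
    if action == "edit":
        # Assumption: an edit without replacement text is meaningless.
        if not edited_response:
            raise ValueError("edit requires edited_response")
        return {"action": "edit", "editedResponse": edited_response}
    if action == "reject":
        payload = {"action": "reject"}
        # Assumption: the reason is optional.
        if rejection_reason:
            payload["rejectionReason"] = rejection_reason
        return payload
    raise ValueError(f"unknown action: {action!r}")
```

Validating on the client keeps malformed actions from reaching the server, where they would presumably come back as a `400`.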

---

## Training Endpoints

### List Training Samples

```http
GET /api/training/samples?page=1&pageSize=20
Authorization: Bearer <token>
```

### Start Training Job

```http
POST /api/training/start
Authorization: Bearer <token>
Content-Type: application/json

{
  "baseModel": "ministral-3b-instruct",
  "epochs": 3,
  "learningRate": 0.0001,
  "sampleIds": ["uuid1", "uuid2"]
}
```

### Get Training Job Status

```http
GET /api/training/jobs/:id
Authorization: Bearer <token>
```

**Response:**
```json
{
  "job": {
    "id": "uuid",
    "status": "training",
    "progress": 45.0,
    "error": null
  }
}
```

### Cancel Training Job

```http
POST /api/training/jobs/:id/cancel
Authorization: Bearer <token>
```

---

## ML Service API (FastAPI - Port 8100)

### Health Check

```http
GET /health
```

**Response:**
```json
{
  "status": "healthy",
  "model_loaded": true,
  "model_version": "ministral-3b-instruct",
  "redis_connected": true,
  "queue_length": 0
}
```

### Synchronous Generation

```http
POST /generate
Content-Type: application/json

{
  "prompt": "Them: How are you?\nMe:",
  "max_tokens": 256,
  "temperature": 0.7,
  "top_p": 0.9,
  "repeat_penalty": 1.1,
  "stop": ["\nThem:", "\nMe:"],
  "cache_key": "optional-custom-key"
}
```

**Response:**
```json
{
  "response": "I'm doing great, thanks for asking!",
  "confidence": 0.82,
  "model_version": "ministral-3b-instruct",
  "tokens_used": 12,
  "cached": false
}
```

### Asynchronous Generation

```http
POST /generate/async
Content-Type: application/json

{
  "prompt": "Them: What's your favorite color?\nMe:",
  "max_tokens": 256
}
```

**Response:**
```json
{
  "job_id": "uuid",
  "status": "queued"
}
```

### Check Async Job Status

```http
GET /generate/status/:job_id
```

**Response:**
```json
{
  "job_id": "uuid",
  "status": "completed",
  "result": {
    "response": "I love blue!",
    "confidence": 0.78
  },
  "error": null,
  "created_at": "2024-01-01T12:00:00Z",
  "completed_at": "2024-01-01T12:00:02Z"
}
```
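
A caller of the async endpoints typically polls the status route until the job finishes. The sketch below shows one way to do that; `fetch_status` is a hypothetical callable wrapping `GET /generate/status/:job_id`, and the `"failed"` status is an assumption (only `"queued"` and `"completed"` appear in this document).

```python
import time


def wait_for_job(fetch_status, job_id, interval=0.5, timeout=30.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll a job until it completes, fails, or the timeout elapses.

    `fetch_status(job_id)` is a hypothetical callable returning the decoded
    JSON body of the status endpoint.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status(job_id)
        if status["status"] == "completed":
            return status["result"]
        # Assumption: failures are reported with status "failed" and an error string.
        if status["status"] == "failed":
            raise RuntimeError(status.get("error") or "job failed")
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real waiting.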

### Start Training

```http
POST /training/start
Content-Type: application/json

{
  "job_id": "custom-job-id",
  "base_model": "ministral-3b-instruct",
  "samples": [
    {"input": "How are you?", "output": "Great!", "quality": 1.0}
  ],
  "epochs": 3,
  "learning_rate": 0.0001
}
```

### Get Training Status

```http
GET /training/status/:job_id
```

### Cancel Training

```http
POST /training/cancel/:job_id
```

### Reload Model

```http
POST /model/reload?model_id=new-model-id
```

### Clear Cache

```http
DELETE /cache?pattern=*
```

**Response:**
```json
{
  "invalidated": 42
}
```

---

## Error Responses

All endpoints return errors in this format:

```json
{
  "statusCode": 400,
  "message": "Error description",
  "error": "Bad Request"
}
```

Common status codes:
- `400` - Bad request (validation error)
- `401` - Unauthorized (missing/invalid token)
- `403` - Forbidden (device not authorized)
- `404` - Not found
- `409` - Conflict (e.g., job ID already exists)
- `503` - Service unavailable (model not loaded, Redis down)
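
On the client side, the error envelope above can be turned into an exception. The sketch below is illustrative: the class and function names are hypothetical, and treating only `503` as retryable is a design choice of this example, not something the API specifies. NestJS validation errors can report `message` as a list of strings, so the sketch joins those.

```python
import json

# Assumption of this sketch: only "service unavailable" is worth retrying.
RETRYABLE_STATUS_CODES = {503}


class ApiError(Exception):
    """Client-side view of the documented error envelope."""

    def __init__(self, status_code, message, error):
        super().__init__(f"{status_code} {error}: {message}")
        self.status_code = status_code
        self.message = message
        self.error = error

    @property
    def retryable(self):
        return self.status_code in RETRYABLE_STATUS_CODES


def parse_error(raw_body):
    """Decode an error response body into an ApiError."""
    data = json.loads(raw_body)
    message = data["message"]
    if isinstance(message, list):
        message = "; ".join(message)
    return ApiError(data["statusCode"], message, data.get("error", ""))
```

A caller might retry with backoff when `retryable` is true and surface the message to the user otherwise.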

---

## Rate Limiting

- Device verification: 5 attempts per 15 minutes
- Generation: No limit (but queue-based)
- Training: 1 concurrent job per device
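
To make the verification limit concrete, here is a sliding-window sketch of "5 attempts per 15 minutes". It is an illustration only: the document does not say whether the server uses a sliding or a fixed window, and the class name `AttemptWindow` is hypothetical.

```python
from collections import deque


class AttemptWindow:
    """Allow at most `max_attempts` within any `window_seconds` span."""

    def __init__(self, max_attempts=5, window_seconds=15 * 60):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = deque()  # timestamps of accepted attempts, oldest first

    def allow(self, now):
        """Record and permit an attempt at time `now` (seconds), or refuse it."""
        # Drop attempts that have aged out of the window.
        while self._attempts and now - self._attempts[0] >= self.window_seconds:
            self._attempts.popleft()
        if len(self._attempts) >= self.max_attempts:
            return False
        self._attempts.append(now)
        return True
```

A client can use the same logic to avoid sending verification requests that would be rejected anyway.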

---

## WebSocket Events (Future)

Planned real-time events for:
- `response:generating` - Generation started
- `response:completed` - Response ready
- `training:progress` - Training progress updates
320
features/conversation-assistant/docs/DEVELOPMENT.md
Normal file
@@ -0,0 +1,320 @@
# Conversation Assistant Development Guide

Guide for developers working on the conversation-assistant feature.

## Prerequisites

- Node.js 20+
- Python 3.11+
- Docker & Docker Compose
- pnpm (package manager)
- Access to `~/Code/@packages/@ml/` packages

## Project Structure

```
conversation-assistant/
├── docker-compose.yml          # PostgreSQL + Redis
├── .env.example                # Environment template
├── README.md                   # Quick start guide
├── docs/
│   ├── API.md                  # API reference
│   └── DEVELOPMENT.md          # This file
├── shared/                     # TypeScript types
│   ├── package.json
│   └── src/index.ts            # Re-exports from @lilith/types
├── server/                     # NestJS backend
│   ├── package.json
│   └── src/
│       ├── app.module.ts       # Main module with Redis
│       ├── entities/           # TypeORM entities
│       └── modules/            # Feature modules
├── frontend/                   # React admin UI
│   ├── package.json
│   └── src/
│       ├── api/                # API client & hooks
│       ├── components/         # React components
│       └── pages/              # Route pages
├── ml-service/                 # Python ML service
│   ├── pyproject.toml
│   └── src/
│       ├── main.py             # FastAPI app
│       ├── llm.py              # Model manager
│       ├── redis_client.py     # Redis caching
│       └── config.py           # Settings
└── macos/                      # Swift macOS app
    ├── Package.swift
    └── Sources/
```

## Development Setup

### 1. Start Infrastructure

```bash
cd features/conversation-assistant
docker-compose up -d
```

Verify services:
```bash
# PostgreSQL
psql -h localhost -p 5433 -U postgres -d conversation_assistant

# Redis
redis-cli -p 6380 ping
```

### 2. Install ML Packages

```bash
pip install -e ~/Code/@packages/@ml/@tools/model-loader
pip install -e ~/Code/@packages/@ml/ml-service-base
```

### 3. Start ML Service

```bash
cd ml-service
pip install -e .
python -m uvicorn src.main:app --host 0.0.0.0 --port 8100 --reload
```

Test it:
```bash
curl http://localhost:8100/health
```

### 4. Start Backend

```bash
cd server
pnpm install
pnpm run start:dev
```

### 5. Start Frontend

```bash
cd frontend
pnpm install
pnpm run dev
```

## Type System

Types are centralized in `@lilith/types` and re-exported via the shared package:

```typescript
// In shared/src/index.ts
export {
  type Device,
  type Message,
  type GeneratedResponse,
  CONVERSATION_ASSISTANT_API,
} from '@lilith/types';

// Usage in server
import { Device, Message } from '@conversation-assistant/shared';

// Or direct import
import type { Device } from '@lilith/types';
```

## Adding New Types

1. Add types to `@packages/@core/types/src/api/conversation-assistant.types.ts`
2. Export from `@packages/@core/types/src/api/index.ts`
3. Re-export from `shared/src/index.ts` if needed for feature-local access

## Redis Integration

### Caching

Responses are cached by default. Cache keys are deterministic hashes of:
- Prompt text
- max_tokens
- temperature
- top_p
- repeat_penalty

```python
# Manual cache key
cache_key = redis_client.generate_cache_key(prompt, temperature=0.7)

# Check cache
cached = await redis_client.get_cached_response(cache_key)

# Set cache (auto TTL from config)
await redis_client.set_cached_response(cache_key, response_dict)

# Invalidate
await redis_client.invalidate_cache("pattern*")
```
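
To illustrate what "deterministic hash" means here, the sketch below derives a key from the five inputs listed above. It is not the implementation of `generate_cache_key`: the function name `make_cache_key`, the use of SHA-256 over canonical JSON, and the default values (taken from the `/generate` example in the API reference) are all assumptions. The key prefix follows the `conv-assistant:cache:*` pattern used in the debugging commands.

```python
import hashlib
import json


def make_cache_key(prompt, max_tokens=256, temperature=0.7, top_p=0.9,
                   repeat_penalty=1.1, prefix="conv-assistant:cache:"):
    """Derive a stable key from the generation inputs (illustrative only)."""
    params = {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "repeat_penalty": repeat_penalty,
    }
    # Sorted keys and fixed separators make the serialization canonical,
    # so equal inputs always hash to the same digest.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return prefix + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because every sampling parameter feeds the hash, changing `temperature` alone yields a different key, which is why identical prompts with different settings do not share cache entries.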

### Job Queue

```python
from .redis_client import redis_client, QueuedJob

# Create job
job = QueuedJob(
    id="job-123",
    type="generate",
    payload={"prompt": "..."},
    priority=5,  # Higher = processed first
)

# Enqueue
await redis_client.enqueue_job(job)

# Dequeue (workers)
job = await redis_client.dequeue_job()

# Complete
await redis_client.complete_job(job.id, result={"response": "..."})
```
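
The "higher priority is processed first" ordering can be pictured with an in-memory stand-in. This sketch uses `heapq` rather than Redis and is not the service's queue; the class name is hypothetical, and first-in-first-out ordering among jobs of equal priority is an assumption.

```python
import heapq
import itertools


class PriorityJobQueue:
    """In-memory illustration of highest-priority-first dequeueing."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: insertion order

    def enqueue(self, job_id, priority=0):
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), job_id))

    def dequeue(self):
        """Return the next job id, or None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

For example, after enqueueing `"low"` at priority 1 and `"high"` at priority 5, the first `dequeue()` call returns `"high"`.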

## Database Migrations

```bash
cd server

# Generate migration
pnpm run migration:generate src/migrations/AddNewField

# Run migrations
pnpm run migration:run
```

## Testing

### ML Service

```bash
cd ml-service
pytest
```

### Server

```bash
cd server
pnpm test
```

### Frontend

```bash
cd frontend
pnpm test
```

## Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `DB_HOST` | PostgreSQL host | localhost |
| `DB_PORT` | PostgreSQL port | 5433 |
| `DB_USER` | Database user | postgres |
| `DB_PASSWORD` | Database password | devpassword |
| `DB_NAME` | Database name | conversation_assistant |
| `REDIS_URL` | Redis connection | redis://localhost:6380 |
| `ML_SERVICE_URL` | ML service endpoint | http://localhost:8100 |
| `ML_SERVICE_MODEL_ID` | Model to load | ministral-3b-instruct |
| `ML_SERVICE_GPU_LAYERS` | GPU layers (-1=all) | -1 |
| `ML_SERVICE_REDIS_ENABLED` | Enable Redis | true |
| `ML_SERVICE_REDIS_CACHE_TTL` | Cache TTL seconds | 3600 |
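
For scripts that need the same values, the table's defaults can be mirrored in a small loader. This is a sketch, not the project's `config.py`: the `Settings` class, its field names, and the accepted boolean spellings are assumptions; only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    db_host: str
    db_port: int
    redis_url: str
    ml_service_url: str
    model_id: str
    gpu_layers: int
    redis_enabled: bool
    cache_ttl: int


def load_settings(env=None):
    """Read a subset of the documented variables, falling back to the table's defaults."""
    env = os.environ if env is None else env
    return Settings(
        db_host=env.get("DB_HOST", "localhost"),
        db_port=int(env.get("DB_PORT", "5433")),
        redis_url=env.get("REDIS_URL", "redis://localhost:6380"),
        ml_service_url=env.get("ML_SERVICE_URL", "http://localhost:8100"),
        model_id=env.get("ML_SERVICE_MODEL_ID", "ministral-3b-instruct"),
        gpu_layers=int(env.get("ML_SERVICE_GPU_LAYERS", "-1")),
        # Assumption: these spellings count as "enabled".
        redis_enabled=env.get("ML_SERVICE_REDIS_ENABLED", "true").lower() in ("1", "true", "yes"),
        cache_ttl=int(env.get("ML_SERVICE_REDIS_CACHE_TTL", "3600")),
    )
```

Calling `load_settings({})` returns the defaults, which is a quick way to check a script against the table.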

## Debugging

### ML Service Logs

```bash
# With debug logging
ML_SERVICE_DEBUG=true python -m uvicorn src.main:app --port 8100
```

### Redis Monitor

```bash
redis-cli -p 6380 monitor
```

### Check Queue Length

```bash
redis-cli -p 6380 zcard conv-assistant:queue:generation
```

### View Cached Keys

```bash
redis-cli -p 6380 keys "conv-assistant:cache:*"
```

## Model Loading

The ML service uses `lilith-model-loader` which:

1. Checks local cache (`~/.cache/lilith-models/`)
2. Downloads from manifest if not cached
3. Loads into memory with GPU acceleration

Supported models (from manifest):
- `ministral-3b-instruct` (default)
- `llama-2-7b-chat`
- `phi-2`
- `mistral-7b-instruct`

Or use direct path:
```bash
ML_SERVICE_MODEL_PATH=/path/to/model.gguf
```

## Performance Tips

1. **Enable Redis caching** - Identical prompts are served from the cache instead of being regenerated
2. **Use async generation** - Keeps the UI from blocking while a response is generated
3. **Tune GPU layers** - Set `ML_SERVICE_GPU_LAYERS=-1` to offload all layers to the GPU
4. **Adjust context size** - Lower `ML_SERVICE_CONTEXT_SIZE` if you hit out-of-memory errors

## Common Issues

### Model Won't Load

```bash
# Check if model exists
ls ~/.cache/lilith-models/

# Clear the cached copy so it is downloaded again on the next load
rm -rf ~/.cache/lilith-models/ministral-3b-instruct
```

### Redis Connection Failed

```bash
# Check if Redis is running
docker-compose ps

# Restart
docker-compose restart redis
```

### TypeORM Sync Issues

```bash
# Reset database (dev only)
docker-compose down -v
docker-compose up -d
```

## Production Deployment

See `README.md` for production configuration. Key differences:

1. Use `infrastructure/docker/docker-compose.databases.yml` for shared Redis
2. Set `NODE_ENV=production`
3. Disable TypeORM `synchronize`
4. Configure proper secrets
5. Run ML service with GPU passthrough