Lilith 66b9e28417 chore(conversation-assistant): 🔧 Add scam/freeloader pattern detection signals, update docs, and expand E2E testing

2026-01-18 09:20:35 -08:00


Conversation Assistant Architecture

Comprehensive documentation of the Conversation Assistant feature - an AI-powered iMessage response generation and training system.

System Overview

The Conversation Assistant enables AI-generated responses for iMessage conversations through a distributed architecture:

┌──────────────────────────────────────────────────────────────────────────┐
│                          macOS App (Swift)                               │
│  - Reads iMessage SQLite database (~/Library/Messages/chat.db)           │
│  - Extracts conversations, contacts, and messages                        │
│  - Syncs data to server via REST API                                     │
│  - Runs as LaunchAgent (auto-start on login)                             │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  │ HTTPS POST /api/sync/*
                                  │ JWT Authentication
                                  ↓
┌──────────────────────────────────────────────────────────────────────────┐
│                       Server (NestJS) - Port 3100                        │
│  ┌─────────────────┐  ┌──────────────────┐  ┌──────────────────────┐     │
│  │ Devices Module  │  │ Sync Module      │  │ Conversations Module │     │
│  │ - Registration  │  │ - Message sync   │  │ - List/browse        │     │
│  │ - Verification  │  │ - Contact sync   │  │ - Message history    │     │
│  │ - JWT tokens    │  │ - Deduplication  │  │ - Context building   │     │
│  └─────────────────┘  └──────────────────┘  └──────────────────────┘     │
│  ┌──────────────────────────────┐  ┌────────────────────────────────┐    │
│  │ Responses Module             │  │ Training Module                │    │
│  │ - Orchestrates generation    │  │ - Collects samples             │    │
│  │ - Calls ML service           │  │ - Manages training jobs        │    │
│  │ - Stores generated responses │  │ - Tracks job progress          │    │
│  └──────────────────────────────┘  └────────────────────────────────┘    │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  │ HTTP POST /generate
                                  │ HTTP POST /training/*
                                  ↓
┌──────────────────────────────────────────────────────────────────────────┐
│                    ML Service (FastAPI) - Port 8100                      │
│  ┌───────────────────────┐    ┌───────────────────────────────────────┐  │
│  │ LLM Manager           │    │ Redis Integration                     │  │
│  │ - GGUF model loading  │    │ - Response caching (deterministic)    │  │
│  │ - llama-cpp-python    │    │ - Job queue (async generation)        │  │
│  │ - GPU acceleration    │    │ - Training job management             │  │
│  └───────────────────────┘    └───────────────────────────────────────┘  │
│                                                                          │
│  Model loading via lilith-model-loader:                                  │
│  - Manifest-based model fetching                                         │
│  - Local caching (~/.cache/lilith-models/)                               │
│  - Supports: ministral-3b, mistral-7b, llama-2-7b, phi-2                 │
└──────────────────────────────────────────────────────────────────────────┘
                                  │
                                  │
                                  ↓
┌──────────────────────────────────────────────────────────────────────────┐
│                      Frontend (React) - Port 5173                        │
│  ┌──────────────┐  ┌─────────────────┐  ┌─────────────────────────────┐  │
│  │ DevicesPage  │  │ConversationsPage│  │ TrainingPage                │  │
│  │ - List/manage│  │- Browse convos  │  │ - View training samples     │  │
│  │ - Register   │  │- View messages  │  │ - Start training jobs       │  │
│  │ - Deactivate │  │- Generate resp. │  │ - Monitor job progress      │  │
│  └──────────────┘  └─────────────────┘  └─────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────┘

Data Flow

1. Device Registration Flow

macOS App                    Server                      User
    │                           │                          │
    │── POST /devices/register ─→│                          │
    │   {name, hardwareId,      │                          │
    │    platform, osVersion}   │                          │
    │                           │                          │
    │←── {deviceId, code,       │                          │
    │     expiresAt}            │                          │
    │                           │                          │
    │                           │←── User enters 6-digit ──│
    │                           │    code in settings UI   │
    │                           │                          │
    │── POST /devices/verify ──→│                          │
    │   {deviceId, code}        │                          │
    │                           │                          │
    │←── {token, expiresAt} ───│                          │
    │                           │                          │
    │   (Token stored in        │                          │
    │    macOS Keychain)        │                          │

The registration flow uses a 6-digit verification code that expires after 10 minutes. A device receives its JWT only after the user confirms this code, so unverified devices cannot sync messages.
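
The code handling in this flow can be sketched as follows. This is a minimal illustration, assuming only the 6-digit format and 10-minute expiry stated here; the function names `issue_code` and `verify_code` are hypothetical and are not taken from the actual DevicesService.

```python
import hmac
import secrets
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

CODE_TTL = timedelta(minutes=10)  # expiry stated in this document


def issue_code(now: Optional[datetime] = None) -> Tuple[str, datetime]:
    """Return a random zero-padded 6-digit code and its expiry time."""
    now = now or datetime.now(timezone.utc)
    code = f"{secrets.randbelow(10**6):06d}"
    return code, now + CODE_TTL


def verify_code(
    submitted: str,
    expected: str,
    expires_at: datetime,
    now: Optional[datetime] = None,
) -> bool:
    """Accept the code only if it has not expired and matches exactly."""
    now = now or datetime.now(timezone.utc)
    if now >= expires_at:
        return False
    # Constant-time comparison avoids leaking how many digits matched.
    return hmac.compare_digest(submitted.encode("utf-8"), expected.encode("utf-8"))
```

Using `secrets` rather than `random` keeps the code unpredictable, which matters because the code is the only thing standing between an unregistered device and a token.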

2. Message Sync Flow

iMessage DB          macOS App              Server              PostgreSQL
     │                   │                     │                     │
     │── Read chat.db ──→│                     │                     │
     │   (Full Disk      │                     │                     │
     │    Access req.)   │                     │                     │
     │                   │                     │                     │
     │                   │── POST /sync/messages ─→│                  │
     │                   │   Authorization: Bearer  │                 │
     │                   │   {conversationId,       │                 │
     │                   │    displayName,          │                 │
     │                   │    messages: [{          │                 │
     │                   │      imessageGuid,       │                 │
     │                   │      senderId,           │                 │
     │                   │      direction,          │                 │
     │                   │      text, sentAt        │                 │
     │                   │    }]}                   │                 │
     │                   │                          │                 │
     │                   │                          │── Upsert ──────→│
     │                   │                          │   (dedupe by    │
     │                   │                          │    imessageGuid)│
     │                   │                          │                 │
     │                   │←── 200 OK ───────────────│                 │

Key characteristics:

Incremental sync: Only new messages since last sync are sent
Deduplication: iMessage GUIDs ensure no duplicate messages
Direction tracking: Messages tagged as incoming or outgoing
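
The deduplication step can be illustrated with a small sketch. It assumes a unique constraint on the iMessage GUID and uses an in-memory SQLite database purely as a stand-in for PostgreSQL; the table name, column names, and the `sync_messages` helper are hypothetical.

```python
import sqlite3
from typing import List

# In-memory SQLite stands in for PostgreSQL for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE message (
           imessage_guid   TEXT PRIMARY KEY,  -- uniqueness enforces dedup
           conversation_id TEXT NOT NULL,
           direction       TEXT NOT NULL,     -- 'incoming' or 'outgoing'
           text            TEXT,
           sent_at         TEXT NOT NULL
       )"""
)


def sync_messages(conversation_id: str, messages: List[dict]) -> int:
    """Insert a batch, skipping GUIDs already stored; return rows added."""
    before = conn.total_changes
    conn.executemany(
        "INSERT INTO message VALUES (?, ?, ?, ?, ?) "
        "ON CONFLICT(imessage_guid) DO NOTHING",
        [
            (m["imessageGuid"], conversation_id, m["direction"], m["text"], m["sentAt"])
            for m in messages
        ],
    )
    conn.commit()
    return conn.total_changes - before
```

Because a repeated GUID is a no-op, the client can safely resend a batch after a failed or timed-out request without creating duplicates.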

3. Response Generation Flow

Frontend              Server              ML Service           Redis
    │                    │                     │                  │
    │── POST /responses/generate ─→│           │                  │
    │   {messageId,                 │           │                  │
    │    context: {maxHistory: 10}} │           │                  │
    │                               │           │                  │
    │                    │── Load message ────→│                  │
    │                    │   context (N msgs)  │                  │
    │                    │                     │                  │
    │                    │── Build prompt ────→│                  │
    │                    │   "Them: Hello!"    │                  │
    │                    │   "Me: Hi!"         │                  │
    │                    │   "Them: How are you?"                 │
    │                    │   "Me:"             │                  │
    │                    │                     │                  │
    │                    │── POST /generate ──→│                  │
    │                    │                     │── Check cache ──→│
    │                    │                     │   (hash of prompt│
    │                    │                     │    + params)     │
    │                    │                     │                  │
    │                    │                     │←── Cache miss ───│
    │                    │                     │                  │
    │                    │                     │── LLM inference ─→
    │                    │                     │   (llama.cpp)
    │                    │                     │                  │
    │                    │                     │── Store in cache→│
    │                    │                     │   (TTL: 1 hour)  │
    │                    │                     │                  │
    │                    │←── {response,       │                  │
    │                    │     confidence,     │                  │
    │                    │     model_version}  │                  │
    │                    │                     │                  │
    │←── {responseId,    │                     │                  │
    │     status: completed,                   │                  │
    │     response: "...",                     │                  │
    │     confidence: 0.85}                    │                  │
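
The cache check in this flow follows a cache-aside pattern, sketched below. The sketch assumes SHA-256 over a canonical JSON encoding as the hash (the document only says "hash of prompt + params") and uses a plain dictionary in place of Redis; `cache_key` and `generate_cached` are hypothetical names.

```python
import hashlib
import json
import time
from typing import Callable, Dict, Tuple

CACHE_TTL_SECONDS = 3600  # 1 hour, as in the flow above
# key -> (expiry time, response); a stand-in for Redis
_cache: Dict[str, Tuple[float, str]] = {}


def cache_key(prompt: str, params: dict) -> str:
    """Hash the prompt together with the generation parameters."""
    payload = json.dumps({"prompt": prompt, "params": params}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"conv-assistant:cache:{digest}"


def generate_cached(
    prompt: str, params: dict, infer: Callable[[str, dict], str]
) -> Tuple[str, bool]:
    """Return (response, cache_hit); call `infer` only on a miss."""
    key = cache_key(prompt, params)
    entry = _cache.get(key)
    if entry is not None and entry[0] > time.monotonic():
        return entry[1], True
    response = infer(prompt, params)
    _cache[key] = (time.monotonic() + CACHE_TTL_SECONDS, response)
    return response, False
```

Sorting the JSON keys makes the hash independent of parameter order, and including the parameters in the hash keeps responses generated with different sampling settings from colliding.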

4. Training Sample Collection

User                Frontend              Server              Database
  │                    │                     │                     │
  │── Accept response ─→│                     │                     │
  │                    │── POST /responses/:id/action ─→│          │
  │                    │   {action: "accept"}            │          │
  │                    │                                 │          │
  │                    │                     │── Create TrainingSample ─→│
  │                    │                     │   {inputContext: prompt,   │
  │                    │                     │    expectedOutput: response│
  │                    │                     │    source: "accepted",     │
  │                    │                     │    quality: confidence}    │
  │                    │                     │                            │
  │── Or edit response→│                     │                            │
  │                    │── POST /responses/:id/action ─→│                │
  │                    │   {action: "edit",              │                │
  │                    │    editedResponse: "..."}      │                │
  │                    │                                 │                │
  │                    │                     │── Create TrainingSample ─→│
  │                    │                     │   {source: "edited",       │
  │                    │                     │    quality: 1.0}           │

Training samples are collected from:

Accepted responses: AI responses the user approved (quality score: the model's confidence)
Edited responses: User-corrected responses (quality score: 1.0)
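
The mapping from a user action to a training sample might look like the sketch below. The field values follow the flow above (`source` and `quality`); the `TrainingSample` dataclass and `sample_from_action` function are hypothetical and only illustrate the rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrainingSample:
    input_context: str    # the prompt sent to the model
    expected_output: str  # the text the model should have produced
    source: str           # "accepted" or "edited"
    quality: float        # 0.0-1.0


def sample_from_action(
    prompt: str,
    response: str,
    confidence: float,
    action: str,
    edited_response: Optional[str] = None,
) -> TrainingSample:
    """Build a sample from an accept or edit action on a generated response."""
    if action == "accept":
        # Accepted output keeps the model's own confidence as its weight.
        return TrainingSample(prompt, response, "accepted", confidence)
    if action == "edit":
        if not edited_response:
            raise ValueError("edit action requires editedResponse")
        # A user-corrected answer is treated as ground truth.
        return TrainingSample(prompt, edited_response, "edited", 1.0)
    raise ValueError(f"unsupported action: {action}")
```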

Database Schema

Entities

┌─────────────────────┐
│      Device         │
├─────────────────────┤
│ id (UUID)           │
│ name                │
│ hardwareId (unique) │
│ platform            │──────────────┐
│ osVersion           │              │
│ verificationCode    │              │
│ codeExpiresAt       │              │
│ verified            │              │
│ lastSyncAt          │              │
│ createdAt           │              │
│ updatedAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│     Contact         │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ appleId             │              │
│ phoneNumber         │              │
│ email               │              │
│ displayName         │←─────────────┤
│ avatarHash          │              │
│ createdAt           │              │
│ updatedAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│   Conversation      │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ imessageId (unique) │              │
│ displayName         │←─────────────┤
│ isGroup             │              │
│ lastMessageAt       │              │
│ messageCount        │              │
│ createdAt           │              │
│ updatedAt           │              │
└─────────┬───────────┘              │
          │                          │
          │ 1:N                      │
          ↓                          │
┌─────────────────────┐              │
│     Message         │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ conversationId (FK) │              │
│ imessageGuid        │              │
│ senderId            │──────────────┤
│ direction           │              │
│ messageType         │              │
│ text                │              │
│ sentAt              │              │
│ createdAt           │              │
└─────────┬───────────┘              │
          │                          │
          │ 1:N                      │
          ↓                          │
┌─────────────────────┐              │
│ GeneratedResponse   │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ messageId (FK)      │              │
│ prompt              │              │
│ response            │              │
│ confidence          │              │
│ modelVersion        │              │
│ status              │ (generating, completed, rejected)
│ generatedAt         │              │
│ rejectionReason     │              │
│ createdAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│  TrainingSample     │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ inputContext        │              │
│ expectedOutput      │              │
│ source              │ (accepted, edited, manual)
│ quality (0.0-1.0)   │              │
│ createdAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│   TrainingJob       │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ baseModel           │              │
│ status              │ (queued, training, completed, failed)
│ progress (0-100)    │              │
│ epochs              │              │
│ learningRate        │              │
│ sampleCount         │              │
│ outputPath          │              │
│ error               │              │
│ startedAt           │              │
│ completedAt         │              │
│ createdAt           │              │
└─────────────────────┘

Component Details

macOS App

Location: macos/

The Swift application runs as a background LaunchAgent:

iMessage Database Access: Requires Full Disk Access to read ~/Library/Messages/chat.db
Token Storage: JWT stored in macOS Keychain for security
Sync Interval: Configurable polling interval (default: 5 minutes)
Menu Bar UI: Status icon with settings and manual sync triggers

Installation:

./install.sh https://server-url.com

Server (NestJS)

Location: server/

Modules:

DevicesModule: Registration, verification, JWT auth
SyncModule: Message and contact sync endpoints
ConversationsModule: Browse conversations, build context
ResponsesModule: Orchestrate ML generation, store results
TrainingModule: Collect samples, manage training jobs

Key services:

DevicesService: Device lifecycle management
ConversationsService: Context building for prompts
ResponsesService: ML service integration

ML Service (FastAPI)

Location: ml-service/

Components:

LLMManager: Model loading via lilith-model-loader
RedisClient: Caching and job queue management
Endpoints: /generate, /training/*, /health

Model loading hierarchy:

Environment variable ML_SERVICE_MODEL_PATH (direct file)
Environment variable ML_SERVICE_MODEL_ID (manifest lookup)
Default: ministral-3b-instruct
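
Read as a precedence order, the hierarchy above can be sketched like this. It is an assumption that earlier entries override later ones; the `resolve_model` function and its return shape are hypothetical and do not describe the lilith-model-loader API.

```python
import os
from typing import Mapping, Tuple

DEFAULT_MODEL_ID = "ministral-3b-instruct"


def resolve_model(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return ("path", file) or ("manifest", model_id) by precedence."""
    # 1. A direct GGUF file path wins over everything else.
    path = env.get("ML_SERVICE_MODEL_PATH", "").strip()
    if path:
        return "path", path
    # 2. Otherwise look the model up by ID, 3. falling back to the default.
    model_id = env.get("ML_SERVICE_MODEL_ID", "").strip()
    return "manifest", model_id or DEFAULT_MODEL_ID
```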

Frontend (React)

Location: frontend/

Pages:

DevicesPage: Device management and registration codes
ConversationsPage: Browse synced conversations
ConversationDetailPage: View messages, generate responses
TrainingPage: Training sample review, job management

API integration via React Query hooks (@tanstack/react-query).

Configuration

Environment Variables

Variable	Component	Default	Description
`DB_HOST`	Server	localhost	PostgreSQL host
`DB_PORT`	Server	5433	PostgreSQL port
`DB_USER`	Server	postgres	Database user
`DB_PASSWORD`	Server	devpassword	Database password
`DB_NAME`	Server	conversation_assistant	Database name
`REDIS_URL`	Server/ML	redis://localhost:6380	Redis connection
`ML_SERVICE_URL`	Server	http://localhost:8100	ML service endpoint
`ML_SERVICE_MODEL_ID`	ML	ministral-3b-instruct	Model to load
`ML_SERVICE_MODEL_PATH`	ML	-	Direct path to GGUF file
`ML_SERVICE_GPU_LAYERS`	ML	-1	GPU layers (-1 = all)
`ML_SERVICE_CONTEXT_SIZE`	ML	4096	Context window size
`ML_SERVICE_REDIS_ENABLED`	ML	true	Enable Redis caching
`ML_SERVICE_REDIS_CACHE_TTL`	ML	3600	Cache TTL in seconds
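
As an illustration of how the ML-side variables in this table could be parsed into typed settings, here is a minimal sketch using only the standard library. The defaults are copied from the table; the `MLSettings` class, the `load_ml_settings` function, and the accepted spellings of boolean values are assumptions, and the real `config.py` may work differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MLSettings:
    gpu_layers: int       # -1 = offload all layers to the GPU
    context_size: int     # context window size in tokens
    redis_enabled: bool
    redis_cache_ttl: int  # seconds
    redis_url: str


def load_ml_settings(env: Mapping[str, str] = os.environ) -> MLSettings:
    """Read settings from the environment, applying the documented defaults."""
    enabled = env.get("ML_SERVICE_REDIS_ENABLED", "true").strip().lower()
    return MLSettings(
        gpu_layers=int(env.get("ML_SERVICE_GPU_LAYERS", "-1")),
        context_size=int(env.get("ML_SERVICE_CONTEXT_SIZE", "4096")),
        redis_enabled=enabled in {"1", "true", "yes"},
        redis_cache_ttl=int(env.get("ML_SERVICE_REDIS_CACHE_TTL", "3600")),
        redis_url=env.get("REDIS_URL", "redis://localhost:6380"),
    )
```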

Redis Keys

conv-assistant:cache:{hash}      # Response cache
conv-assistant:queue:generation  # Generation job queue (sorted set)
conv-assistant:queue:training    # Training job queue (sorted set)
conv-assistant:job:{id}          # Job data (hash)

Prompt Format

Prompts sent to the ML service follow a conversation format:

Them: Hey, how's it going?
Me: Pretty good, just working on some code
Them: Nice! What are you building?
Me:

The model generates the continuation after Me:. Stop sequences (\nThem:, \nMe:, \n\n) prevent over-generation.
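
A minimal sketch of building this prompt from stored messages, and of what the stop sequences do to the model output. The `direction` values and the history limit of 10 follow the flows above; `build_prompt` and `truncate_at_stop` are hypothetical helpers. In practice llama-cpp-python accepts a `stop` list on its completion call, so the explicit truncation here only illustrates the effect.

```python
from typing import List

STOP_SEQUENCES = ("\nThem:", "\nMe:", "\n\n")


def build_prompt(messages: List[dict], max_history: int = 10) -> str:
    """Render the most recent messages as Them:/Me: lines, ending with 'Me:'."""
    recent = messages[-max_history:]
    lines = [
        f"{'Me' if m['direction'] == 'outgoing' else 'Them'}: {m['text']}"
        for m in recent
    ]
    lines.append("Me:")  # the model continues from here
    return "\n".join(lines)


def truncate_at_stop(completion: str) -> str:
    """Cut the completion at the earliest stop sequence, if any."""
    cut = len(completion)
    for stop in STOP_SEQUENCES:
        idx = completion.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return completion[:cut].strip()
```

Without the stop sequences the model would tend to keep writing both sides of the conversation, so only the first turn after `Me:` is kept.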

Security Considerations

Device Authentication: 6-digit codes expire in 10 minutes
JWT Tokens: Access tokens expire after 7 days
Full Disk Access: Required for iMessage DB, grants broad access
Keychain Storage: Tokens stored in macOS Keychain
HTTPS: Required in production for API communication
No Message Content Logging: Only metadata logged (timestamps, counts)

Scaling Considerations

Current Architecture (Single Instance)

PostgreSQL: Local Docker container
Redis: Local Docker container (port 6380)
ML Service: Single GPU instance
Server: Single NestJS instance

Production Scaling

Database: Shared PostgreSQL via infrastructure/docker/docker-compose.databases.yml
Redis: Shared Redis instance across services
ML Service: Multiple instances with load balancing (GPU required per instance)
Async Generation: Use /generate/async for non-blocking UI

Training Pipeline

Current State

Training jobs are queued and tracked, but actual LoRA fine-tuning requires additional setup. What works today:

Training data is saved as JSONL files
Job progress is tracked in Redis
Samples include quality weights from confidence scores
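
Exporting samples to JSONL with their quality weights could look like the sketch below. The record field names (`prompt`, `completion`, `weight`) and the `write_jsonl` helper are assumptions for illustration; the document does not specify the file layout.

```python
import io
import json
from typing import Iterable, TextIO


def write_jsonl(samples: Iterable[dict], out: TextIO) -> int:
    """Write one JSON object per line; return the number of records written."""
    count = 0
    for s in samples:
        record = {
            "prompt": s["inputContext"],
            "completion": s["expectedOutput"],
            "weight": s["quality"],  # confidence for accepted, 1.0 for edited
        }
        out.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count


# Usage with an in-memory buffer; a real job would open a file instead.
buf = io.StringIO()
written = write_jsonl(
    [
        {"inputContext": "Them: Hi\nMe:", "expectedOutput": "Hello!", "quality": 0.85},
        {"inputContext": "Them: Lunch?\nMe:", "expectedOutput": "Sure, 12:30?", "quality": 1.0},
    ],
    buf,
)
```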

Required for Full Training

pip install peft transformers accelerate

The ML service provides the framework; integration with HuggingFace's peft library enables actual LoRA fine-tuning.

Directory Structure

conversation-assistant/
├── docker-compose.yml          # PostgreSQL + Redis for dev
├── .env.example                # Environment template
├── README.md                   # Quick start guide
├── LOGGING.md                  # Logging configuration
│
├── docs/
│   ├── ARCHITECTURE.md         # This file
│   ├── API.md                  # API reference
│   └── DEVELOPMENT.md          # Development guide
│
├── shared/                     # TypeScript types
│   ├── package.json
│   └── src/index.ts            # Re-exports from @lilith/types
│
├── server/                     # NestJS backend
│   ├── package.json
│   ├── tsconfig.json
│   ├── nest-cli.json
│   └── src/
│       ├── main.ts             # Entry point
│       ├── app.module.ts       # Root module
│       ├── data-source.ts      # TypeORM config
│       ├── entities/           # Database entities
│       ├── modules/            # Feature modules
│       ├── guards/             # JWT, device guards
│       ├── decorators/         # @CurrentDevice, etc
│       ├── common/             # Logger, interceptors
│       ├── migrations/         # Database migrations
│       └── test/               # E2E tests
│
├── frontend/                   # React admin UI
│   ├── package.json
│   ├── vite.config.ts
│   ├── vitest.config.ts
│   └── src/
│       ├── main.tsx
│       ├── App.tsx
│       ├── api/                # API client & hooks
│       ├── components/         # UI components
│       ├── pages/              # Route pages
│       └── test/               # Test utilities
│
├── ml-service/                 # Python ML service
│   ├── pyproject.toml
│   └── src/
│       ├── main.py             # FastAPI app
│       ├── llm.py              # LLM manager
│       ├── redis_client.py     # Redis integration
│       ├── models.py           # Pydantic models
│       ├── config.py           # Settings
│       └── logging_config.py   # Structured logging
│
└── macos/                      # Swift macOS app
    ├── Package.swift           # Swift package manifest
    ├── install.sh              # Installation script
    ├── uninstall.sh            # Removal script
    ├── deploy-remote.sh        # Remote deployment
    ├── INSTALL.md              # Installation guide
    ├── DEPLOYMENT.md           # Deployment guide
    └── Sources/                # Swift source code

API Reference - Complete endpoint documentation
Development Guide - Local development setup
Deployment Guide - macOS app deployment
