How the Conversation Assistant Works
A non-technical explanation of the Conversation Assistant feature.
What It Does
The Conversation Assistant is an AI-powered system that helps generate response suggestions for iMessage conversations. It works by:
- Capturing your iMessage conversations from your Mac
- Analyzing the conversation context (recent messages)
- Generating contextually appropriate response suggestions
- Learning from your feedback to improve over time
The Four Components
1. macOS Agent (runs on your Mac)
A small application that runs in your Mac's menu bar. It:
- Reads your iMessage history (requires permission)
- Securely syncs conversations to the server
- Runs automatically when you log in
- Shows status via a menu bar icon
Privacy: The agent only reads messages; it cannot send messages on your behalf.
2. Backend Server
The central hub that:
- Stores synced conversations securely
- Manages device registration
- Coordinates with the AI for response generation
- Tracks your feedback for training
3. AI Service (ML Service)
A local AI model that:
- Reads conversation context
- Generates response suggestions
- Caches responses for speed
- Runs entirely on your hardware (no cloud AI)
Model: Uses small, efficient language models (3-7B parameters) optimized for conversation.
4. Web Dashboard
A browser interface where you can:
- Browse synced conversations
- Generate response suggestions
- Accept, edit, or reject suggestions
- Manage connected devices
- View training progress
How Response Generation Works
1. You receive a message: "Hey, how are you doing?"
2. You click "Generate Response" in the dashboard
3. The system builds context from recent messages:
- Them: "Hey!"
- You: "Hi!"
- Them: "Long time no see"
- You: "Yeah, been super busy"
- Them: "Hey, how are you doing?" <-- newest
4. The AI generates a contextual response:
"Doing pretty well! Finally getting some breathing room. How about you?"
5. You can:
- Accept it (used as training data)
- Edit it (edited version becomes training data)
- Reject it (feedback helps improve)
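For readers who want to see step 3 in concrete terms, here is a minimal sketch of how recent messages could be turned into a prompt for the model. The `Message` class, the `build_context` function, the five-message window, and the trailing "You:" line are all assumptions made for illustration; the actual service may format its context differently.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str  # "You" or "Them", matching the labels in the example above
    text: str


def build_context(messages: list[Message], max_messages: int = 5) -> str:
    """Format the most recent messages, oldest first, as a short transcript."""
    # Guard against max_messages <= 0, because messages[-0:] would return everything.
    recent = messages[-max_messages:] if max_messages > 0 else []
    lines = [f"{m.sender}: {m.text}" for m in recent]
    # Hypothetical convention: end with an open "You:" turn for the model to complete.
    lines.append("You:")
    return "\n".join(lines)


history = [
    Message("Them", "Hey!"),
    Message("You", "Hi!"),
    Message("Them", "Long time no see"),
    Message("You", "Yeah, been super busy"),
    Message("Them", "Hey, how are you doing?"),
]
prompt = build_context(history)
print(prompt)
```

Limiting the window to the newest messages keeps the prompt short, which matters for the small models this feature targets.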
How Learning Works
The system improves over time by learning from your choices:
Training Samples
| Your Action | Training Impact |
|---|---|
| Accept | High-quality sample (uses AI's confidence score) |
| Edit | Highest-quality sample (your corrections are gold) |
| Reject | Negative signal (helps avoid similar outputs) |
The Training Loop
- You interact with response suggestions
- Accepted/edited responses become training samples
- Training jobs fine-tune the model on your style
- Future responses better match your voice
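The mapping from your action to a training sample can be sketched roughly as follows. This is an illustration only: the field names, the `to_training_sample` function, and the specific weights are assumptions. The table above only says that edits rank highest, that accepted suggestions use the AI's confidence score, and that rejections act as a negative signal.

```python
import json
from typing import Optional


def to_training_sample(
    context: str,
    suggestion: str,
    action: str,
    confidence: float,
    edited_text: Optional[str] = None,
) -> dict:
    """Turn one piece of feedback into a training record (hypothetical schema)."""
    if action == "accept":
        # Accepted suggestions are weighted by the model's own confidence (0.0 to 1.0).
        return {"context": context, "response": suggestion,
                "label": "positive", "weight": confidence}
    if action == "edit":
        if not edited_text:
            raise ValueError("an edit needs the edited text")
        # The user's correction replaces the suggestion and gets full weight.
        return {"context": context, "response": edited_text,
                "label": "positive", "weight": 1.0}
    if action == "reject":
        # Rejected suggestions are kept as examples of what to avoid.
        return {"context": context, "response": suggestion,
                "label": "negative", "weight": 1.0}
    raise ValueError(f"unknown action: {action!r}")


sample = to_training_sample(
    "Them: Hey, how are you doing?", "Doing pretty well!", "edit",
    confidence=0.8, edited_text="Doing great, you?",
)
# One JSON object per line is the JSONL layout.
print(json.dumps(sample))
```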
Device Registration Flow
For security, each Mac must be registered:
1. Install the macOS agent
└── App generates unique device ID
2. Register device
└── Server returns a 6-digit code (valid 10 min)
3. Enter code in dashboard
└── Proves you control both the Mac and dashboard
4. Device verified
└── Secure token stored in Mac's Keychain
└── Sync begins automatically
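As a rough sketch of step 2 and step 3, a server could issue and check a 6-digit code along these lines. The function names and the in-memory handling are hypothetical; only the 6-digit length and the 10-minute validity come from the flow above.

```python
import secrets
import time
from typing import Optional

CODE_TTL_SECONDS = 10 * 60  # codes are valid for 10 minutes


def issue_code(now: Optional[float] = None) -> tuple[str, float]:
    """Return a random 6-digit code and the timestamp at which it expires."""
    issued_at = time.time() if now is None else now
    # secrets (not random) is used because the code guards device registration.
    code = f"{secrets.randbelow(10**6):06d}"
    return code, issued_at + CODE_TTL_SECONDS


def verify_code(submitted: str, expected: str, expires_at: float,
                now: Optional[float] = None) -> bool:
    """Accept the code only if it matches and has not expired."""
    current = time.time() if now is None else now
    if current > expires_at:
        return False
    # Constant-time comparison; encoding to bytes avoids errors on non-ASCII input.
    return secrets.compare_digest(submitted.encode(), expected.encode())


code, expires_at = issue_code()
print(verify_code(code, code, expires_at))
```

A real deployment would also want to limit the number of attempts per device, since a 6-digit code has only one million possible values.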
Data Flow Summary
Your Mac                   Server                     Dashboard
   │                          │                          │
   │── Messages sync ────────→│                          │
   │   (encrypted)            │                          │
   │                          │                          │
   │                          │←── Browse conversations ─│
   │                          │                          │
   │                          │←── Request suggestion ───│
   │                          │                          │
   │                          │── Generate with AI ─────→│
   │                          │                          │
   │                          │←── Accept/Edit/Reject ───│
   │                          │                          │
   │                          │── Store training data    │
Privacy and Security
What's Stored
| Data | Where | Encrypted |
|---|---|---|
| Messages | Server database | At rest |
| Auth tokens | Mac Keychain | Yes |
| AI model | Local ML service | N/A |
| Training data | Server + local JSONL | At rest |
What's NOT Stored
- Apple ID credentials
- iCloud passwords
- Message attachments (currently text only)
- Biometric data
Security Features
- Code-based verification: 6-digit codes prevent unauthorized device registration
- Short-lived tokens: Access tokens expire in 7 days
- Local AI: The model runs on your own infrastructure, not in the cloud
- HTTPS required: All production traffic is encrypted
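To make the token lifetime concrete, an expiry check could look like the sketch below. The function name and the idea of storing an issue timestamp alongside the token are assumptions; only the 7-day lifetime comes from the list above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

TOKEN_LIFETIME = timedelta(days=7)  # access tokens expire after 7 days


def is_token_valid(issued_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True while the token is younger than its 7-day lifetime."""
    current = now if now is not None else datetime.now(timezone.utc)
    return current < issued_at + TOKEN_LIFETIME


issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(is_token_valid(issued, now=datetime(2025, 1, 5, tzinfo=timezone.utc)))
print(is_token_valid(issued, now=datetime(2025, 1, 9, tzinfo=timezone.utc)))
```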
Common Questions
Q: Can the AI send messages for me?
A: No. The system only generates suggestions. You must manually copy and send any response.
Q: Does it work without internet?
A: The macOS agent needs internet access to sync. The AI service runs locally, but the server component still requires network access.
Q: What models are supported?
A: Currently optimized for small instruction-tuned models:
- Ministral 3B (default, fastest)
- Mistral 7B
- LLaMA 2 7B Chat
- Phi-2
Q: How accurate are the suggestions?
A: It depends on the conversation context and the model. The confidence score (0-100%) indicates how certain the model is about a suggestion. Editing responses helps the system learn your style.
Q: Can I use this for group chats?
A: Yes, group conversations are synced and can have responses generated, though results may vary with multiple participants.
Q: Where is data stored?
A:
- Messages: PostgreSQL database (self-hosted)
- Cache: Redis (self-hosted)
- Models: Local file system (~/.cache/lilith-models/)
Getting Started
- Install the macOS agent on your Mac
- Register the device using the 6-digit code
- Grant Full Disk Access when prompted (required for iMessage access)
- Wait for initial sync (may take a few minutes for large message histories)
- Open the dashboard to browse conversations and generate responses
See the Development Guide for technical setup instructions.