12 KiB
provider-ai-interaction-modes — how the provider reaches the AI
Phase: P0 (all four modes) Org scope (per W §W4, forward-compat for P5+): personal
Four distinct modes. The AI's state persists across all of them. The provider does not switch modes explicitly — the mode is determined by how an interaction is initiated.
Mode 1 — Conversational (iOS primary, web secondary)
Trigger surface: Provider opens CocotteAI (iOS @features/ai-copilot/ios-fe) or ai.cocotte.maison (web-fe) and types or speaks.
Channel: Text chat or voice (STT via @model-boss → response → TTS via @model-boss speech-synthesis path).
Latency expectation: First token ≤1s; streaming complete ≤8s for typical conversational turn. Voice round-trip (STT → inference → TTS → audio) ≤4s for short inputs.
Action confirmation policy: Per action stakes classification (see 00-system-visual-system.md §F1). Low-stakes actions execute immediately with an inline receipt. Medium-stakes surface an approval card. High-stakes surface a high-stakes-interrupt card and require explicit confirmation (never auto). The provider can escalate any card to high-stakes manually.
Escalation path: When the AI has insufficient context to proceed (ambiguous target entity, conflicting standing instructions, provider has not approved a required prior step), it asks a clarifying question in the chat stream rather than guessing. Never silent assumption.
State persistence: Conversation history is stored per device thread in platform.api. The current thread is included in context assembly on every turn (see provider-ai-context.contract.md). Cross-device pickup is handled via cross-device-handoff.flow.md — a banner in the chat surface indicates recent activity from another device.
Voice details:
- Push-to-talk: hold mic button; release sends.
- Hands-free long-press: short-window listening for yes/no/edit intents (brief A §5).
- TTS delivery: response audio plays inline in chat; transcript displayed simultaneously (provider can read if audio is muted).
- STT + TTS both route to
@model-bosson apricot via HTTP (never on-device weights).
Multimodal input: Photo drop into the chat input bar triggers the asset-onboarding path (content-specialist variant production). Share-sheet from Photos or Files supported at P0.
Mode 2 — Proactive (provider hasn't asked; AI surfaces)
Trigger surface: AI-initiated push notification, iMessage fallback, or daily digest card in chat. Provider does not need to have the app open.
Trigger sources:
| Source | What fires it | Delivery |
|---|---|---|
| Daily digest | Scheduled via scheduler-worker; nightly summary of specialist activity + next-day calendar |
iOS push (low stakes) → chat card on tap (see I §1) |
| Urgent-attention alert | scheduler-worker or real-time event crosses an urgency threshold |
iOS push (medium/high stakes per C-notifications.brief.md) |
| Deposit cleared | platform-finance receives confirmation webhook |
iOS push → chat card |
| Prospect replied | engagement-ingestor receives inbound from any surface |
iOS push → triage card in chat |
| Hotel price change | tour-scout worker detects rate delta on a saved preference |
iOS push → chat card |
| Prospect-ghosting signal | prospect-resolver drops warmth score below threshold |
iOS push → chat card (medium stakes) |
| Autonomous-rule receipt (see Mode 3) | Standing instruction fired; receipt for awareness | Low-stakes push → chat receipt card |
Latency expectation: Real-time events (inbound message, deposit) ≤30s to push delivery. Scheduled digest: time-of-day configurable (default 08:00 provider local time). Hotel/ghost signals: hourly scheduler-worker tick; ≤60 min from event to notification.
Action confirmation policy: Proactive surfaces always require provider action before any write. The notification opens chat at the relevant card; the provider confirms or dismisses. The AI never auto-writes in response to a proactive trigger.
Escalation path: If the provider does not respond to a high-stakes proactive card within a configurable window (default 4h), the card is re-surfaced with an escalation note. If the window expires for a time-sensitive action (e.g., a prospect follow-up whose window closes), the card is re-surfaced with a "window closing" badge and the action is marked deferred, not auto-taken.
iMessage fallback: When the provider has @mac-sync connected and notifier cannot deliver via APNs (e.g., provider's phone is offline), high-stakes alerts fall back to iMessage via @mac-sync's send-queue. Low-stakes alerts do not fall back; they wait for the next app open.
State persistence: Proactive events are recorded in agent_actions with status=surfaced on emission. If the provider approves, status→completed. If dismissed, status→dismissed. Both are replayable in the audit drawer (brief I).
Mode 3 — Autonomous (provider has approved standing instructions)
Trigger surface: No provider interaction at point of execution. The provider has previously approved a standing rule that permits the AI to act without per-action confirmation.
Standing instruction examples:
- "Auto-reply to warm prospects who message before I do with the onboarding template."
- "Auto-schedule Tryst bumps every 4 hours between 08:00 and 22:00."
- "If a prospect has been warm for 72 hours without reply, send the follow-up template."
- "Publish any content-onlyfans draft with confidence ≥0.85 that has passed the banned-phrase filter."
How standing instructions are created: The provider either dictates them conversationally (Mode 1) and the AI proposes a rule structure for approval, or the AI proposes a rule unprompted after observing a repeated manual pattern (e.g., "You've done the same follow-up cadence 6 times — want me to automate it?"). No rule is written without explicit provider approval on a high-stakes confirmation card.
Execution: scheduler-worker evaluates active standing instructions on each tick (period per rule). When conditions are met, the relevant specialist action is invoked directly without surfacing an approval card. The action writes an agent_actions row with trigger=autonomous_rule and approved_rule_id.
Receipts: Each autonomous action surfaces a low-stakes notification ("Sent follow-up to @felix · via iMessage · 3 min ago — standing rule #4"). The receipt card in chat allows the provider to counter-act (see I §4) or modify the rule.
Latency expectation: scheduler-worker tick interval per rule (minimum 1 min for inbox reply, 4h for bump cycles, 24h for outreach follow-ups). The action itself executes in ≤10s from trigger.
Action confirmation policy: Autonomous mode is specifically the exception to normal confirmation flow. The confirmation happened when the rule was created. Any autonomous action that fails the action's own pre-execution validation (e.g., prospect is now blocked, surface is in vacation mode) is halted and surfaced as a proactive alert instead.
Escalation path: If an autonomous rule fires but the executing specialist returns a low-confidence result or a validation failure, the action is not executed; it is escalated to the provider as a Mode 2 proactive alert. The rule remains active; this instance is marked exception.
Kill-switch: The provider can suspend all autonomous rules instantly ("pause everything") via a single conversation turn. Per brief K §K5. Individual rules can be paused, modified, or deleted conversationally.
State persistence: Standing instructions stored in scheduled_rules in platform.db. All autonomous action outcomes are audit-replayable.
Mode 4 — Glue-mode (Claude Code as developer interface)
Trigger surface: The provider is iterating on the platform itself (or a developer is) and drives @ai MCP tools directly via Claude Code, bypassing the iOS/web chat surface entirely.
When this is appropriate: Rare. Development and debugging of specialist behavior, action skill iteration, context provider tuning, or MCP surface exploration. Not the daily-driver pattern — this is the power-developer path.
How it works: Claude Code connects to ai-copilot:3791 (or another specialist instance) via the MCP protocol and invokes tools directly. The @ai/ai-core substrate exposes MCP endpoints for skill invocation, context inspection, and audit log access. Actions taken this way still write agent_actions rows with trigger=mcp_direct.
Latency expectation: Same as Mode 1 AI inference path; no additional overhead from bypassing the iOS/web layer.
Action confirmation policy: MCP direct invocations skip the iOS approval-card UI but still evaluate the action's own pre-execution guards (tenant scope, PII boundary, stakes flags). High-stakes actions require the caller to pass confirm: true in the MCP call or they return a requires_confirmation response without executing.
Escalation path: N/A — there is no human in the loop for MCP direct calls during development. The developer is the loop.
State persistence: Same as all other modes — platform.api is authoritative, agent_actions is the audit spine. Glue-mode calls are distinguishable in the audit by trigger=mcp_direct.
Caution: Actions taken in glue-mode against the production platform.db are real and irreversible. There is no sandbox DB (see CLAUDE.md — conventions). Test DBs are ephemeral Docker containers per test run.
State persistence across modes
All four modes write to and read from the same authoritative state:
| Layer | Technology | Shared across modes |
|---|---|---|
| Authoritative data | platform.api:3060 → platform.db:25437 on black |
Yes — single source of truth |
| AI conversation history | platform.api → chat_threads table |
Per device thread (cross-device: see handoff flow) |
| Approval cards (pending) | agent_actions where status=pending |
Yes — visible in any mode |
| Autonomous rules | scheduled_rules table |
Yes — created in Mode 1, executed in Mode 3, auditable in Mode 1/2 |
| Context cache | Redis (vps-0), invalidated via pub/sub | Shared; next turn in any mode assembles fresh after invalidation |
Open questions
- Voice trigger word (Mode 1): push-to-talk vs always-on listen vs wake-word. Open in OPEN-DECISIONS.md → A-Q1.
- Standing instruction editor UI (Mode 3): conversational creation is defined here; a dedicated screen for browsing + editing all active rules is not yet designed. Related: H-recurring-chores.brief.md §H1.
- Glue-mode MCP schema: the full MCP tool list exposed by
ai-copilot:3791is not yet enumerated. Belongs in an_engineering-ai-mcp-surface.mdcompanion doc.
Related
- provider-ai-entrypoint.brief.md — north star; why these modes collapse v2's split surfaces.
- provider-ai-context.contract.md — context assembled for Mode 1 turns.
- provider-ai-actions.contract.md — full action surface available across all modes.
- A-chat-surface.brief.md — Mode 1 iOS interaction design.
- C-notifications.brief.md — Mode 2 push delivery + stakes-aware routing.
- H-recurring-chores.brief.md — Mode 3 standing instructions (bumps, vacation, multi-surface policies).
- cross-device-handoff.flow.md — state continuity when provider switches between Mode 1 devices.
- I-audit-trust-replay.brief.md — audit trail for all modes; counter-action entry point.
- K-safety-blocklist.brief.md §K5 — kill-switch for Mode 3.