cocottetech/@platform/codebase/@features/ai-copilot/docs/A-chat-surface.brief.md

# A — Chat surface (primary)

## Goal
The chat with `ai-copilot` is Quinn's daily driver in **CocotteAI** (iOS app, consumer-facing name). She opens the app → sees what needs her attention → approves, asks, gets nudged. Conversation IS navigation; visual surfaces are reachable from chat (see brief B), not the default.

## Designer skim

- **Headline UX**: Open app → see attention-cards inline in the chat stream → swipe-right approve / tap-edit / swipe-left reject. Voice push-to-talk + photo drop are first-class input.
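- **States (10, plus sub-states 2b and 2c)**: first-run, mid-conversation (with sub-states 2b multi-message turn assembling and 2c interrupted streaming reply), awaiting approval, voice push-to-talk, voice hands-free, streaming reply, post-decision, specialist mention, error/blocked publish, offline.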
- **Pair-with**: [`chat-home.screen.md`](./chat-home.screen.md), [`approval-card.screen.md`](./approval-card.screen.md), [`day-in-life.flow.md`](./day-in-life.flow.md).
- **Blocking Qs**: see [OPEN-DECISIONS.md](./OPEN-DECISIONS.md) → A-Q1 voice-trigger, A-Q2 card-count collapse, A-Q3 streaming-reply renderer.

## Constraints
- iOS 17+, SwiftUI, Swift 5.9+; built from `~/Code/@applications/lilith-messenger-ios` foundations.
- Companion-led IA: **no bottom tabs**. Top-bar reveals specialist threads + settings overlay.
- Voice-capable from P0 (push-to-talk; long-press hands-free for short answers).
- Multimodal input: drag a photo into the input bar → drops as a chat message that triggers the variant-producer flow.
- Streaming responses (token-by-token) for ai-copilot replies.
- Offline-tolerant: reuse `lilith-messenger-ios/Core/Persistence/{SyncEngine,MessageStore,ThreadCache}.swift` adapted to V3 schema.
- Optimistic UI on approvals: card animates out immediately; mutation queues via SyncEngine; retry on reconnect.
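
The optimistic-approval constraint can be sketched as two pure state transitions. This is a minimal illustration, not the `SyncEngine` API: the names `ApprovalQueueState`, `decide`, and `acknowledge` are hypothetical, only approve and reject are modeled, and transport is left out entirely.

```typescript
// Sketch only: every name here is hypothetical, not part of SyncEngine.
type Decision = "approve" | "reject";

interface QueuedMutation {
  cardId: string;
  decision: Decision;
}

interface ApprovalQueueState {
  visibleCardIds: string[];   // cards still rendered in the chat stream
  pending: QueuedMutation[];  // decisions not yet acknowledged by the server
}

// Optimistic step: the card leaves the UI immediately and the mutation is queued.
function decide(
  state: ApprovalQueueState,
  cardId: string,
  decision: Decision
): ApprovalQueueState {
  if (state.visibleCardIds.indexOf(cardId) === -1) return state; // already decided
  return {
    visibleCardIds: state.visibleCardIds.filter((id) => id !== cardId),
    pending: [...state.pending, { cardId, decision }],
  };
}

// Reconnect step: drop acknowledged mutations, keep the rest for the next retry.
function acknowledge(
  state: ApprovalQueueState,
  ackedCardIds: string[]
): ApprovalQueueState {
  return {
    visibleCardIds: state.visibleCardIds,
    pending: state.pending.filter((m) => ackedCardIds.indexOf(m.cardId) === -1),
  };
}
```

Keeping both steps pure means the offline indicator can be derived from `pending.length` without touching the network layer.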

## Inputs
- `POST /api/v1/chat` on `ai-copilot:3791` returns
  `{ reply: string, cards: ApprovalCard[], specialist: string }`.
- `GET /api/v1/chat/pending-approvals/:user_id` returns `ApprovalCard[]`.
- `ApprovalCard` shape:
  ```ts
  interface ApprovalCard {
    card_id: string;
    kind: 'content_post' | 'content_plan' | 'engagement_event';
    ref_id: string;
    surface: SurfaceKind;       // 'onlyfans' | 'x' | 'instagram' | ...
    scheduled_for: string | null;
    stakes: 'low' | 'medium' | 'high';
    confidence: number;          // 0..1
    title: string;
    body_preview: string;
  }
  ```
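
Because cards arrive over the wire, a client may want to check each payload against the shape above before rendering it. The guard below is a sketch under assumptions: `isApprovalCard` and `isOneOf` are hypothetical helpers, and `surface` is checked only as a string because the full `SurfaceKind` union is not listed in this brief.

```typescript
type CardKind = "content_post" | "content_plan" | "engagement_event";
type Stakes = "low" | "medium" | "high";

interface ApprovalCard {
  card_id: string;
  kind: CardKind;
  ref_id: string;
  surface: string; // SurfaceKind in the brief; narrowed to string here (assumption)
  scheduled_for: string | null;
  stakes: Stakes;
  confidence: number; // 0..1
  title: string;
  body_preview: string;
}

const KINDS: readonly string[] = ["content_post", "content_plan", "engagement_event"];
const STAKES: readonly string[] = ["low", "medium", "high"];

// True when value is a string contained in the allowed options.
function isOneOf(options: readonly string[], value: unknown): boolean {
  return typeof value === "string" && options.indexOf(value) !== -1;
}

// Runtime check mirroring the ApprovalCard fields one by one.
function isApprovalCard(value: unknown): value is ApprovalCard {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const confidence = v["confidence"];
  const scheduledFor = v["scheduled_for"];
  return (
    typeof v["card_id"] === "string" &&
    isOneOf(KINDS, v["kind"]) &&
    typeof v["ref_id"] === "string" &&
    typeof v["surface"] === "string" &&
    (scheduledFor === null || typeof scheduledFor === "string") &&
    isOneOf(STAKES, v["stakes"]) &&
    typeof confidence === "number" &&
    confidence >= 0 &&
    confidence <= 1 &&
    typeof v["title"] === "string" &&
    typeof v["body_preview"] === "string"
  );
}
```

A response from `GET /api/v1/chat/pending-approvals/:user_id` could then be filtered with `payload.filter(isApprovalCard)` so a malformed card never reaches the chat stream.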

## States to design
1. **First-run / empty** — no plans, no engagement, no approvals. Suggest "tell me about yourself" → persona-seed flow (brief D).
2. **Mid-conversation** — Quinn typing; companion replied; no cards.
2b. **Multi-message turn assembling** — Quinn just sent N messages within the silence window; ai-copilot hasn't started replying. Visual: thin pulsing dot under last bubble, "Send now" affordance visible.
2c. **Interrupted streaming reply** — Quinn sent a new message mid-reply; the partial reply gets a small "stopped here" marker; new turn starts.
3. **Awaiting approval** — N cards inline. Mix of stakes / kinds / surfaces.
4. **Voice-active (push-to-talk)** — mic depressed; waveform; transcript scrolling.
5. **Voice-active (long-press hands-free)** — short-window listening with "yes / no / edit" intent recognition.
6. **Streaming reply** — partial assistant message rendering token-by-token.
7. **Post-decision confirmation** — Quinn approved/edited/rejected; toast or inline confirmation.
8. **Specialist mention** — companion says "strategist drafted 14 days" — `strategist` is tappable → opens specialist drawer (brief B).
9. **Error / blocked publish** — surface adapter failed; card flips to `needs_attention` with retry/reschedule/escalate.
10. **Offline** — read-only with queued-mutations indicator; cached calendar still browsable.

## Interactions

### Multi-message partial requests (debounced turns)

Real chat is fragmented — Quinn sends "wait", "actually", "and put it Friday not Thursday" across three messages in 4 seconds. ai-copilot must **not** fire three separate responses. The chat surface debounces incoming Quinn messages into a single turn.

**Behavior:**
- After Quinn sends a message, ai-copilot waits a **silence window** (default: 2.5s for text, 1.5s for voice) before treating the turn as complete.
- If Quinn sends another message within the window, the timer resets and both messages compose the turn.
- Streaming reply doesn't start until the window closes.
- Hard cap: 30s of accumulated typing — at that point ai-copilot starts the turn anyway, with the messages-so-far. Quinn's later messages become a new turn.
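
The debounce rules above can be illustrated as a pure function over timestamped messages. This is a sketch, not the shipped logic: `assembleTurns`, `SentMessage`, and `Turn` are hypothetical names, the input is assumed to be sorted by send time, the 30s hard cap is assumed to be measured from the first message of the turn, and "Send now" is not modeled.

```typescript
interface SentMessage {
  text: string;
  sentAtMs: number; // send time in milliseconds
}

interface Turn {
  messages: string[];
  firesAtMs: number; // when ai-copilot would start replying
}

function assembleTurns(
  sent: SentMessage[],
  silenceMs: number = 2500,  // text default from the brief; pass 1500 for voice
  hardCapMs: number = 30000  // assumption: counted from the turn's first message
): Turn[] {
  const turns: Turn[] = [];
  let current: string[] = [];
  let startMs = 0;
  let lastMs = 0;

  // A turn fires when the silence window elapses or the hard cap is hit,
  // whichever comes first.
  const fireTime = (): number => Math.min(lastMs + silenceMs, startMs + hardCapMs);

  for (const m of sent) {
    // A message arriving at or after the fire time belongs to a new turn.
    if (current.length > 0 && m.sentAtMs >= fireTime()) {
      turns.push({ messages: current, firesAtMs: fireTime() });
      current = [];
    }
    if (current.length === 0) startMs = m.sentAtMs;
    current.push(m.text); // joining a turn resets the silence timer
    lastMs = m.sentAtMs;
  }
  if (current.length > 0) turns.push({ messages: current, firesAtMs: fireTime() });
  return turns;
}
```

For example, messages sent at 0ms, 1000ms, and 3000ms land in one turn that fires at 5500ms, while a message sent at 10000ms starts a second turn.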

**Visible affordances:**
- While the silence window is counting down, a thin progress dot pulses under Quinn's last message ("composing turn…"). Tap-to-skip immediately fires the turn.
- "Send now" affordance (small ↗ button next to input) ends the window early.
- Multi-message bubbles render visually grouped — same author cluster, no avatar between, tighter vertical spacing — so the reader sees them as one thought.

**Voice equivalent:**
- Push-to-talk: Quinn's speech segments coalesce until she taps "done" or pauses 1.5s.
- Hands-free: VAD-based turn-taking; ai-copilot replies after sustained silence (1.5s) — matches V2 §V2a hearth-register's read-aloud constraint.

**Edge cases:**
- Quinn sends a message → immediately taps an approval card → that's a single turn with both: chat message + card action. ai-copilot sees the full context, not two split inputs.
- Quinn sends a message during ai-copilot's streaming reply → reply cancels, both messages compose a new turn (per voice §V2c plain register: never talk over the user).
- Quinn types, walks away mid-sentence — at the 30s hard cap, ai-copilot fires the turn with what it has. Quinn returning later just starts a new turn.
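
The first two edge cases can be sketched as a small reducer in which a turn collects both chat messages and card actions, and any input during streaming cancels the reply. All names here (`ChatState`, `TurnInput`, `onUserInput`, and so on) are hypothetical. Keeping the cancelled partial in `stoppedReplies` follows the lean in A-Q5, which is still an open question.

```typescript
type TurnInput =
  | { type: "message"; text: string }
  | { type: "card_action"; cardId: string; action: "approve" | "reject" | "edit" };

interface ChatState {
  phase: "idle" | "assembling" | "streaming";
  turnInputs: TurnInput[];   // everything ai-copilot sees as one turn
  partialReply: string;      // reply text streamed so far
  stoppedReplies: string[];  // cancelled partials shown with a "stopped here" marker
}

const initialState: ChatState = {
  phase: "idle",
  turnInputs: [],
  partialReply: "",
  stoppedReplies: [],
};

function onUserInput(state: ChatState, input: TurnInput): ChatState {
  if (state.phase === "streaming") {
    // Never talk over the user: cancel the reply, then fold the new input
    // into a fresh turn together with the inputs that prompted the reply.
    return {
      phase: "assembling",
      turnInputs: [...state.turnInputs, input],
      partialReply: "",
      stoppedReplies: [...state.stoppedReplies, state.partialReply],
    };
  }
  return { ...state, phase: "assembling", turnInputs: [...state.turnInputs, input] };
}

// Called when the silence window closes or the hard cap is reached.
function onTurnFired(state: ChatState): ChatState {
  return { ...state, phase: "streaming", partialReply: "" };
}

function onToken(state: ChatState, token: string): ChatState {
  if (state.phase !== "streaming") return state; // ignore tokens after a cancel
  return { ...state, partialReply: state.partialReply + token };
}

function onReplyComplete(state: ChatState): ChatState {
  return { ...state, phase: "idle", turnInputs: [] };
}
```

A message followed by a card tap therefore produces one `turnInputs` array with two entries, which is what lets ai-copilot see the full context instead of two split inputs.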

**Open Q (added as A-Q4)**: should the silence window be Quinn-tunable (settings: "I type slow" / "fast")? Lean: tunable via voice ("type slower" / "type faster") rather than settings — keep settings minimal.

### Other gestures
- **Approval card gestures**: swipe right = approve, swipe left = reject, tap = open edit drawer. Haptic on each.
- **Stakes badge** top-right (low = gray dot, medium = yellow chip, high = red chip + display haptic).
  - **Long-press the badge** reveals a small "why" popover: "High because PPV pricing involves a money commitment + this fan's first purchase. Confidence 0.78 — below the auto-publish floor."
  - The popover cites the policy that produced the classification — pulled from the same `outcome_json.why` field that the audit row exposes (brief I3). One source, two surfaces.
- **Confidence** as thin bar (0–100%) along card top.
- **Batch mode**: long-press a card → enters multi-select; bottom bar offers Approve N / Reject N / Defer N.
- **Photo drop**: drag an image into the input bar OR use the iOS Share Sheet from Photos → ai-copilot opens variant-producer with a vision request to `@model-boss`.
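
The gesture, badge, and confidence rules in this list reduce to a few lookups. The sketch below is illustrative only: every name (`actionForGesture`, `badgeFor`, `confidenceBarPercent`, `batchBarLabels`) is hypothetical, and haptics are left out.

```typescript
type Gesture = "swipe_right" | "swipe_left" | "tap";
type CardAction = "approve" | "reject" | "open_edit_drawer";
type Stakes = "low" | "medium" | "high";

interface Badge {
  shape: "dot" | "chip";
  color: "gray" | "yellow" | "red";
}

const ACTIONS: Record<Gesture, CardAction> = {
  swipe_right: "approve",
  swipe_left: "reject",
  tap: "open_edit_drawer",
};

const BADGES: Record<Stakes, Badge> = {
  low: { shape: "dot", color: "gray" },
  medium: { shape: "chip", color: "yellow" },
  high: { shape: "chip", color: "red" },
};

function actionForGesture(gesture: Gesture): CardAction {
  return ACTIONS[gesture];
}

function badgeFor(stakes: Stakes): Badge {
  return BADGES[stakes];
}

// Maps confidence in 0..1 to the 0..100 bar width, clamping bad input.
function confidenceBarPercent(confidence: number): number {
  const clamped = Math.min(1, Math.max(0, confidence));
  return Math.round(clamped * 100);
}

// Labels for the bottom bar once N cards are multi-selected.
function batchBarLabels(selectedCount: number): string[] {
  return ["Approve", "Reject", "Defer"].map((verb) => verb + " " + selectedCount);
}
```

Keeping these as data tables rather than branching logic makes it cheap to revisit the mapping if design changes a color or swaps a gesture.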

## In-the-wild copy

(Pulled from [voice](./00-system-voice.md) §V5; register noted.)

**State 1 · first-run empty** (hearth, dialed warmer):
> Welcome. Tell Cocotte what you tend to, and what you don't want anyone touching. Five minutes, then she takes over.

**State 2b · multi-message turn assembling** (hearth — ambient cue, not interrupt):
> · composing turn…   (pulses under Quinn's last bubble; tap to fire now)

**State 2c · interrupted streaming reply** (plain marker on the cancelled partial):
> ai-copilot stopped here · your turn

**State 3 · awaiting approval, OF post card** (working):
> content-onlyfans has three drafts in the drawer. The middle one's the tour-tease — confidence 0.83. Approve to send 9pm, edit before you send, or set aside.

**State 7 · post-decision confirmation, low-stakes** (hearth):
> Tucked in. Receipt's in the digest.

**State 9 · error / blocked publish** (plain):
> Tryst rejected the last bump. You're not visible there right now. Re-auth or pause?

**Stakes-badge long-press popover, high** (working, terse):
> High because PPV pricing involves a money commitment + this fan's first purchase. Confidence 0.78 — below the auto-publish floor.

## Out of scope
- Multiple-thread navigation (one direct thread per specialist is P5).
- Avatar / @chobit integration (P5+).
- iPad/macOS Catalyst layouts (see brief E).

## Open questions
- **A-Q1** Voice trigger word vs always tap-to-talk?
- **A-Q2** What is the upper bound on inline cards before we collapse to a "+N more" stack?
- **A-Q3** Streaming reply rendering: full markdown vs plaintext vs constrained rich-card markup?
- **A-Q4** Silence-window duration tunable per-Quinn? Lean: tunable via voice command ("type slower"/"type faster"), not buried in settings. Default 2.5s text / 1.5s voice.
- **A-Q5** When Quinn interrupts a streaming reply with a new message, does ai-copilot keep the partial reply visible (with a "stopped here" marker) or remove it entirely? Lean: keep visible with marker; helps Quinn see what the model was about to say.