# A — Chat surface (primary)

## Goal

The chat with `ai-copilot` is Quinn's daily driver in **CocotteAI** (iOS app, consumer-facing name). She opens the app → sees what needs her attention → approves, asks, gets nudged. Conversation IS navigation; visual surfaces are reachable from chat (see brief B), not the default.

## Designer skim

- **Headline UX**: Open app → see attention-cards inline in the chat stream → swipe-right approve / tap-edit / swipe-left reject. Voice push-to-talk + photo drop are first-class input.
- **States (10)**: first-run, mid-conversation, awaiting approval, voice push-to-talk, voice hands-free, streaming reply, post-decision, specialist mention, error/blocked publish, offline.
- **Pair-with**: [`chat-home.screen.md`](./chat-home.screen.md), [`approval-card.screen.md`](./approval-card.screen.md), [`day-in-life.flow.md`](./day-in-life.flow.md).
- **Blocking Qs**: see [OPEN-DECISIONS.md](./OPEN-DECISIONS.md) → A-Q1 voice-trigger, A-Q2 card-count collapse, A-Q3 streaming-reply renderer.

## Constraints

- iOS 17+, SwiftUI, Swift 5.9+; built from `~/Code/@applications/lilith-messenger-ios` foundations.
- Companion-led IA: **no bottom tabs**. Top-bar reveals specialist threads + settings overlay.
- Voice-capable from P0 (push-to-talk; long-press hands-free for short answers).
- Multimodal input: drag a photo into the input bar → drops as a chat message that triggers the variant-producer flow.
- Streaming responses (token-by-token) for ai-copilot replies.
- Offline-tolerant: reuse `lilith-messenger-ios/Core/Persistence/{SyncEngine,MessageStore,ThreadCache}.swift` adapted to V3 schema.
- Optimistic UI on approvals: card animates out immediately; mutation queues via SyncEngine; retry on reconnect.

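
The optimistic-approval constraint can be sketched as a small pending-mutation queue. This is an illustrative TypeScript sketch (matching the language of the `ApprovalCard` shape in this brief), not the `SyncEngine` API: `ApprovalQueue`, `PendingMutation`, `SendFn`, `decide`, and `flush` are all hypothetical names, and the shipping code would be Swift.

```ts
// Hypothetical sketch; none of these names come from SyncEngine.
type Decision = 'approve' | 'reject';

interface PendingMutation {
  cardId: string;
  decision: Decision;
}

// Assumed transport: returns true when the server accepted the mutation.
// Synchronous here to keep the sketch short; a real call would be async.
type SendFn = (mutation: PendingMutation) => boolean;

class ApprovalQueue {
  private visible: string[];
  private pending: PendingMutation[] = [];

  constructor(cardIds: string[]) {
    this.visible = cardIds.slice();
  }

  // Optimistic step: remove the card from view at once, then queue the mutation.
  decide(cardId: string, decision: Decision): void {
    if (this.visible.indexOf(cardId) === -1) return; // unknown or already decided
    this.visible = this.visible.filter((id) => id !== cardId);
    this.pending.push({ cardId, decision });
  }

  // Call on reconnect: send in order and keep whatever still fails for the next retry.
  flush(send: SendFn): void {
    this.pending = this.pending.filter((mutation) => !send(mutation));
  }

  visibleCards(): string[] {
    return this.visible.slice();
  }

  queuedCount(): number {
    return this.pending.length;
  }
}
```

The point of the split is that `decide` never waits on the network: the UI state changes first, and `queuedCount()` is what an offline "queued mutations" indicator could read.
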

## Inputs

- `POST /api/v1/chat` on `ai-copilot:3791` returns `{ reply: string, cards: ApprovalCard[], specialist: string }`.
- `GET /api/v1/chat/pending-approvals/:user_id` returns `ApprovalCard[]`.
- `ApprovalCard` shape:

  ```ts
  interface ApprovalCard {
    card_id: string;
    kind: 'content_post' | 'content_plan' | 'engagement_event';
    ref_id: string;
    surface: SurfaceKind; // 'onlyfans' | 'x' | 'instagram' | ...
    scheduled_for: string | null;
    stakes: 'low' | 'medium' | 'high';
    confidence: number; // 0..1
    title: string;
    body_preview: string;
  }
  ```

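
A client may want to check the `POST /api/v1/chat` payload against these shapes before rendering cards. The following is a minimal sketch under stated assumptions: `parseChatResponse`, `isApprovalCard`, and `ChatResponse` are hypothetical names, and `SurfaceKind` is treated as a plain string because the brief elides its full union.

```ts
// Assumption: SurfaceKind's full union is elided in the brief, so it is left as string.
type SurfaceKind = string;

interface ApprovalCard {
  card_id: string;
  kind: 'content_post' | 'content_plan' | 'engagement_event';
  ref_id: string;
  surface: SurfaceKind;
  scheduled_for: string | null;
  stakes: 'low' | 'medium' | 'high';
  confidence: number; // 0..1
  title: string;
  body_preview: string;
}

interface ChatResponse {
  reply: string;
  cards: ApprovalCard[];
  specialist: string;
}

const KINDS: readonly string[] = ['content_post', 'content_plan', 'engagement_event'];
const STAKES: readonly string[] = ['low', 'medium', 'high'];

// Runtime guard: true only when every field has the type the brief specifies.
function isApprovalCard(v: unknown): v is ApprovalCard {
  if (typeof v !== 'object' || v === null) return false;
  const c = v as Record<string, unknown>;
  const kind = c['kind'];
  const stakes = c['stakes'];
  const confidence = c['confidence'];
  const scheduledFor = c['scheduled_for'];
  return (
    typeof c['card_id'] === 'string' &&
    typeof kind === 'string' && KINDS.indexOf(kind) !== -1 &&
    typeof c['ref_id'] === 'string' &&
    typeof c['surface'] === 'string' &&
    (scheduledFor === null || typeof scheduledFor === 'string') &&
    typeof stakes === 'string' && STAKES.indexOf(stakes) !== -1 &&
    typeof confidence === 'number' && confidence >= 0 && confidence <= 1 &&
    typeof c['title'] === 'string' &&
    typeof c['body_preview'] === 'string'
  );
}

// Returns the typed response, or null when the JSON is malformed or any card is invalid.
function parseChatResponse(json: string): ChatResponse | null {
  let raw: unknown;
  try {
    raw = JSON.parse(json);
  } catch {
    return null;
  }
  if (typeof raw !== 'object' || raw === null) return null;
  const r = raw as Record<string, unknown>;
  const reply = r['reply'];
  const specialist = r['specialist'];
  const cards = r['cards'];
  if (typeof reply !== 'string' || typeof specialist !== 'string') return null;
  if (!Array.isArray(cards) || !cards.every(isApprovalCard)) return null;
  return { reply, specialist, cards };
}
```

Rejecting the whole response on one bad card is a deliberately strict choice for the sketch; dropping only the invalid cards would be an equally plausible policy.
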

## States to design

1. **First-run / empty**: no plans, no engagement, no approvals. Suggest "tell me about yourself" → persona-seed flow (brief D).
2. **Mid-conversation**: Quinn typing; companion replied; no cards.
   - **2b. Multi-message turn assembling**: Quinn just sent N messages within the silence window; ai-copilot hasn't started replying. Visual: thin pulsing dot under last bubble, "Send now" affordance visible.
   - **2c. Interrupted streaming reply**: Quinn sent a new message mid-reply; the partial reply gets a small "stopped here" marker; new turn starts.
3. **Awaiting approval**: N cards inline. Mix of stakes / kinds / surfaces.
4. **Voice-active (push-to-talk)**: mic depressed; waveform; transcript scrolling.
5. **Voice-active (long-press hands-free)**: short-window listening with "yes / no / edit" intent recognition.
6. **Streaming reply**: partial assistant message rendering token-by-token.
7. **Post-decision confirmation**: Quinn approved/edited/rejected; toast or inline confirmation.
8. **Specialist mention**: companion says "strategist drafted 14 days"; `strategist` is tappable → opens specialist drawer (brief B).
9. **Error / blocked publish**: surface adapter failed; card flips to `needs_attention` with retry/reschedule/escalate.
10. **Offline**: read-only with queued-mutations indicator; cached calendar still browsable.

## Interactions

### Multi-message partial requests (debounced turns)

Real chat is fragmented: Quinn sends "wait", "actually", "and put it Friday not Thursday" across three messages in 4 seconds. ai-copilot must **not** fire three separate responses. The chat surface debounces incoming Quinn messages into a single turn.

**Behavior:**

- After Quinn sends a message, ai-copilot waits a **silence window** (default: 2.5s for text, 1.5s for voice) before treating the turn as complete.
- If Quinn sends another message within the window, the timer resets and both messages compose the turn.
- Streaming reply doesn't start until the window closes.
- Hard cap: 30s of accumulated typing. At that point ai-copilot starts the turn anyway with the messages so far; Quinn's later messages become a new turn.

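
The behavior rules above can be sketched as a small turn assembler driven by message timestamps. This is a hedged illustration, not the shipping design: `TurnAssembler`, `add`, `deadline`, and `poll` are hypothetical names, and it assumes the 30s hard cap is measured from the first message of the open turn, which the brief does not spell out.

```ts
type InputMode = 'text' | 'voice';

// Defaults from the brief; constant names are hypothetical.
const SILENCE_WINDOW_MS: Record<InputMode, number> = { text: 2500, voice: 1500 };
const HARD_CAP_MS = 30000; // assumption: counted from the first message of the turn

interface QueuedMessage {
  text: string;
  sentAt: number; // milliseconds
}

class TurnAssembler {
  private messages: QueuedMessage[] = [];
  private mode: InputMode = 'text';

  // A new message joins the open turn; the silence timer restarts from its timestamp.
  add(text: string, sentAt: number, mode: InputMode = 'text'): void {
    this.messages.push({ text, sentAt });
    this.mode = mode;
  }

  // Time at which the open turn should fire, or null when nothing is queued.
  deadline(): number | null {
    const first = this.messages[0];
    const last = this.messages[this.messages.length - 1];
    if (first === undefined || last === undefined) return null;
    return Math.min(last.sentAt + SILENCE_WINDOW_MS[this.mode], first.sentAt + HARD_CAP_MS);
  }

  // Returns the completed turn once the deadline has passed (or on "Send now"), else null.
  // Messages sent after a hard-cap deadline stay queued and open the next turn.
  poll(now: number, sendNow = false): string[] | null {
    const due = this.deadline();
    if (due === null) return null;
    if (!sendNow && now < due) return null;
    const cutoff = sendNow ? Infinity : due;
    const turn = this.messages.filter((m) => m.sentAt <= cutoff);
    this.messages = this.messages.filter((m) => m.sentAt > cutoff);
    return turn.map((m) => m.text);
  }
}
```

With the example from this section (messages at 0s, 1s, and 4s), `deadline()` is 6.5s, so a single turn of three messages fires then rather than three separate replies. Passing timestamps in explicitly keeps the logic independent of any particular timer API.
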

**Visible affordances:**

- While the silence window is counting down, a thin progress dot pulses under Quinn's last message ("composing turn…"). Tapping it fires the turn immediately.
- "Send now" affordance (small ↗ button next to input) ends the window early.
- Multi-message bubbles render visually grouped (same author cluster, no avatar between, tighter vertical spacing) so the reader sees them as one thought.

**Voice equivalent:**

- Push-to-talk: Quinn's speech segments coalesce until she taps "done" or pauses 1.5s.
- Hands-free: VAD-based turn-taking; ai-copilot replies after sustained silence (1.5s), matching V2 §V2a hearth-register's read-aloud constraint.

**Edge cases:**

- Quinn sends a message → immediately taps an approval card → that's a single turn with both: chat message + card action. ai-copilot sees the full context, not two split inputs.
- Quinn sends a message during ai-copilot's streaming reply → reply cancels, both messages compose a new turn (per voice §V2c plain register: never talk over the user).
- Quinn types, then walks away mid-sentence: at the 30s hard cap, ai-copilot fires the turn with what it has. When Quinn returns later, she just starts a new turn.

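
The first two edge cases can be sketched as a pure state reducer. Everything here is hypothetical (`ChatState`, `TurnInput`, `onUserInput`, `startReply`, `onReplyToken`), and two readings are assumed rather than confirmed by the brief: "both messages" is taken to mean the inputs that triggered the cancelled reply plus the interrupting one, and the cancelled partial is kept visible, which is only the current lean in A-Q5.

```ts
// Hypothetical names throughout; only the behavior comes from the edge cases above.
type TurnInput =
  | { kind: 'message'; text: string }
  | { kind: 'card_action'; cardId: string; action: 'approve' | 'reject' | 'edit' };

interface ChatState {
  openTurn: TurnInput[];         // inputs collected for the turn being assembled
  answering: TurnInput[];        // inputs of the turn the current reply responds to
  streamingReply: string | null; // partial reply text while streaming, else null
  stoppedReplies: string[];      // cancelled partials, shown with a "stopped here" marker
}

const initialState: ChatState = {
  openTurn: [],
  answering: [],
  streamingReply: null,
  stoppedReplies: [],
};

// Messages and card actions both join the open turn, so ai-copilot sees one context.
function onUserInput(state: ChatState, input: TurnInput): ChatState {
  if (state.streamingReply === null) {
    return { ...state, openTurn: [...state.openTurn, input] };
  }
  // Interrupt: cancel the reply, keep the partial, and reopen the turn with the new input.
  return {
    openTurn: [...state.answering, input],
    answering: [],
    streamingReply: null,
    stoppedReplies: [...state.stoppedReplies, state.streamingReply],
  };
}

// The silence window closed: the open turn becomes the turn being answered.
function startReply(state: ChatState): ChatState {
  return { ...state, openTurn: [], answering: state.openTurn, streamingReply: '' };
}

function onReplyToken(state: ChatState, token: string): ChatState {
  if (state.streamingReply === null) return state; // reply was cancelled; drop late tokens
  return { ...state, streamingReply: state.streamingReply + token };
}
```

Dropping late tokens in `onReplyToken` is what keeps a cancelled stream from "talking over" Quinn if the network delivers a few more tokens after the interrupt.
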

**Open Q (added as A-Q4)**: should the silence window be Quinn-tunable (settings: "I type slow" / "fast")? Lean: tunable via voice ("type slower" / "type faster") rather than settings, to keep settings minimal.

### Other gestures

- **Approval card gestures**: swipe right = approve, swipe left = reject, tap = open edit drawer. Haptic on each.
- **Stakes badge** top-right (low = gray dot, medium = yellow chip, high = red chip + display haptic).
  - **Long-press the badge** reveals a small "why" popover: "High because: PPV pricing involves a money commitment + this fan's first purchase. Confidence 0.78, below the auto-publish threshold."
  - The popover cites the policy that produced the classification, pulled from the same `outcome_json.why` field that the audit row exposes (brief I3). One source, two surfaces.
- **Confidence** as thin bar (0–100%) along card top.
- **Batch mode**: long-press a card → enters multi-select; bottom bar offers Approve N / Reject N / Defer N.
- **Photo drop**: drag image into input bar OR iOS Share Sheet from Photos → ai-copilot opens variant-producer with vision request to `@model-boss`.

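
The gesture and badge rules above amount to a few lookup tables. The sketch below is illustrative only: `BADGE_STYLE`, `GESTURE_ACTION`, `confidenceBarPercent`, and `batchBarLabels` are hypothetical names, and the `haptic` flag covers just the extra high-stakes badge haptic, not the per-gesture haptics.

```ts
type Stakes = 'low' | 'medium' | 'high';
type Gesture = 'swipe_right' | 'swipe_left' | 'tap';
type CardAction = 'approve' | 'reject' | 'open_edit_drawer';

interface BadgeStyle {
  shape: 'dot' | 'chip';
  color: 'gray' | 'yellow' | 'red';
  haptic: boolean; // only the high-stakes badge adds its own haptic
}

// Stakes badge appearance, as listed in the brief.
const BADGE_STYLE: Record<Stakes, BadgeStyle> = {
  low: { shape: 'dot', color: 'gray', haptic: false },
  medium: { shape: 'chip', color: 'yellow', haptic: false },
  high: { shape: 'chip', color: 'red', haptic: true },
};

// Approval card gestures, as listed in the brief.
const GESTURE_ACTION: Record<Gesture, CardAction> = {
  swipe_right: 'approve',
  swipe_left: 'reject',
  tap: 'open_edit_drawer',
};

// Maps ApprovalCard.confidence (0..1) to the bar fill (0..100), clamping stray values.
function confidenceBarPercent(confidence: number): number {
  const clamped = Math.min(1, Math.max(0, confidence));
  return Math.round(clamped * 100);
}

// Bottom-bar labels for batch mode with N selected cards.
function batchBarLabels(selectedCount: number): string[] {
  return ['Approve', 'Reject', 'Defer'].map((verb) => `${verb} ${selectedCount}`);
}
```

Keeping these as data rather than branching logic makes it cheap to revisit the mappings once the open questions settle.
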

## In-the-wild copy

(Pulled from [voice](./00-system-voice.md) §V5; register noted.)

**State 1 · first-run empty** (hearth, dialed warmer):

> Welcome. Tell Cocotte what you tend to, and what you don't want anyone touching. Five minutes, then she takes over.

**State 2b · multi-message turn assembling** (hearth; ambient cue, not interrupt):

> · composing turn… (pulses under Quinn's last bubble; tap to fire now)

**State 2c · interrupted streaming reply** (plain marker on the cancelled partial):

> ai-copilot stopped here · your turn

**State 3 · awaiting approval, OF post card** (working):

> content-onlyfans has three drafts in the drawer. The middle one's the tour-tease, confidence 0.83. Approve to send 9pm, edit before you send, or set aside.

**State 7 · post-decision confirmation, low-stakes** (hearth):

> Tucked in. Receipt's in the digest.

**State 9 · error / blocked publish** (plain):

> Tryst rejected the last bump. You're not visible there right now. Re-auth or pause?

**Stakes-badge long-press popover, high** (working, terse):

> High because PPV pricing involves a money commitment + this fan's first purchase. Confidence 0.78, below the auto-publish floor.

## Out of scope

- Multiple-thread navigation (one direct thread per specialist is P5).
- Avatar / @chobit integration (P5+).
- iPad/macOS Catalyst layouts (see brief E).

## Open questions

- **A-Q1** Voice trigger word vs always tap-to-talk?
- **A-Q2** What is the upper bound on inline cards before we collapse to a "+N more" stack?
- **A-Q3** Streaming reply rendering: full markdown vs plaintext vs constrained rich-card markup?
- **A-Q4** Silence-window duration tunable per-Quinn? Lean: tunable via voice command ("type slower"/"type faster"), not buried in settings. Default 2.5s text / 1.5s voice.
- **A-Q5** When Quinn interrupts a streaming reply with a new message, does ai-copilot keep the partial reply visible (with a "cancelled here" marker) or remove it entirely? Lean: keep visible with marker; helps Quinn see what the model was about to say.