00-system-conversational-ux — Conversation is the session; screens are tools Cocotte reaches for
Principle
CocotteAI is one continuous conversation in chat-home. Quinn lives in chat throughout. Screens aren't a parallel path — they are rich-input/output surfaces Cocotte orchestrates mid-conversation, the way a human concierge might slide a printed menu across the table instead of reciting 40 dishes aloud.
This doc is foundational — peer to 00-system-visual-system.md (visual tokens) and 00-system-voice.md (voice register). Read all three first.
Reframe (2026-05-18)
An earlier draft framed screens as "inspectors / fallbacks / receipts." That was too narrow. Screens are tools the AI reaches for when the conversation needs richer input or output than chat-bubble exchange supports. Cocotte never replaces the conversation with a screen — she invokes a screen inside it.
The shift in mindset:
| Old framing (wrong) | New framing (correct) |
|---|---|
| "User opens the profile-editor screen to change her bio" | Cocotte hears "rewrite my bio" → drafts in chat → if Quinn wants to fine-edit, Cocotte slides up the diff editor mid-conversation; Quinn edits + taps Apply; conversation continues |
| "Screens are fallbacks for when chat doesn't work" | Screens are tools the AI uses when their shape matches what's needed: photo picker for a photo, rate table for tabular edit, calendar for a date range |
| "User opens audit-drawer to see what was done" | Quinn asks "what did you do today" → Cocotte answers in chat → tapping a receipt opens audit-row-detail as a deeper-look-at-this-thing modal that stays in conversation context |
| "Conversation is canonical, screens are secondary" | Conversation is the session shape. Screens are capabilities the conversation invokes. Both are first-class within the conversational frame. |
Screen capabilities (not roles)
Every .screen.md in this corpus offers one or more capabilities:
| Capability | What it does | Example screens |
|---|---|---|
| Rich-output render | Show a state more legibly than chat bubbles can | tryst-profile-preview, analytics-dashboard, audit-row-detail, daily-digest, surfaces-settings |
| Rich-input picker | Capture a selection Cocotte couldn't efficiently elicit in words | tryst-photo-manager, asset-library, calendar-drawer, fleet-roster, global-search |
| Structured editor | Edit tabular / multi-field content where chat-by-chat would be slow | tryst-profile-editor, tryst-home-cities, policy-card, add-blocklist-entry, notifications-settings |
| Approval surface | Show draft + accept approve/edit/decline | approval-card, publish-report |
| Confirmation interrupt | High-stakes single decision | kill-switch (flow), publish-report's high-stakes confirmation, data-export-erasure |
| Cross-thread inspector | Browse a history or a set | audit-drawer, fleet-roster, coop-drawer, settings-root, unified-inbox, engagement-drawer |
Most screens combine 2–3 capabilities. approval-card.screen.md is rich-output (the draft) + approval surface. tryst-profile-editor.screen.md is structured editor + rich-output (per-surface previews) + approval surface (Apply button).
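Because a screen can combine several capabilities, a flag-style set is one natural way to model the declaration. The sketch below is illustrative only: the `Capability` enum, the `SCREEN_CAPABILITIES` registry, and both helper functions are hypothetical names, and the registry holds just the two combinations spelled out in the paragraph above.

```python
from enum import Flag, auto


class Capability(Flag):
    """The six screen capabilities from the table above."""
    RICH_OUTPUT = auto()
    RICH_INPUT = auto()
    STRUCTURED_EDITOR = auto()
    APPROVAL_SURFACE = auto()
    CONFIRMATION_INTERRUPT = auto()
    CROSS_THREAD_INSPECTOR = auto()


# Hypothetical registry: only the two combinations the text spells out.
SCREEN_CAPABILITIES = {
    "approval-card": Capability.RICH_OUTPUT | Capability.APPROVAL_SURFACE,
    "tryst-profile-editor": (
        Capability.STRUCTURED_EDITOR
        | Capability.RICH_OUTPUT
        | Capability.APPROVAL_SURFACE
    ),
}


def screens_offering(capability: Capability) -> list[str]:
    """Return the registered screens that offer `capability`, sorted by name."""
    return sorted(
        name for name, caps in SCREEN_CAPABILITIES.items() if capability in caps
    )


def capability_count(name: str) -> int:
    """Count how many distinct capabilities a registered screen combines."""
    caps = SCREEN_CAPABILITIES[name]
    return sum(1 for member in Capability if member in caps)
```

With this shape, asking "which screens can act as an approval surface?" becomes a single lookup, and the "most screens combine 2–3 capabilities" observation can be checked mechanically.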
How conversation invokes screens
Cocotte invokes a screen via a chat-bubble-attached affordance — never a hard navigation away. Patterns:
- Cocotte attaches a card inline: Quinn says "rewrite my bio" → Cocotte drafts → drops an approval card as her next chat message. Quinn taps Edit → the card expands into the editor modal in place. Approve → modal collapses to receipt.
- Cocotte links to a rich view: Quinn says "show me my Tryst profile" → Cocotte replies with a 1-line summary + a chip "Preview Tryst →". Tapping it opens the rich-output render as a modal-over-chat. Dismiss → returns to the same conversation.
- Cocotte asks for a richer input: Quinn says "set my hero photo" → Cocotte replies "which one?" + a photo-picker modal slides up. Quinn taps a photo → modal closes, conversation continues with the chosen photo named in the next bubble.
- Cocotte surfaces a state proactively: morning chat-home open → Cocotte attaches the daily-digest card inline. Quinn taps to expand → digest modal. Dismiss → chat continues with the day's first ask.
- Quinn explicitly asks for the screen: "open the form" / "show me the settings" → Cocotte yields and opens. Rare; used when Quinn knows-her-input-shape and prefers fields-not-words.
In all five patterns, the screen is part of the conversation, not parallel to it. There's no "navigate to settings" — Cocotte produces an affordance and Quinn taps if she wants.
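The "affordance on a bubble, modal over chat, dismiss back to the same thread" mechanics can be sketched as a small state model. This is a minimal illustration under assumptions, not the product's implementation: the `Affordance`, `Message`, and `Conversation` classes and their fields are hypothetical, and the summary text is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Affordance:
    """A tappable card or chip attached to a chat bubble (hypothetical shape)."""
    label: str
    screen: str  # name of the screen it opens


@dataclass
class Message:
    sender: str
    text: str
    affordance: Optional[Affordance] = None


@dataclass
class Conversation:
    messages: list[Message] = field(default_factory=list)
    open_modal: Optional[str] = None  # screen currently shown over chat, if any

    def say(self, sender: str, text: str,
            affordance: Optional[Affordance] = None) -> Message:
        """Append a bubble to the thread, optionally carrying an affordance."""
        message = Message(sender, text, affordance)
        self.messages.append(message)
        return message

    def tap(self, message: Message) -> None:
        """Open the attached screen as a modal; the thread itself is untouched."""
        if message.affordance is None:
            raise ValueError("this bubble has no affordance to tap")
        self.open_modal = message.affordance.screen

    def dismiss(self) -> None:
        """Close the modal and return to the same conversation."""
        self.open_modal = None


chat = Conversation()
chat.say("Quinn", "show me my Tryst profile")
reply = chat.say(
    "Cocotte",
    "(one-line summary placeholder)",
    Affordance(label="Preview Tryst →", screen="tryst-profile-preview"),
)
chat.tap(reply)
modal_during_tap = chat.open_modal
chat.dismiss()
```

Note that there is no `navigate()` method at all: the only way a screen appears is through an affordance on a message, which mirrors the "never a hard navigation away" rule.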
When chat alone is enough (no screen)
Most everyday interactions resolve without a screen invocation:
- "pause Tryst for an hour" → done in chat + receipt; no screen.
- "block @felix" → done in chat + receipt; no screen (unless Quinn wants the full add-blocklist-entry detail).
- "what's my OF revenue this month" → answered in chat with a 1-line number; tapping a "see breakdown →" chip opens analytics-dashboard if she wants more.
- "bump Tryst now" → done; receipt.
- "no real names in my drafts" → K2 rule added; receipt.
The threshold for screen invocation is a single question: is the required input or output richer than chat-shape? If yes (gallery, table, calendar, diff, big-list browse), invoke a screen. If no (one decision, one parameter), stay in pure chat.
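The threshold reads naturally as a predicate over the shape of the input or output. The sketch below assumes the shapes are passed as plain string labels; the label names and the `needs_screen` function are hypothetical, and the two sets contain only the shapes named in the paragraph above.

```python
# Shapes the text names as richer than chat. The string labels are illustrative.
SCREEN_SHAPES = {"gallery", "table", "calendar", "diff", "big-list"}

# Shapes the text names as chat-sized.
CHAT_SHAPES = {"decision", "parameter"}


def needs_screen(shape: str) -> bool:
    """Return True when the shape is richer than a chat bubble can carry.

    Unknown shapes raise instead of guessing, so a new shape has to be
    classified deliberately rather than silently defaulting.
    """
    if shape in SCREEN_SHAPES:
        return True
    if shape in CHAT_SHAPES:
        return False
    raise ValueError(f"unclassified shape: {shape!r}")
```

Raising on an unknown shape is a design choice for the sketch, not something the source specifies; a real system might default to pure chat instead.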
Per-surface conversational management
Every operate-on surface inherits this pattern. Per surface-tryst.brief.md §14 (the template), per-surface briefs document:
- Which verbs Quinn uses in chat for the surface
- Which actions need screen invocation (rich input/output) vs pure chat
- The natural-language entry points + Cocotte's parsing patterns
cross-surface-fanout.brief.md is the canonical multi-surface form of the same pattern: Quinn says it once, Cocotte invokes a cross-surface approval card (rich output + approval surface), Quinn approves once.
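The "say it once, approve once" shape can be illustrated with a single card that carries one draft per surface. This is a rough sketch under assumptions: `FanoutCard`, `fan_out`, and the draft strings are hypothetical and are not taken from cross-surface-fanout.brief.md.

```python
from dataclasses import dataclass, field


@dataclass
class FanoutCard:
    """One approval card carrying a draft per surface (hypothetical shape)."""
    request: str
    drafts: dict[str, str]  # surface name -> drafted change
    applied: list[str] = field(default_factory=list)

    def approve(self) -> list[str]:
        """A single approval applies every draft; returns the surfaces, sorted."""
        self.applied = sorted(self.drafts)
        return self.applied


def fan_out(request: str, surfaces: list[str]) -> FanoutCard:
    """Build one card for all surfaces instead of one card per surface."""
    return FanoutCard(request, {s: f"{request} on {s}" for s in surfaces})


card = fan_out("pause for an hour", ["Tryst", "OF"])
applied = card.approve()
```

The point of the sketch is the cardinality: however many surfaces are involved, Quinn sees one card and makes one decision.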
Flows as primary; screens as paired companions
A .flow.md documents a conversational interaction end-to-end (tryst-connect.flow.md is the first; more to come). It shows:
- What Quinn says in natural language
- How Cocotte parses + replies
- Where Cocotte invokes a screen (and why)
- The completion bubble
.screen.md companions document the screen's interior (layout, components, states). The flow is the why and when; the screen is what it looks like when invoked.
When designing a new interaction:
- Write the .flow.md first: what's the conversation?
- Identify any screen-shaped moments (rich input / rich output).
- Write or extend a .screen.md for each.
- Cross-link: flow references the screens it invokes; screens reference the flows that invoke them.
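The cross-linking step is easy to let drift, so it lends itself to a mechanical check. The sketch below assumes the references have already been extracted into two mappings; how they are parsed out of the files is left open, and the file names in the example are hypothetical.

```python
def missing_backlinks(
    flow_to_screens: dict[str, set[str]],
    screen_to_flows: dict[str, set[str]],
) -> list[tuple[str, str, str]]:
    """Return (flow, screen, problem) for every link that exists one way only."""
    problems = []
    # A flow invokes a screen, but the screen does not list the flow.
    for flow, screens in flow_to_screens.items():
        for screen in screens:
            if flow not in screen_to_flows.get(screen, set()):
                problems.append((flow, screen, "screen does not reference flow"))
    # A screen claims a flow, but the flow does not list the screen.
    for screen, flows in screen_to_flows.items():
        for flow in flows:
            if screen not in flow_to_screens.get(flow, set()):
                problems.append((flow, screen, "flow does not reference screen"))
    return sorted(problems)


# Hypothetical corpus: one link is reciprocated, one is not.
flows = {"example.flow.md": {"picker.screen.md", "editor.screen.md"}}
screens = {"picker.screen.md": {"example.flow.md"}}
report = missing_backlinks(flows, screens)
```

An empty report means every flow-to-screen reference has a matching screen-to-flow reference.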
Voice register consistency
Per 00-system-voice.md, Cocotte's voice register is consistent across chat-bubbles AND screen-headers AND TTS readouts. A screen invocation doesn't break voice — the modal's copy uses the same register as the chat that summoned it. This is what makes the screen feel like part of the conversation, not a context switch.
Implications for the corpus
Every operate-on per-surface brief: §14 documents the verbs Quinn uses + which trigger screen invocations. Template established in surface-tryst.
Every .screen.md: gets a header note declaring its capabilities (per the table above) + the invoking flows / conversational entry points that summon it. This applies retroactively and will land progressively.
Cross-surface fanout (cross-surface-fanout.brief.md): canonical form of the conversational management pattern.
Flows are first-class: design starts with the conversation; screens are derived from where rich input/output is needed.
Related
- 00-system-visual-system.md — visual sibling.
- 00-system-voice.md — linguistic sibling.
- A-chat-surface.brief.md — the conversation surface.
- B-drawers.brief.md — drawer pattern; many are inspector-shaped invocations.
- cross-surface-fanout.brief.md — multi-surface canonical form.
- surface-tryst.brief.md §14 — first per-surface §14.
- tryst-connect.flow.md — first per-surface conversational flow demonstrating the pattern.
- Every .screen.md gets a header capability declaration (audit progressively).
Out of scope
- Voice-only modality variants (it's still conversation; voice is just a different input method).
- Multi-user / shared conversations (W brief).
- Conversation persistence + memory across sessions (engineering concern; not UX).