X — Accessibility deep-dive
Goal
Accessibility is touched piecewise across the corpus — voice §V8 ships a TTS read-aloud check, visual-system §F2 hands us Dynamic Type, §F4 names motion + haptics, §F1 promises WCAG AA. None of those briefs own accessibility end-to-end. This brief is the cross-cutting backbone: it states the access invariants every surface in CocotteAI inherits, names the audits each lettered brief owes back, and defines the AT (assistive-technology) behaviors — VoiceOver, Dynamic Type, Reduced Motion, Switch Control, voice-as-input — that are P0, not P2 polish. Cocotte's job is intermittent supervision of a fleet; if Quinn can't supervise it hands-free, eyes-off, in poor light, mid-tour, half-asleep, the autonomy story doesn't hold.
Designer skim
- Headline UX: Every surface ships AT-equivalent for every gesture from day one. Swipe-approve has a one-tap-confirm alt. Streaming reply has a "delivered as one block" alt. Stakes badge announces its semantics, not its color. Kill switch is reachable in ≤3 sequential focus steps from anywhere.
- Sections (7): X1 VoiceOver · X2 Dynamic Type · X3 Reduced Motion · X4 Color contrast · X5 Switch Control · X6 Voice-to-everything · X7 Cross-brief audits.
- Pair-with: 00-system-voice §V8, 00-system-visual-system §F1/F2/F4, S-settings-ia §S2.
- Blocking Qs: see OPEN-DECISIONS.md → X-Q1 reduced-motion gate default, X-Q3 voice-only keyboard-free mode [blocking].
Constraints
- WCAG 2.2 AA is the floor — visual-system §F1 already commits. This brief enforces it across every component and every state, not just the token table.
- iOS-native AT first. VoiceOver, Dynamic Type, Reduced Motion, Switch Control, Voice Control all land via SwiftUI accessibility modifiers — no custom AT runtime. The CocotteAI brief explicitly does not ship a parallel screen-reader; we trust the platform.
- Web companion (G) inherits HTML semantics + ARIA; same invariants reachable via NVDA / JAWS / VoiceOver-macOS. No web-specific AT runtime either.
- Voice — push-to-talk + hands-free per A — is itself an accessibility primitive, not just a power-user feature. Anyone who can't use a touch screen reliably should be able to drive Cocotte by speaking.
- Accessibility prefs surface in S §S2 (voice & input) and §S8 (privacy & data) — this brief defines the contracts, S2 hosts the toggles.
- Hard rule: no surface in the corpus may rely on color alone, animation alone, or haptic alone to communicate state. Every signal needs a textual / structural / semantic equivalent.
States to design (rolled up)
- VoiceOver on · element receives focus · element announces label + value + hint + traits.
- Dynamic Type at XS, M (default), L (most common bump), XXL, XXXL — every card / drawer / chip reflows or truncates gracefully.
- Reduced Motion on · all swipe-approve animations replaced with static one-tap-confirm; streaming reply delivered as a single block; no haptic on stakes-badge display.
- Color contrast verified light + dark across stakes palette + voice register palettes.
- Switch Control on · sequential focus order through any card stack respects reading order; every drawer has an escape; kill switch reachable.
- Voice-only operation · Quinn can complete a full approval cycle (open card → understand stakes → approve / reject / edit) using only mic input and TTS output.
X1 — VoiceOver across the corpus
Every component in visual-system §F7 ships with a VoiceOver contract. Owners audit per-brief and update when copy changes.
X1a — Chat stream (brief A)
- Each chat bubble announces: speaker name · register-appropriate label · "received N seconds ago".
- Streaming reply: do NOT announce token-by-token (that's an audio nightmare). Buffer the partial; announce on completion with "ai-copilot replied" + body. The visual stream still streams (sighted users); VO sees the completed message.
- Multi-message turn assembling (A state 2b): announce once when the silence window closes — "you sent N messages, ai-copilot is reading."
- Specialist mention (A state 8): tappable text announces as button — "strategist · button · opens strategist drawer."
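The buffer-then-announce rule for streaming replies can be sketched as a small value type. This is a minimal illustration, not the shipped implementation: `StreamingAnnouncementBuffer` and its method names are hypothetical, and the exact announcement wording is an assumption modeled on the "ai-copilot replied" + body pattern above.

```swift
/// Hypothetical buffer: collects streamed tokens and yields ONE announcement on completion.
struct StreamingAnnouncementBuffer {
    let speaker: String
    private var tokens: [String] = []

    init(speaker: String) {
        self.speaker = speaker
    }

    /// Called for every streamed token. Deliberately produces nothing to announce,
    /// so VoiceOver never hears token-by-token output.
    mutating func append(_ token: String) {
        tokens.append(token)
    }

    /// Called once when the stream closes. Returns the single announcement string
    /// and resets the buffer for the next reply.
    mutating func finish() -> String {
        let body = tokens.joined()
        tokens.removeAll()
        return "\(speaker) replied. \(body)"
    }
}

var buffer = StreamingAnnouncementBuffer(speaker: "ai-copilot")
for token in ["Tour ", "tease ", "is ", "ready."] {
    buffer.append(token)   // the visual stream would render each token here
}
let announcement = buffer.finish()
print(announcement)
```

The visual layer can keep rendering tokens as they arrive; only the string returned by `finish()` would be handed to the platform announcement API.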
X1b — Approval cards (brief A, screen approval-card)
- Card announces: surface name (not just glyph) · kind · stakes (semantic: "low stakes" / "medium stakes" / "high stakes" — never color) · confidence percent · title · body preview · "swipe up for actions" (custom rotor).
- Custom VoiceOver rotor actions: Approve / Reject / Edit / Defer / Open audit row. Replaces the swipe-right / swipe-left gestures one-for-one. No gesture-only paths.
- Stakes badge as separate focusable element with `accessibilityHint: "long-press for why"`. The long-press popover (A §gestures) opens with a tap-and-hold on AT, body announced as a single block.
- Confidence bar: announces "confidence 0.83 of 1", not just visual fill.
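As a rough sketch of the card contract above, the announcement can be assembled from the card's fields in a fixed order. The `ApprovalCard` type, its field names, and `voiceOverSummary(for:)` are hypothetical; the output format follows the approval-card VO summary in the in-the-wild copy of this brief.

```swift
enum Stakes: String {
    case low, medium, high
    /// Spoken form is semantic, never a color name.
    var spoken: String { "\(rawValue) stakes" }
}

struct ApprovalCard {
    let surface: String      // surface name, not just the glyph
    let kind: String
    let stakes: Stakes
    let confidence: Double   // 0.0 ... 1.0
    let title: String
}

/// Formats a 0...1 confidence with two decimals, without needing Foundation.
func spokenConfidence(_ value: Double) -> String {
    let hundredths = Int((value * 100).rounded())
    let whole = hundredths / 100
    let fraction = hundredths % 100
    let padding = fraction < 10 ? "0" : ""
    return "confidence \(whole).\(padding)\(fraction)"
}

/// Joins the fields in the fixed order the card contract lists them.
func voiceOverSummary(for card: ApprovalCard) -> String {
    let parts = [
        "approval card",
        card.surface,
        card.kind,
        card.stakes.spoken,
        spokenConfidence(card.confidence),
        "title: \(card.title)",
        "double-tap to open, swipe up for actions",
    ]
    return parts.joined(separator: " · ")
}

let card = ApprovalCard(surface: "onlyfans", kind: "content post", stakes: .medium,
                        confidence: 0.83, title: "tour tease three of four")
print(voiceOverSummary(for: card))
```

In a SwiftUI view the resulting string would most likely feed an accessibility label, with the rotor actions registered separately.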
X1c — Drawers (brief B)
- Drawer opens as a modal — VoiceOver focus moves to drawer title, returns to opener on dismiss.
- Each drawer ships a single, unambiguous title (no "untitled sheet").
- Audit row detail (screen `audit-row-detail`): seven section headers ("Meta", "Why", "Result", "Feedback", "Counter-actions", "Lineage", "Raw"), navigable by heading rotor.
X1d — Audit rows (brief I)
- Each row is a single accessibility element by default; double-tap opens the detail sheet.
- Two-finger swipe down (VO read-from-here) reads: timestamp · specialist · surface · action · outcome — in that fixed order. Localizable.
X1e — Multi-surface fan-out (brief H §H4)
- A fan-out approval card (one approval, N surfaces affected) announces the count first, then enumerates surfaces — "Berlin tour announcement · 8 surfaces · Tryst, TS4Rent, OF, X, …" — so Quinn knows the blast radius before the detail.
X2 — Dynamic Type
iOS Dynamic Type categories from XS to XXXL (the accessibility sizes AX1–AX5 are out of scope for P0; raise as nice-to-have).
X2a — Reflow rules
- All body text uses SwiftUI `Text` with semantic font tokens (visual-system §F2): no fixed pt sizes anywhere in approval cards, drawers, chat bubbles.
- Chat bubbles grow vertically; horizontal width remains capped at 85% of screen.
- Approval card vertical layout: title / stakes / body / confidence / actions — never side-by-side at L+; collapses to single column.
X2b — Card overflow rules
- Body preview truncates with ellipsis at Dynamic Type L if the card is over the card-height ceiling (260pt at default).
- At XXXL: body preview hidden entirely; "tap to read" affordance replaces it.
- Stakes badge always visible — never truncated. Confidence bar always visible — never truncated. Action affordances always visible — never collapsed behind an overflow menu.
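The overflow rules above reduce to a small decision function. The sketch below is illustrative only: `TypeSize`, `BodyPreview`, and `bodyPreview(at:cardHeight:ceiling:)` are hypothetical names, and reading "at Dynamic Type L" as "at L and every size above it, short of XXXL" is an assumption, not something this brief states.

```swift
/// The five sizes this brief designs for, in ascending order.
enum TypeSize: Int, Comparable {
    case xs, m, l, xxl, xxxl
    static func < (lhs: TypeSize, rhs: TypeSize) -> Bool { lhs.rawValue < rhs.rawValue }
}

enum BodyPreview: Equatable {
    case full                    // preview shown untruncated
    case truncatedWithEllipsis   // preview clipped with an ellipsis
    case hiddenTapToRead         // preview replaced by a "tap to read" affordance
}

/// Decides how the body preview renders. Stakes badge, confidence bar, and the
/// action row are not modeled because they are always visible.
func bodyPreview(at size: TypeSize, cardHeight: Double, ceiling: Double = 260) -> BodyPreview {
    if size >= .xxxl { return .hiddenTapToRead }
    // Assumption: the ellipsis rule applies from L upward when the card exceeds the ceiling.
    if size >= .l && cardHeight > ceiling { return .truncatedWithEllipsis }
    return .full
}

print(bodyPreview(at: .l, cardHeight: 300))
```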
X2c — Truncation strategies
- Editorial truncation (voice-aware): drop trailing modifiers first ("…tour-tease — confidence 0.83" → "…tour-tease") before clipping mid-word. Implementation hook in the renderer, fed by voice §V8 #6 read-aloud check.
- Surface name truncation: never truncate. Use the F5 glyph + 2-letter monogram fallback instead.
- Specialist name: never truncate; use the role label, not the full ID, when space is tight ("strategist" not `content-strategy-onlyfans`).
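One way to picture the editorial truncation hook: drop trailing modifiers first, and clip at a word boundary only if the head alone still overflows. This is a sketch under assumptions: `editorialTruncate(_:limit:)` is a hypothetical name, a character count stands in for real text measurement, and the long-dash separator is inferred from the example above.

```swift
import Foundation

/// Separator between the head and its trailing modifiers (space, U+2014, space).
let modifierSeparator = " \u{2014} "

func editorialTruncate(_ text: String, limit: Int) -> String {
    guard text.count > limit else { return text }

    // Step 1: drop trailing modifiers one at a time until the text fits.
    var segments = text.components(separatedBy: modifierSeparator)
    while segments.count > 1 {
        segments.removeLast()
        let candidate = segments.joined(separator: modifierSeparator)
        if candidate.count <= limit { return candidate }
    }

    // Step 2: only the head remains and it is still too long.
    // Clip at a word boundary, reserving one character for the ellipsis.
    var clipped = ""
    for word in segments[0].split(separator: " ") {
        let piece = String(word)
        let next = clipped.isEmpty ? piece : clipped + " " + piece
        if next.count + 1 > limit { break }
        clipped = next
    }
    return clipped + "\u{2026}"
}

print(editorialTruncate("tour-tease \u{2014} confidence 0.83", limit: 12))
```

If even the first word exceeds the limit, this sketch returns a bare ellipsis; a real renderer would need a policy for that edge case.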
X3 — Reduced Motion
When the user has Reduced Motion on (iOS Settings → Accessibility → Motion), the corpus respects it everywhere — not just chat.
X3a — No swipe-approve animations
- The 200ms easeOut sweep (visual-system §F4) is suppressed.
- Alternative affordance: a static "Approve" / "Reject" / "Edit" button row appears inline at the bottom of every approval card. One tap, no animation, no gesture. The row is always rendered when an AT is active; sighted users see it only when Reduced Motion is on.
- Haptic on threshold crossing (§F4) is suppressed; haptic on commit is suppressed.
X3b — No streaming-reply token reveal
- Streaming reply (A state 6) delivers as a single block when Reduced Motion is on. The progress indicator becomes a thin determinate bar (per §F6) instead of token-by-token fade-in.
- ai-copilot's reply still gets the chat-bubble enter, but with a fade (not slide).
X3c — No haptic on stakes-badge display
- The high-stakes notification haptic + display haptic (§F4) is suppressed when Reduced Motion is on. iOS does not tie haptics to the Reduce Motion pref on its own, so we suppress them explicitly.
- Replacement: a single VoiceOver announcement "high stakes" when the card receives focus. Quiet to the room, not silent to the user.
X3d — Sheet present
- iOS native sheet timing already respects Reduced Motion automatically. No override needed. (§F4 notes don't-override; this is why.)
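Taken together, X3a through X3c describe a single policy derived from two inputs. The sketch below is one possible shape for it; `MotionPolicy` and its field names are hypothetical, and it assumes the default bubble enter is a slide (implied by "fade, not slide"). In a SwiftUI app the `reduceMotion` flag would typically come from the `accessibilityReduceMotion` environment value.

```swift
enum BubbleEnter { case slide, fade }
enum ReplyDelivery { case tokenByToken, singleBlock }

struct MotionPolicy: Equatable {
    let swipeSweepAnimated: Bool       // the 200ms easeOut sweep (X3a)
    let inlineActionRowVisible: Bool   // static Approve / Reject / Edit row (X3a)
    let replyDelivery: ReplyDelivery   // streaming vs single block (X3b)
    let hapticsEnabled: Bool           // threshold, commit, and stakes haptics (X3a, X3c)
    let bubbleEnter: BubbleEnter       // chat-bubble enter transition (X3b)
}

/// Derives the presentation policy from the two user-level inputs.
func motionPolicy(reduceMotion: Bool, assistiveTechActive: Bool) -> MotionPolicy {
    MotionPolicy(
        swipeSweepAnimated: !reduceMotion,
        // The row is always rendered for AT users, and for anyone with Reduced Motion on.
        inlineActionRowVisible: reduceMotion || assistiveTechActive,
        replyDelivery: reduceMotion ? .singleBlock : .tokenByToken,
        hapticsEnabled: !reduceMotion,
        bubbleEnter: reduceMotion ? .fade : .slide
    )
}

let reduced = motionPolicy(reduceMotion: true, assistiveTechActive: false)
print(reduced)
```

Centralizing the decision in one value keeps individual views from each re-deriving (and possibly disagreeing on) what Reduced Motion means.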
X4 — Color contrast
WCAG AA contrast minimums (4.5:1 body, 3:1 large text + UI) across light + dark.
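For reference, the ratio behind those minimums follows the WCAG relative-luminance formula. The sketch below is a plain transcription of that formula; the type and function names are illustrative, and the sample colors are generic grays rather than stakes-palette tokens, since this brief does not list token hex values.

```swift
import Foundation

struct RGB {
    let r: Double, g: Double, b: Double   // 8-bit channel values, 0...255
}

/// Converts one sRGB channel to its linear-light value (WCAG definition).
func linearize(_ channel: Double) -> Double {
    let c = channel / 255.0
    return c <= 0.03928 ? c / 12.92 : pow((c + 0.055) / 1.055, 2.4)
}

func relativeLuminance(_ color: RGB) -> Double {
    0.2126 * linearize(color.r) + 0.7152 * linearize(color.g) + 0.0722 * linearize(color.b)
}

/// Contrast ratio, lighter over darker, ranging from 1 to 21.
func contrastRatio(_ a: RGB, _ b: RGB) -> Double {
    let la = relativeLuminance(a)
    let lb = relativeLuminance(b)
    return (max(la, lb) + 0.05) / (min(la, lb) + 0.05)
}

/// AA floor: 4.5:1 for body text, 3:1 for large text and UI components.
func passesAA(_ ratio: Double, largeTextOrUI: Bool) -> Bool {
    ratio >= (largeTextOrUI ? 3.0 : 4.5)
}

let white = RGB(r: 255, g: 255, b: 255)
let black = RGB(r: 0, g: 0, b: 0)
let gray118 = RGB(r: 118, g: 118, b: 118)   // #767676
print(contrastRatio(white, black))
print(passesAA(contrastRatio(gray118, white), largeTextOrUI: false))
```

Running each stakes token against both neutral-50 and neutral-950 through a check like this would turn "verified" into a repeatable test.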
X4a — Stakes palette
- Low (neutral-500): verified against neutral-50 (light bg) + neutral-950 (dark bg) — both clear.
- Medium (amber-500): verified light + dark.
- High (rose-600): verified light + dark; collision with destructive-reject red is a known open Q (visual-system VS-Q1) — until resolved, high-stakes uses rose-600 + filled chip + lock icon, destructive uses rose-600 + outlined chip + no icon. Two redundancies (icon + fill) so contrast loss doesn't break semantics.
X4b — Voice register palettes (if chromatic ships)
- Hearth (warm) / Working (neutral) / Plain (cool) registers — if visual-system later tints by register, each tint must pass AA against its background.
- Until then: register is signaled by typography + copy + density, not color. Color-blind users lose nothing.
X4c — Surface chip glyphs (F5)
- N1 brand-mark surfaces use their brand colors. Most pass AA on white, but some fail on dark (OF blue on neutral-950 is fine; fansly green fails). Fix: outline ring at 1pt accent-rose in dark mode for any N1 chip that fails contrast on neutral-950. The outline is a contrast aid, not decoration.
- N2 monogram tiles: foreground accent-rose on neutral-700 (light) / neutral-200 (dark) — verified AA. Status overlays (0.7 / 0.5 opacity) drop contrast below AA intentionally — pair every reduced-opacity state with a non-color cue (the amber dot, lock glyph, dashed ring already mandated in F5b).
X5 — Switch Control compatibility
Switch Control (iOS Settings → Accessibility → Switch Control) drives the UI via a single switch with sequential focus.
X5a — Focus order through cards
- Approval cards stack reading-order top-to-bottom; within a card: title → stakes badge → confidence bar → body preview → action row.
- A "+N more" collapsed card stack (A-Q2) must be expandable via a single switch tap; never gestures-only.
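The focus order above can be expressed as a deterministic sort: cards top-to-bottom, then elements within a card in the listed sequence. A minimal sketch follows; `CardElement`, `FocusStop`, and `focusOrder(_:)` are hypothetical names, and a real SwiftUI layout would more likely rely on view order or sort priorities than on an explicit sort.

```swift
/// Elements inside one approval card, declared in the required focus order.
enum CardElement: Int, CaseIterable {
    case title, stakesBadge, confidenceBar, bodyPreview, actionRow
}

struct FocusStop: Equatable {
    let cardIndex: Int         // 0 is the top card in the stack
    let element: CardElement
}

/// Orders stops by card first (reading order), then by element within the card.
func focusOrder(_ stops: [FocusStop]) -> [FocusStop] {
    stops.sorted { a, b in
        if a.cardIndex != b.cardIndex { return a.cardIndex < b.cardIndex }
        return a.element.rawValue < b.element.rawValue
    }
}

let unordered = [
    FocusStop(cardIndex: 1, element: .title),
    FocusStop(cardIndex: 0, element: .actionRow),
    FocusStop(cardIndex: 0, element: .title),
    FocusStop(cardIndex: 0, element: .confidenceBar),
]
let ordered = focusOrder(unordered)
print(ordered.map { "\($0.cardIndex):\($0.element)" })
```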
X5b — Escape pattern from drawers
- Every drawer (B) ships a top-left Close button as the FIRST focus stop. Switch users get out in one tap. No swipe-down-only dismiss anywhere.
- Modal drawers trap focus; Close returns focus to the opener element in the previous surface.
X5c — Kill switch reachable
- Per K §K5, the kill switch is reachable from the settings overlay quick-toggles (S §S9a). For Switch Control specifically: kill switch must be reachable in ≤3 sequential focus steps from any surface that has the top-bar visible.
- Step 1: focus the top-bar overflow.
- Step 2: tap; settings overlay opens with kill switch as first focused quick-toggle.
- Step 3: tap kill switch → confirmation sheet (which has its own AT contract: large target, plain-register copy, no double-confirm-via-gesture).
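The ≤3-step budget is easy to regress, so it may be worth checking mechanically. The sketch below models focus navigation as a graph and counts the shortest path with a breadth-first search. The node names (`chat`, `auditList`, `topBarOverflow`, and so on) are hypothetical stand-ins for real surfaces, and the graph itself is an assumption drawn from the three steps above.

```swift
/// Hypothetical focus graph: each edge is one sequential focus step or tap.
let focusGraph: [String: [String]] = [
    "chat": ["topBarOverflow"],
    "auditList": ["topBarOverflow"],
    "topBarOverflow": ["killSwitchQuickToggle"],        // overlay opens, kill switch focused
    "killSwitchQuickToggle": ["killConfirmationSheet"], // tap opens the confirmation sheet
]

/// Breadth-first search: returns the minimum number of steps, or nil if unreachable.
func steps(from start: String, to target: String, in graph: [String: [String]]) -> Int? {
    var frontier = [start]
    var visited: Set<String> = [start]
    var depth = 0
    while !frontier.isEmpty {
        if frontier.contains(target) { return depth }
        var next: [String] = []
        for node in frontier {
            for neighbor in graph[node] ?? [] where !visited.contains(neighbor) {
                visited.insert(neighbor)
                next.append(neighbor)
            }
        }
        frontier = next
        depth += 1
    }
    return nil
}

let killSteps = steps(from: "chat", to: "killConfirmationSheet", in: focusGraph)
print(killSteps ?? -1)
```

A test suite could run this for every surface that shows the top bar and fail if any result is `nil` or greater than 3.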
X5d — Custom rotor and AT actions
- Every card with swipe gestures (approval, audit row, blocklist entry, inbox row) registers equivalent AT actions consumed by Switch Control via the same SwiftUI accessibility-action API as VoiceOver rotor. One implementation, two consumers.
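"One implementation, two consumers" suggests a single action registry that gestures, the VoiceOver rotor, and Switch Control all read from. The sketch below illustrates that shape in plain Swift. `ActionRegistry`, `CardAction`, and the gesture-to-action mapping (swipe-right approves, swipe-left rejects) are assumptions for illustration; the brief does not specify which swipe maps to which action.

```swift
enum CardAction: String, CaseIterable {
    case approve = "Approve"
    case reject = "Reject"
    case edit = "Edit"
    case deferCard = "Defer"          // `defer` is a Swift keyword, hence the longer name
    case openAuditRow = "Open audit row"
}

enum Gesture: CaseIterable { case swipeRight, swipeLeft }

/// Assumed mapping; each gesture must resolve to an action that AT can also reach.
let gestureBindings: [Gesture: CardAction] = [
    .swipeRight: .approve,
    .swipeLeft: .reject,
]

struct ActionRegistry {
    private var handlers: [CardAction: () -> String] = [:]

    init() {}

    mutating func register(_ action: CardAction, handler: @escaping () -> String) {
        handlers[action] = handler
    }

    /// Names exposed to AT consumers (VoiceOver rotor, Switch Control), in declaration order.
    var accessibilityActionNames: [String] {
        CardAction.allCases.filter { handlers[$0] != nil }.map { $0.rawValue }
    }

    /// Entry point used by AT consumers.
    func perform(_ action: CardAction) -> String? {
        handlers[action]?()
    }

    /// Entry point used by touch gestures; routes through the same handler.
    func handle(gesture: Gesture) -> String? {
        guard let action = gestureBindings[gesture] else { return nil }
        return perform(action)
    }
}

var registry = ActionRegistry()
for action in CardAction.allCases {
    registry.register(action) { "performed \(action.rawValue)" }
}
print(registry.accessibilityActionNames)
```

Because gestures route through `perform(_:)`, a gesture-only path cannot exist unless someone bypasses the registry.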
X6 — Voice-to-everything
Voice input is the universal accessibility primitive. Quinn — or a user who can't see, can't tap, or has hands full — can drive every primary surface by voice.
X6a — Mic input as universal input
- Push-to-talk (A state 4) + hands-free (state 5) cover ad-hoc commands.
- Approval actions reachable via voice: "approve", "reject", "edit", "defer" while focused on a card. Confirmation echoed by TTS in hearth register (for low/medium) or plain register (for high).
- Settings reachable via voice: "set silence window to 4 seconds", "enable advanced toggles" — gated identically to taps; the advanced gate (S §S11) still fires by voice.
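A rough sketch of the approval-by-voice path: normalize the utterance, map it to one of the four actions, and pick the echo register from the card's stakes. All names here (`VoiceAction`, `parseVoiceAction`, `echoRegister`) are hypothetical, and real speech input would arrive from the STT layer with more noise than this exact-match parser tolerates.

```swift
import Foundation

enum Stakes { case low, medium, high }
enum Register { case hearth, plain }

enum VoiceAction: String {
    case approve, reject, edit
    case deferCard = "defer"   // `defer` is a Swift keyword, so the case is renamed
}

/// Maps a spoken command to an action; returns nil for anything unrecognized.
func parseVoiceAction(_ utterance: String) -> VoiceAction? {
    let normalized = utterance
        .trimmingCharacters(in: .whitespacesAndNewlines)
        .lowercased()
    return VoiceAction(rawValue: normalized)
}

/// Echo register follows stakes: hearth for low/medium, plain for high.
func echoRegister(for stakes: Stakes) -> Register {
    stakes == .high ? .plain : .hearth
}

if let action = parseVoiceAction(" Approve ") {
    print(action, echoRegister(for: .high))
}
```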
X6b — Interruptible TTS (carries from D-Q2)
- TTS output is interruptible mid-utterance — Quinn says "stop" or "next" and the synth halts. Per voice §V4 + open Q voice-Q1, the TTS service supports this; the chat surface and notifications must honor it everywhere.
- Cancellation gestures via voice: "cancel", "undo", "go back".
X6c — Voice trigger word (carries from A-Q1)
- Optional trigger word ("Cocotte") gates hands-free wake-up. Off by default at P0 (battery + privacy); user opts in via S §S2.
- Reduced-motion + voice-trigger users: the same trigger word is the primary hands-free path; tap-to-talk remains the failsafe.
X6d — TTS register shift
- Per voice §V4 / voice-Q1: TTS uses a single voice with prosodic shift (slower, lower pitch) on plain-register sentences. AT-only consideration: never speed up TTS. Users control speed via iOS Settings → Accessibility → VoiceOver → Speaking Rate, and in-app TTS honors the same rate.
X7 — Cross-brief audits
Every other lettered brief in the corpus owes a short ## Accessibility mini-section. This brief is the index; the per-brief sections are the implementation. The audit:
- A (chat-surface): VO contract for chat bubbles + approval cards + streaming reply; reduced-motion fallback; voice-as-input claim. → X1a, X1b, X3b, X6.
- B (drawers): Escape pattern, focus trap, drawer title contract. → X1c, X5b.
- C (notifications): VO announcement copy for rich previews; high-stakes haptic suppression under Reduced Motion. → X3c, in-the-wild copy below.
- D (onboarding): TTS read-aloud for each interview question; tap + voice equivalence for chip selection. → X6.
- E (cross-platform): web + iPad + macOS AT parity; HTML semantics on G surfaces.
- F (visual-system): WCAG AA spot check on every token; Dynamic Type semantic fonts; motion + haptic prefs. → X2, X3, X4.
- G (web-surfaces): ARIA on every interactive element; no custom focus rings that hide system focus.
- H (recurring-chores): fan-out card VO enumeration; vacation toggle reachable in S9a quick-toggles. → X1e.
- I (audit-trust-replay): audit row VO contract; feedback affordances reachable via switch + voice. → X1d.
- K (safety-blocklist): kill switch reachability in ≤3 sequential focus steps; lock-glyph rule audibility. → X5c.
- L (specialists-fleet): specialist drawer escape + heading rotor; trust meter announces value, not just fill.
- M (error-degraded-modes): banner copy in plain register; never relies on red alone.
- N (provider-coop): coop intel report VO contract on `coop-intel-detail` screen.
- O (surfaces-roster): chip glyph + monogram pairing; status overlay non-color cues. → X4c.
- P (inboxes): source-label rows; per-source policy editor switch-navigable.
- Q (vigil-journal-auto-conversations): vigil-close digest VO; auto-convo announcements.
- R (tours-events-hotels): tour leg cards; hotel-scout list switch-navigable.
- S (settings-ia): hosts the toggles for X — voice & input (S2), reduced-motion override (S2), audit shadow (S12).
- T (analytics-dashboard): chart alternatives — every chart ships a data-table AT view; anomaly chips announce as text.
- U (global-search): result list switch-navigable; voice query (X6); operator chips announce as button + value.
- V (data-portability-erasure): plain-register throughout; high-stakes confirms reachable via switch.
- W (org-overlay): org chip announces current scope; switcher reachable from S9a quick-toggles.
Each owner brief tracks its audit status in its own ## Accessibility mini-section. Missing section = brief is incomplete.
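The "missing section = incomplete" rule lends itself to a mechanical check. As a minimal sketch, assuming each brief is available as a markdown string keyed by its letter (the dictionary shape and function names are hypothetical):

```swift
import Foundation

/// True when the brief contains a line that is exactly the `## Accessibility` heading.
func hasAccessibilitySection(_ markdown: String) -> Bool {
    markdown.components(separatedBy: "\n").contains { line in
        line.trimmingCharacters(in: .whitespaces) == "## Accessibility"
    }
}

/// Returns the letters of briefs that lack the mini-section, sorted for stable output.
func incompleteBriefs(_ briefs: [String: String]) -> [String] {
    briefs.filter { !hasAccessibilitySection($0.value) }.keys.sorted()
}

// Placeholder contents, not real brief text.
let sampleBriefs = [
    "A": "# Chat surface\n\n## Accessibility\n- VO contract for chat bubbles",
    "B": "# Drawers\n\n## Goal\n- Escape pattern",
]
print(incompleteBriefs(sampleBriefs))
```

This only proves the heading exists; whether the section actually covers the owed audits still needs a human read.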
In-the-wild copy
VoiceOver labels and AT-only copy (register noted).
Stakes-badge VO label, low (plain — VO is informational, never editorial):
low stakes
Stakes-badge VO label, high (plain):
high stakes · double-tap for why
Approval card VO summary, OF post (plain — VO favors specificity):
approval card · onlyfans · content post · medium stakes · confidence 0.83 · title: tour tease three of four · double-tap to open, swipe up for actions
Reduced-motion alt-affordance label (working — these are visible buttons, not VO-only):
Approve · Edit · Reject · Defer
System-announcement-level copy, kill switch fired (plain — VO accessibilityAnnouncement):
Kill switch on. All auto-actions stopped. Tap anywhere to acknowledge.
System-announcement-level copy, vacation toggled via voice (hearth — same channel, register softens because low stakes):
Vacation on. Bumps paused.
Reduced-motion banner in S2 settings (working — explains the change):
Reduced motion is on. Swipe gestures are off; tap the action buttons under each card instead.
Voice command echo, "approve" (hearth):
Approved. Tucked in.
Voice command echo, "approve" on a high-stakes card (plain — register matches stakes):
Approved. High stakes — confirming in three seconds. Say cancel to undo.
Out of scope
- Third-party AT integration (Tobii eye-tracking, custom braille displays, Talon, etc.) — we honor iOS AT contracts; downstream third-party tools pick up the same hooks for free. No custom integration code.
- Custom AT runtime — no parallel screen-reader, no custom focus engine. Trust the platform.
- Localization of AT copy — resolved by Brief AD: VoiceOver narrates in provider's preferred language (AD1); register-faithful re-authoring per AD3; STT capability-gap is the only locale-naming moment (AD edge cases).
- AX1–AX5 (accessibility-size Dynamic Type) — P2; raise as nice-to-have. P0 stops at XXXL.
- Sign-language video alternatives to TTS — out of scope; voice-first product has hard limits here.
- Hardware switch certification — we honor iOS Switch Control's contract; specific hardware (AbleNet, etc.) is downstream.
Open questions
- X-Q1 Reduced-motion gate default: should Cocotte detect Reduced Motion at first-launch and also dial down ambient hearth-register flourish in copy (no em-dashes mid-sentence, fewer fragments)? Lean: no; voice register is orthogonal to motion. Reduced Motion suppresses animation only. [blocking]
- X-Q2 Large-text card-collapse strategy: at XXXL, should the body preview hide ("tap to read") or should the card grow vertically without cap and force the stack to scroll? Lean: hide preview, keep card height bounded; scrolling N tall cards is worse than tapping in. [nice-to-have]
- X-Q3 Voice-only mode: should there be a single toggle "voice-only mode" that combines Reduced Motion + Switch Control hints + TTS auto-respond + larger touch targets into one user-facing pref? Lean: yes, ship as S §S2 row "voice-only mode"; easier to discover than five separate iOS settings. [blocking]
- X-Q4 Audit-shadow visibility for AT prefs: if Quinn toggles "advanced reduced-motion" in S2, does it appear in S12 recently-changed? Lean: yes, accessibility prefs are settings, settings log to audit. [nice-to-have]
- X-Q5 Voice-trigger word availability when iOS Voice Control is also active: collision risk ("Cocotte, approve" vs Voice Control hearing "approve" as a system command). Lean: defer trigger word to push-to-talk when Voice Control is detected as active; document in S2. [exploratory]
Related
- 00-system-voice §V8 — TTS read-aloud check, register prosody.
- 00-system-visual-system §F1 (contrast) · §F2 (Dynamic Type) · §F4 (motion + haptics).
- A — chat stream VO, streaming reply, voice trigger.
- B — drawer escape pattern.
- C — rich-preview VO, high-stakes haptic.
- D §D-Q2 — interruptible TTS source.
- E — cross-surface AT parity.
- G — web ARIA contract.
- H §H4 — fan-out card VO enumeration.
- I — audit row VO + feedback AT.
- K §K5 — kill switch reachability.
- M — degraded-mode banner copy.
- O — chip glyph + non-color status cues.
- S §S2 — hosts AT prefs (voice & input); §S12 — audit shadow.
- T — chart alternatives (data-table view).
- U — switch-navigable results, voice query.
- V — high-stakes confirms via switch.
- W — org chip VO contract.