cocottetech/@platform/codebase/@features/ai-copilot/docs/AD-multilingual-opaque.brief.md
natalie 1b719e1fd7 chore(bootstrap): initial V4 commit
Clean successor to V3 (forge: lilith/atlilith). Seeded from local Mac
working tree at ~/Code/@projects/@cocottetech/. node_modules and build
artifacts excluded via .gitignore.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 08:11:41 -07:00

AD — Multilingual (opaque to providers + clients)

Goal

CocotteAI is multilingual by default and invisibly so. A provider writes and reads in their preferred language; a prospect/client receives and replies in theirs. Neither party ever sees the translation layer — no toggles, no "translated from…" footnotes, no language-picker chrome. The product behaves as if everyone shares one language, while the system maintains canonical + per-locale renderings underneath. This brief locks the invariants so the multilingual layer can ship from P0 forward-compat (storage shape, audit shape) without re-architecture later.

Designer skim

  • Headline UX: Quinn writes "tell Tryst I'm in Berlin next week" in English; the surface action publishes German + English variants per Tryst's locale rules. A prospect DMs in Spanish; Quinn sees the message in English; triage drafts a Spanish reply; Quinn approves it without realizing she's reviewing a translation. Audit (brief I) records all three renderings (original / canonical / provider-view) but the chat surface only ever shows the provider's language.
  • Sections (8): AD1 provider-language detection + sticky · AD2 prospect-language detection + sticky · AD3 voice register preservation across translation · AD4 per-locale banned-phrase enforcement · AD5 cross-script K3 PII safety · AD6 storage triad (original / canonical / provider-view) · AD7 translation-confidence + fallback · AD8 approval-card behavior under language conflict.
  • Foundation: AD inherits from 00-system-voice (register gradient), K (safety/PII), I (audit append-only), L (specialist roles do the routing), X (accessibility — locale + AT copy interactions). AD does not create new UI chrome.
  • Voice: this brief = working register. The product itself is whatever-language-the-reader-wants register.
  • Blocking Qs: AD-Q1 provider language source-of-truth, AD-Q2 fallback when translation confidence is low.

Constraints

  • Opacity is mandatory. No "🌐", no language pill, no "translated by…" annotation, no toggle. If a UI element communicates "this was translated," the spec has failed.
  • Voice register survives translation. Hearth in German ≠ literal-translated English hearth. Each locale gets its own copy bank for the three registers (hearth / working / plain) — translation is register-aware re-authoring, not word-for-word. This is more like internationalization-as-rewrite than i18n strings.
  • Banned-phrase list per voice §V6 is per-locale, not global. "algorithm" is banned in English; the German equivalent may be a different banned word. Each locale maintains its own banned-phrase set (per voice §V6) and a cross-locale equivalence map for K3 enforcement.
  • K3 safety rules apply across scripts. Govt-name detection (K3c-1), hotel-address detection (K3f-2), channel-vs-surface separation (K3h) must hold regardless of script (Latin, Cyrillic, Hangul, Han, Devanagari). PII detection runs on the canonical rendering AND the per-locale rendering — leak in either fails the gate.
  • Audit triad is append-only. Every cross-language exchange records (original / canonical-EN / provider-view) per outbound + inbound message, plus delivered_text on outbound (see AD6). Erasure (brief V) redacts every stored rendering; never deletes.
  • No live machine-translation roundtrip in the chat surface. Translation happens at draft time (outbound) and at ingest time (inbound) — the chat surface only ever renders pre-translated text. Latency budget: ≤ 800ms p95 for draft translation; ≤ 2s p95 for ingest translation.
  • Provider language is sticky per provider. It survives device handoff (brief E + G), follows the org (brief W) when scoped to org, falls back to personal when in personal-only scope.
  • Prospect language is sticky per prospect. Once detected, all subsequent drafts use that language unless prospect explicitly switches (brief L prospect-resolver updates the sticky preference).
  • @model-boss (apricot, GPU host) is the translation router. All translation calls go through @model-boss's /translate endpoint. Per project CLAUDE.md, translation models are never loaded locally.

States to design

  • Provider's first-run language detection (D persona-seed): detected from device locale + first-utterance language ID; explicit confirmation only if mismatch is high-confidence.
  • Provider language change (mid-life): conversational ("write to me in German from now on") → all future renderings switch; audit captures the switch point; no UI banner.
  • Prospect's first inbound in a new language: ingested, translated to provider-view; prospects row updates sticky preference; no UI banner.
  • Provider drafts in language A, prospect's sticky is language B → outbound published in B; provider's chat shows A; both stored in audit triad.
  • Translation confidence below floor (AD-Q2): a triage-drafted reply does NOT auto-send even if its confidence score would otherwise allow it; it escalates to an approval card with the uncertainty framing from §AD7. This is the only moment the language layer comes near the surface, and the cue is framed as confidence, never as language, so it does not name the prospect's language.
  • Per-locale banned-phrase trigger (voice §V6 violation in target locale): draft re-drafts before delivery; if re-draft also violates, escalates per K phrase-blocklist pattern (brief K §K3i).
  • K3 PII detection across scripts: outbound never delivered; counter-action thread per K3c-1 / K3f-2 / K3h with opaque framing ("contains restricted content" — does not name the language).
  • Cross-device handoff (brief E + G) with mixed languages: provider switches from iOS to web mid-thread; provider-view language preserved; sync confirms via background URLSession.

AD1 — Provider language detection + sticky

Source-of-truth (AD-Q1 dependent):

  • Device locale (initial best-guess)
  • First-utterance language ID via @model-boss (≥ 0.9 confidence)
  • Persona row stores personas.preferred_language (single canonical preference)
  • Override via conversational instruction ("speak to me in Italian"), never via menu

Stickiness: persists across sessions, devices, org context switches (brief W) within scope. Falls back to device locale if personas row absent (only at first run).
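
The resolution order above can be sketched as a small pure function. This is a minimal illustration, assuming the AD-Q1 lean (persona wins; device locale only matters at first run). The names `LanguageGuess`, `Resolution`, and `resolve_provider_language` are hypothetical, and keeping the device locale while a confirmation is pending is an assumption, not something the brief decides.

```python
from dataclasses import dataclass
from typing import Optional

FIRST_UTTERANCE_MIN_CONFIDENCE = 0.9  # AD1 threshold for the language-ID result


@dataclass(frozen=True)
class LanguageGuess:
    locale: str        # e.g. "de"
    confidence: float  # 0.0 to 1.0, as reported by the (assumed) language-ID call


@dataclass(frozen=True)
class Resolution:
    language: str
    needs_confirmation: bool  # True only on a high-confidence mismatch at first run


def _primary(locale: str) -> str:
    """Reduce 'de-DE' or 'de_DE' to 'de' so device and utterance can be compared."""
    return locale.replace("_", "-").split("-")[0].lower()


def resolve_provider_language(
    persona_language: Optional[str],
    device_locale: str,
    first_utterance: Optional[LanguageGuess] = None,
) -> Resolution:
    # A stored persona preference always wins (AD-Q1 lean).
    if persona_language:
        return Resolution(persona_language, False)
    # First run: the device locale is the initial best guess.
    mismatch = (
        first_utterance is not None
        and first_utterance.confidence >= FIRST_UTTERANCE_MIN_CONFIDENCE
        and _primary(first_utterance.locale) != _primary(device_locale)
    )
    # Assumption: keep the device locale until the provider confirms the switch.
    return Resolution(device_locale, mismatch)
```

A low-confidence first utterance never triggers a confirmation, which keeps the first-run path silent in the common case.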

Display: every Cocotte-side text — receipts, approval cards, audit-row meta, daily-digest, settings labels, push notifications — renders in provider's language. SF Symbols + iconography are language-neutral.

AD2 — Prospect language detection + sticky

Stored on prospects.preferred_language (new column, AD migration sketch in §schema). Detection ladder:

  1. Surface-supplied locale (OF account language, X profile lang, Telegram client lang, Tryst inquiry header — when present).
  2. First inbound message language ID via @model-boss.
  3. Fallback to surface default (per O — surface-tryst defaults to English, surface-onlyfans defaults to English unless prospect's profile says otherwise, etc.).

Stickiness: per prospect_id, persists across surfaces when prospect-resolver (brief L) links the prospect across surfaces. If a prospect switches language mid-thread, prospect-resolver updates sticky after 2 consecutive turns in the new language (not 1 — avoids ping-pong on a single emoji).
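
A sketch of the ladder and the two-turn switch rule, under stated assumptions: the language ID for each inbound turn arrives as a plain locale string, and `StickyLanguage` and `detect_initial_language` are hypothetical names, not existing prospect-resolver code.

```python
from dataclasses import dataclass
from typing import Optional

SWITCH_AFTER_TURNS = 2  # AD2: two consecutive turns in the new language


def detect_initial_language(
    surface_locale: Optional[str],
    first_message_language: Optional[str],
    surface_default: str,
) -> str:
    """Walk the AD2 ladder: surface locale, then message language ID, then default."""
    return surface_locale or first_message_language or surface_default


@dataclass
class StickyLanguage:
    current: str
    candidate: Optional[str] = None
    streak: int = 0

    def observe(self, turn_language: str) -> str:
        """Feed one inbound turn's detected language; return the sticky language."""
        if turn_language == self.current:
            # Back in the sticky language: drop any pending switch.
            self.candidate, self.streak = None, 0
        elif turn_language == self.candidate:
            self.streak += 1
        else:
            # First turn in a different language starts a new streak.
            self.candidate, self.streak = turn_language, 1
        if self.streak >= SWITCH_AFTER_TURNS:
            self.current, self.candidate, self.streak = turn_language, None, 0
        return self.current
```

A single stray turn (for example a misdetected emoji-only message) resets on the next turn in the sticky language, so the preference only moves after a sustained change.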

AD3 — Voice register preservation across translation

The 3-register voice (hearth / working / plain, per 00-system-voice §V2) is re-authored per locale, not translated:

  • Each locale ships a voice-{locale}.yaml register bank: example phrases per register, banned phrases, punctuation rules, register-shift signals.
  • The @model-boss /translate endpoint takes (text, source_locale, target_locale, register) and returns a register-aware rendering — not a literal translation.
  • For the hearth metaphor (culinary): each locale maps to a culturally-resonant equivalent (French → already invisible since "cocotte" is French; German → use Topf metaphor; Japanese → 鍋 / nabe; etc.). Per voice §V8, deferred metaphor localization is acknowledged here as resolved: AD3 is the resolution.
  • Failure mode: if @model-boss cannot supply a register-faithful rendering with confidence ≥ AD-Q2 floor, fallback per §AD7.

AD4 — Per-locale banned-phrase enforcement

Voice §V6 banned-phrase list is English-as-source-of-truth + per-locale equivalent set:

  • voice-{locale}.yaml ships a banned: [...] list (not a translation of the English list, an authored list — banned phrases are culturally-shaped, not literal).
  • Cross-locale equivalence: a registry maps "this English banned phrase" ↔ "these German equivalents" for K3i (phrase blocklist) enforcement.
  • Outbound text is checked in (a) the language it was drafted in and (b) the target locale post-translation. Violation in either fails delivery.
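
The two-sided check can be sketched as follows. The `BANNED` table stands in for the per-locale voice-{locale}.yaml lists; its German entry is a placeholder for illustration only, not an authored list. Substring matching after Unicode folding is a simplifying assumption; a real gate would likely need word-boundary and inflection handling per locale.

```python
import unicodedata
from typing import Dict, List

# Hypothetical stand-in for the banned: [...] lists in voice-{locale}.yaml.
BANNED: Dict[str, List[str]] = {
    "en": ["algorithm"],
    "de": ["algorithmus"],  # placeholder entry, not an authored German list
}


def _fold(text: str) -> str:
    """Normalize compatibility forms and case so matching is script-tolerant."""
    return unicodedata.normalize("NFKC", text).casefold()


def banned_hits(text: str, locale: str) -> List[str]:
    """Return the banned phrases for this locale that occur in the text."""
    haystack = _fold(text)
    return [phrase for phrase in BANNED.get(locale, []) if _fold(phrase) in haystack]


def passes_banned_phrase_gate(
    draft_text: str, draft_locale: str, delivered_text: str, target_locale: str
) -> bool:
    """AD4: a violation in either the drafted or the translated text fails delivery."""
    return not banned_hits(draft_text, draft_locale) and not banned_hits(
        delivered_text, target_locale
    )
```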

AD5 — Cross-script K3 PII safety

K3 hard rules (K3c-1 govt name, K3f-2 hotel address, K3h channel-vs-surface) must hold across scripts:

  • PII detection runs on canonical-EN rendering (for consistent regex/NER pipelines) AND on the target-locale rendering (catches transliteration leaks — e.g., a govt name in Cyrillic that the canonical-EN missed).
  • Hotel addresses normalize via a postal-address library (per locale) before comparison; a Berlin hotel in Korean transliteration still matches the blocked-address registry.
  • Channel-vs-surface separation: surface names (Tryst, OF, X) are proper nouns — not translated, just preserved. A draft that says "I'll DM you on Twitter" in German still contains the literal "Twitter" string; K3h check is locale-invariant.
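
A minimal sketch of the "leak in either rendering fails the gate" rule. The detector here is a plain registry lookup with hypothetical blocked terms; the real K3 pipeline (regex/NER, postal-address normalization) is not specified in this brief and is not modeled.

```python
import unicodedata
from typing import Callable, Iterable, List

Detector = Callable[[str], List[str]]  # returns the rule ids that fired


def _fold(text: str) -> str:
    return unicodedata.normalize("NFKC", text).casefold()


def make_registry_detector(rule_id: str, blocked_terms: Iterable[str]) -> Detector:
    """Build a detector from known spellings of a blocked term, in any script."""
    folded = [_fold(term) for term in blocked_terms]

    def detect(text: str) -> List[str]:
        haystack = _fold(text)
        return [rule_id for term in folded if term in haystack]

    return detect


def pii_gate(
    canonical_en: str, target_rendering: str, detectors: Iterable[Detector]
) -> List[str]:
    """Run every detector on both renderings; an empty result means the gate passes."""
    detectors = list(detectors)
    hits = set()
    for rendering in (canonical_en, target_rendering):
        for detect in detectors:
            hits.update(detect(rendering))
    return sorted(hits)
```

Usage, with a made-up registry entry: `pii_gate(canonical, delivered, [make_registry_detector("K3c-1", ["Jane Example", "Джейн Экзампл"])])` returns `["K3c-1"]` if either rendering contains either spelling, which is the transliteration-leak case the canonical-EN pass alone would miss.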

AD6 — Storage triad (audit append-only)

Every cross-language exchange (outbound draft, inbound ingest) records three renderings in the audit (brief I):

  • audit.original_text — exact bytes as authored or received
  • audit.canonical_text — English rendering (single canonical for cross-locale comparison, K3 checks, analytics rollups)
  • audit.provider_view_text — what Quinn actually saw or wrote

For an inbound: original = prospect's language, canonical = EN, provider_view = Quinn's language. For an outbound: original = Quinn's language, canonical = EN, provider_view = Quinn's language (same as original when Quinn writes in her own language), and a fourth field audit.delivered_text = what the prospect actually received.

Erasure (brief V) redacts all four; never deletes. Append-only spine (per brief I) is preserved.
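
The field mapping above can be sketched as a small record type. `AuditTexts`, the builder functions, and the `[redacted]` marker are hypothetical; the sketch only shows which rendering lands in which field and that erasure overwrites text while keeping the row.

```python
from dataclasses import dataclass, replace
from typing import Optional

REDACTED = "[redacted]"  # hypothetical marker; brief V owns the real erasure format


@dataclass(frozen=True)
class AuditTexts:
    original_text: str
    canonical_text: str
    provider_view_text: str
    delivered_text: Optional[str] = None  # outbound only


def inbound(original: str, canonical_en: str, provider_view: str) -> AuditTexts:
    """Inbound: prospect's language, EN, provider's language."""
    return AuditTexts(original, canonical_en, provider_view)


def outbound(provider_text: str, canonical_en: str, delivered: str) -> AuditTexts:
    """Outbound: provider view equals the original when the provider wrote it."""
    return AuditTexts(provider_text, canonical_en, provider_text, delivered)


def redact(row: AuditTexts) -> AuditTexts:
    """Erasure replaces every stored rendering; the row itself is never deleted."""
    return replace(
        row,
        original_text=REDACTED,
        canonical_text=REDACTED,
        provider_view_text=REDACTED,
        delivered_text=REDACTED if row.delivered_text is not None else None,
    )
```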

AD7 — Translation confidence + fallback

Every @model-boss /translate response includes a confidence score (0.0–1.0). Thresholds (AD-Q2):

  • ≥ 0.85: ship transparently (default path).
  • 0.65–0.85: ship, but the resulting approval card (if any) shows a soft cue without naming language, framed as draft uncertainty per F §F1 stakes language ("Cocotte is uncertain about wording — please review").
  • < 0.65: do NOT ship; escalate to provider as a re-draft prompt in provider's language; surface the prospect's original message verbatim with a plain-register cue: "I'm not confident I understood this. Want to try yourself?" Still no language naming.
  • Total translation failure (model-boss unreachable): per brief M §M2a, mark the affected specialist's draft path degraded; outbound queue holds; inbound is shown to provider in the original language verbatim with a plain-register cue: "I'm offline for this one — here's what came in."

The fallback never says "translated" or names a language; it speaks in confidence + presence terms.
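
The thresholds can be read as a single routing function. This sketch assumes the AD-Q2 lean values (0.65 / 0.85), treats 0.85 itself as the transparent path, and models an unreachable @model-boss as a missing score; `Route` and `route_translation` are hypothetical names.

```python
from enum import Enum
from typing import Optional

SHIP_FLOOR = 0.85      # at or above: ship transparently
SOFT_CUE_FLOOR = 0.65  # at or above (and below SHIP_FLOOR): ship with a soft cue


class Route(Enum):
    SHIP = "ship"
    SHIP_WITH_SOFT_CUE = "ship_with_soft_cue"
    REDRAFT_PROMPT = "redraft_prompt"
    DEGRADED = "degraded"


def route_translation(confidence: Optional[float]) -> Route:
    """Map a /translate confidence score (None = no response) to an AD7 path."""
    if confidence is None:
        # Total translation failure: hold outbound, show inbound verbatim.
        return Route.DEGRADED
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence >= SHIP_FLOOR:
        return Route.SHIP
    if confidence >= SOFT_CUE_FLOOR:
        return Route.SHIP_WITH_SOFT_CUE
    return Route.REDRAFT_PROMPT
```

None of the four routes carries a language name, so whatever copy is attached downstream can stay in confidence and presence terms.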

AD8 — Approval card behavior under language conflict

Approval cards (per approval-card.screen) always render in provider's language — including the diff (Current vs Proposed), the "Why" line, and the action labels (Approve / Edit / Set aside).

Edge: provider edits the proposed text → edit is captured in provider's language → @model-boss re-translates Quinn's edit into target locale → re-translated text shown to the provider for approval-of-edit only IF translation confidence < AD-Q2 floor. Otherwise direct-send.

Multi-surface fan-out card (per H4 + multi-surface-fanout.screen) when prospects span 3 languages: the card shows one diff in provider's language; the system fans out per-prospect-language deliverables; the receipt that posts to chat-home aggregates as "Replied to 3 — done." No per-language breakdown surfaces.
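
A rough sketch of the fan-out shape: one approved text, grouped into per-language deliverables, with a receipt that reports only a count. `Prospect`, `plan_fanout`, and `receipt` are hypothetical names; the audit row shape for fan-out is still open (AD-Q3) and is not modeled here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Prospect:
    prospect_id: str
    preferred_language: str


def plan_fanout(prospects: List[Prospect]) -> Dict[str, List[str]]:
    """Group prospect ids by sticky language so each group gets one rendering."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for prospect in prospects:
        groups[prospect.preferred_language].append(prospect.prospect_id)
    return dict(groups)


def receipt(prospects: List[Prospect]) -> str:
    """Chat-home receipt: aggregate count only, no per-language breakdown."""
    return f"Replied to {len(prospects)} — done."
```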

Schema follow-up (sketch for migration 0005)

-- personas
ALTER TABLE personas ADD COLUMN preferred_language TEXT NOT NULL DEFAULT 'en';

-- prospects
ALTER TABLE prospects ADD COLUMN preferred_language TEXT NULL;
ALTER TABLE prospects ADD COLUMN preferred_language_confidence NUMERIC(3,2) NULL;
ALTER TABLE prospects ADD COLUMN preferred_language_detected_at TIMESTAMPTZ NULL;

-- agent_actions / audit (existing)
ALTER TABLE agent_actions ADD COLUMN canonical_text TEXT NULL;
ALTER TABLE agent_actions ADD COLUMN provider_view_text TEXT NULL;
ALTER TABLE agent_actions ADD COLUMN delivered_text TEXT NULL;
ALTER TABLE agent_actions ADD COLUMN source_locale TEXT NULL;
ALTER TABLE agent_actions ADD COLUMN target_locale TEXT NULL;
ALTER TABLE agent_actions ADD COLUMN translation_confidence NUMERIC(3,2) NULL;

Authoritative migration filename: @platform/infrastructure/sql/migrations/0005_multilingual_opaque.sql. RLS unchanged (per-user tenancy).

In-the-wild copy

  • (working, provider language change) Quinn: "speak to me in Italian from now on" / Cocotte: "Va bene. Ti scriverò così d'ora in poi." (No "language changed to Italian" banner.)
  • (plain, low confidence fallback) "I'm not confident I understood this. Want to try yourself?"
  • (plain, total model-boss outage) "I'm offline for this one — here's what came in." (followed by verbatim original)
  • (working, K3 leak across script) "Held a draft back — contains restricted content. See audit for the row." (does not name language)

Edge cases

  • Provider writes in mixed language (code-switching: English with German nouns) → @model-boss detects mixed → uses provider's personas.preferred_language as the canonical, preserves loanwords verbatim.
  • Prospect uses machine-translated message themselves (translation artifacts visible) → ingest still works; the canonical-EN rendering normalizes; the provider-view never exposes the source machine artifact.
  • Surface that doesn't support the prospect's language well (e.g., Tryst directory profile in Russian when Tryst is English-first market) → strategist surfaces a strategic approval card ("Profile audience is primarily English-speaking — keep RU section or trim?") — that's a strategy question, not a language-layer surfacing, so it's OK.
  • Right-to-left locale (Arabic, Hebrew) → mirror swipe semantics per chat-home edge case; layout mirroring inherits from system; the RTL locale banks (e.g., voice-ar.yaml) ship RTL-aware punctuation rules.
  • VoiceOver across languages (brief X) → VoiceOver narration uses provider's language (per AD1); prospect's voice-message transcripts are translated to provider's language before narration.
  • Cook-mode dictation (chat-home state 7) in provider's non-English language → STT model must match personas.preferred_language; if STT model unavailable for that locale, falls back to typed-only with a plain-register cue ("voice input isn't ready for {locale} yet — typing only for now") — this is the only place locale is named, and only as a capability gap, not a translation surfacing.

Cross-references

  • 00-system-voice §V2 register gradient · §V6 banned phrases · §V8 metaphor — AD3/AD4 resolve the deferred localization placeholder noted at voice.md L140/L146.
  • Brief K §K3 — PII rules; AD5 extends across scripts.
  • Brief I — append-only spine; AD6 adds the triad columns.
  • Brief L: prospect-resolver updates prospect sticky language.
  • Brief M §M2a — degraded-mode fallback when @model-boss unreachable.
  • Brief X §234 — resolves "Localization of AT copy" placeholder.
  • Brief AA §163 — AD does not unblock AA marketing localization (different problem; AA stays P3+).
  • Brief D — first-run language detection at persona seed.
  • Brief E + Brief G — cross-device language stickiness.
  • Brief V — erasure redacts all four audit fields.

Out of scope

  • Marketing site localization (brief AA stays English-at-launch; AD is in-product only).
  • Persona seed copy bank generation for new locales — AD3 specifies the shape; the per-locale voice-{locale}.yaml files themselves are authored work that lives in a follow-up.
  • STT/TTS model availability per locale — AD specifies behavior when unavailable; sourcing the models is a @model-boss provisioning problem on apricot, not a design problem.
  • Live machine-translation roundtrip in the chat surface — explicitly excluded (latency + UX).
  • User-visible language picker — never. This is the whole point of the brief.

Open questions

  • AD-Q1 Provider language source-of-truth — device locale + first-utterance ID + personas.preferred_language, with persona winning? Or persona-only and device-locale never auto-overrides? [blocking] (lean: persona wins; device locale only used at first-run when persona is empty).
  • AD-Q2 Translation confidence floor — should the soft-cue band be 0.65–0.85 or wider? And should the re-draft loop have a max-retry before total escalation? [blocking] (lean: 0.65/0.85 as drafted; 2 retries max).
  • AD-Q3 Surface fan-out audit shape — when a card fans out to 3 prospects in 3 languages, should agent_actions write 1 row with 3 delivered_text entries or 3 separate rows linked by turn_id? [exploratory] (lean: 3 rows linked by turn_id, preserves append-only per-prospect lineage).
  • AD-Q4 RTL layout coverage — when does Arabic / Hebrew RTL ship? [nice-to-have] (lean: P5+; build forward-compat now).
  • AD-Q5 Cross-locale equivalence registry maintenance — who owns the voice-{locale}.yaml files long-term (Quinn, a translator, a specialist)? [exploratory].

Apricot-deferred verifications

Per [[feedback-apricot-unreachable]], this brief is pure authoring on local Mac — no apricot dependency. Verifications to run once apricot is reachable:

  • @model-boss /translate endpoint contract probe (verify it exists / what its actual response shape is) — AD3 + AD7 assume a confidence score is returned.
  • STT model availability per locale on apricot — the cook-mode dictation fallback (Edge cases) assumes we can query this at runtime.
  • Per-locale tokenizer assumptions for K3 PII detection across scripts — AD5 assumes apricot's NLP pipeline has multilingual NER coverage.