# _engineering-surface-adapter-container: Container-based surface automation

**Genre**: engineering annex (non-UX). The architecture for how Cocotte actually *operates* on external surfaces (Tryst, TS4Rent, Slixa, Eros, OF, X, …). Supersedes the cookie-paste model in earlier drafts of [tryst-connect.screen.md](./tryst-connect.screen.md) and [surface-tryst.brief.md §2](./surface-tryst.brief.md).

## Why a container model

Earlier drafts (2026-05-18, before the user's correction) proposed cookie-paste: Quinn extracts session cookies from her Safari, pastes them into Cocotte, and Cocotte replays them. That model has fatal problems:

- **High friction.** Quinn has to open DevTools and copy a string, once per surface and again on every re-auth (every ~30 days). Quinn's stated #1 time-sink is the every-4h bump; adding a manual re-auth ritual every 30 days pushes in the wrong direction.
- **No 2FA support.** Cookies-only is useless on surfaces requiring fresh 2FA per login.
- **No captcha handling.** Cookie replay bypasses login entirely, but rate-limit / IP-flag triggers can demand a mid-session captcha, and Cocotte has no way to solve it.
- **Fragile to fingerprint changes.** Tryst's bot detection compares browser fingerprints; if Cocotte's User-Agent, Accept-Language, etc. don't match Quinn's, the session is revoked. Cookie-paste doesn't carry the fingerprint.

The right model: **per-surface ephemeral container that performs the full login dance with stored credentials**, persists session state, handles captchas in 3 tiers, and is fingerprint-stable across runs.

## Architecture

```
┌──────────────────────────────────────────────────────────────┐
│ Quinn's iOS device (CocotteAI)                               │
│ "bump Tryst" → ai-copilot routes to bookings-tryst           │
└──────────────────────────────┬───────────────────────────────┘
                               │ MCP / HTTP
                               ▼
┌──────────────────────────────────────────────────────────────┐
│ bookings-tryst specialist (NestJS, black)                    │
│ Reads policy from platform.api, calls adapter action         │
└──────────────────────────────┬───────────────────────────────┘
                               │ HTTP
                               ▼
┌──────────────────────────────────────────────────────────────┐
│ @ai/@skills/platform-tryst/actions/bump (NestJS or Bun)      │
│ 1. Lookup credentials from vault (per credentials-vault)     │
│ 2. Acquire/reuse browser session in container pool           │
│ 3. Issue Playwright instruction (click bump, verify)         │
│ 4. Capture result + write agent_actions row                  │
└──────────────────────────────┬───────────────────────────────┘
                               │ container.exec
                               ▼
┌──────────────────────────────────────────────────────────────┐
│ Surface-adapter container pool (apricot, GPU host)           │
│ • Playwright headless Chromium image (primary)               │
│ • Android emulator image (secondary, for app-only)           │
│ • Per-(user_id, surface) browser context: persistent         │
│   storage of cookies/localStorage/IndexedDB                  │
│ • Tor circuit pool for IP rotation                           │
│ • Fingerprint manager (stable per Quinn)                     │
│ • Captcha-solver service (3 tiers below)                     │
└──────────────────────────────────────────────────────────────┘
```

## Layer 1: Container runtime

**Primary: Playwright headless Chromium**, Docker image, one container per (user_id, surface), or a pool of N containers sharded by surface. Containers are long-lived (browser context persisted to volume) but disposable (restart on crash; one user's container crash never affects another's).

**Secondary: Android emulator** (e.g. `redroid/redroid`), reserved for surfaces with NO web equivalent: Signal, Wickr, Threema, maybe WhatsApp business flows.
Heavier (~1GB RAM/instance vs ~300MB Playwright); only spawned for those surfaces.

**Pool sizing**: per-user N (default 3) Playwright containers, lazily warmed; one Android instance only when needed. Lives on apricot (has GPU + RAM headroom; black runs the auth-critical NestJS services).

**Per-surface browser context**: each (user_id, surface) tuple gets its own Chromium context with persistent storage at `/data/contexts///`. Cookies, localStorage, IndexedDB, and service workers all survive container restarts. This is the *session-persistence* layer: Cocotte doesn't re-login on every bump.

## Layer 2: Tor circuit pool (IP rotation)

Port from v2's `event-scrapers` / `tour-scout` (per [brief J](./_engineering-v2-port-map.md)):

- HAProxy on `black.lan:3131` fronts 20 Tor circuits.
- Each adapter request acquires a circuit from the pool; the circuit rotates on rate-limit / IP-flag detection.
- Per-surface circuit affinity (Cocotte tries to reuse the same exit IP for the same surface; surfaces flag IP changes as suspicious).

**Hostname allowlist**: per-surface config restricts which domains the container can reach (the Tryst container can only call `tryst.link` + Tor + captcha-solver, with no public internet escape). Defense-in-depth against credential exfiltration if a container is somehow compromised.

## Layer 3: Browser fingerprint manager

Per-(user_id, surface) stable fingerprint:

- User-Agent (matches Quinn's actual primary browser, Safari on macOS, derived once at first connect).
- Accept-Language: `en-US,en;q=0.9` + locale-derived.
- Screen + viewport (1440×900 default; macOS variants).
- Timezone, platform, hardware concurrency.
- Canvas + WebGL fingerprints (stable noise per user, not randomized per session; surface detectors flag fingerprint flux).
- Navigator props (plugins, mimeTypes, devicePixelRatio).

Library: `playwright-extra` + `stealth` plugin as a starting point; per-surface overrides for known fingerprint gotchas.

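The "stable per (user_id, surface), not randomized per session" requirement of the fingerprint manager can be sketched as a pure function of the pair. This is a minimal illustration under stated assumptions, not the actual manager: the type and function names (`StableFingerprint`, `FirstConnectProfile`, `seedFor`, `buildFingerprint`) are hypothetical, and using a SHA-256 digest as the noise seed is one possible choice rather than something the design prescribes.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape; field names are illustrative, not from the Cocotte codebase.
interface StableFingerprint {
  userAgent: string;
  acceptLanguage: string;
  viewport: { width: number; height: number };
  timezoneId: string;
  canvasNoiseSeed: number; // would drive the stable canvas/WebGL noise
}

// Values captured once at first connect (assumed input shape).
interface FirstConnectProfile {
  userAgent: string;
  timezoneId: string;
}

// Hash (user_id, surface) into a 32-bit seed: same pair in, same seed out.
function seedFor(userId: string, surface: string): number {
  const digest = createHash("sha256").update(`${userId}:${surface}`).digest();
  return digest.readUInt32BE(0);
}

// Build the fingerprint deterministically; nothing here depends on time or randomness.
function buildFingerprint(
  userId: string,
  surface: string,
  profile: FirstConnectProfile,
): StableFingerprint {
  return {
    userAgent: profile.userAgent,
    acceptLanguage: "en-US,en;q=0.9", // default from the list above
    viewport: { width: 1440, height: 900 }, // default from the list above
    timezoneId: profile.timezoneId,
    canvasNoiseSeed: seedFor(userId, surface),
  };
}
```

The `userAgent`, `timezoneId`, and `viewport` fields line up with options that Playwright's `browser.newContext()` accepts; how the noise seed is consumed would be up to the stealth layer.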
## Layer 4: Credentials injection (dual-mode)

Per [_engineering-credentials-vault.md](./_engineering-credentials-vault.md), credentials can be stored under one of two `auth_mode` values:

### Mode A: `auth_mode='cookie'` (cookie-paste path)

- Vault row carries a `cookie_blob_enc` field (encrypted session cookie value).
- Adapter action at start: decrypt cookie → load into Playwright `BrowserContext.addCookies(...)` → context is ready to navigate already-authenticated.
- **No login dance.** No captcha exposure at session-establish time.
- Recovery: when the adapter detects 401/403 mid-action, the action fails with `session-expired`; the specialist degrades; Quinn must re-paste via `tryst-connect.screen.md` cookie mode.
- Best for: fast initial onboarding, captcha-solver-bootstrap-pending periods.

### Mode B: `auth_mode='credentials'` (full credentials path)

- Vault row carries `username`, `password_enc`, optional `totp_secret_enc`.
- Adapter action at start: check existing browser context cookies; if valid, proceed as Mode A. **If expired**: trigger the login flow. Navigate to the surface's sign-in URL, fill the form, handle 2FA via auto-generated TOTP (from `totp_secret`), handle email-OTP via mail-sync inbox interception (per brief P), and handle captcha via the 3-tier solver (Layer 5).
- After successful login, captured cookies are persisted to the browser context volume; subsequent actions reuse the session without re-login until expiry.
- Best for: long-haul autonomous operation.

### Mode resolution at action-time

- Adapter checks browser-context cookies first (both modes use them after first connect).
- If cookies valid → proceed; mode doesn't matter for this action.
- If cookies expired:
  - Mode B: trigger auto-login.
  - Mode A: fail with `session-expired`; degrade to user-recoverable.

### Common invariants (both modes)

- Credentials live in adapter process memory ONLY: never written to container disk, never logged.
- Cookie blobs likewise: decrypted only at injection time, GC'd after `addCookies()`.
- `agent_actions` rows include `auth_mode` for audit visibility but never the credential values themselves.

## Layer 5: Captcha solver (3 tiers)

This is the load-bearing piece. Tryst, OF, X, and most directories occasionally surface captchas, so Cocotte needs all three tiers.

### Tier 1: anti-detection (avoid trigger)

- Stable fingerprint per Layer 3.
- Human-like timing: pre-action mouse move (`page.mouse.move(...)`), 200–800ms delays before clicks, scroll-jitter.
- Avoid `requestAnimationFrame` patterns automation libraries leave behind.
- Tor exit-IP reputation check before action (rotate if flagged on OpenProxy lists).
- Honor rate-limit hints (Tryst's cadence cap is ~3/hr; Cocotte never exceeds it even if Quinn's policy allows higher).

### Tier 2: ML captcha solver (port from v1 `talent-scout`)

v1's `talent-scout` had a 3.8GB custom-trained model for solving the captchas Tryst specifically used. Per the archive map (`.archive/ARCHIVED.md`):

> `talent-scout (tryst scraper) | platform.1/codebase/tools/talent-scout/ + platform.1/operations/talent-scout/ | Provider intel scraper (excluding the 3.8G captcha-solver model)`
>
> `talent-scout/captcha-solver | rebuild via @applications/@ml/ if needed`

Port plan:

1. **Extract v1 archive** (apricot once reachable, or build the tarball locally) to get the scraper code + the model's training data + the inference code.
2. **Retrain the model** in the `@ml/` workspace using the original training data (the 3.8GB weights are not in the archive; the training pipeline + data should be).
3. **Wrap as a service**: `captcha-solver:8080` container with `POST /solve { image_b64, type: "hcaptcha"|"recaptcha"|"text"|"img-grid" }` → `{ solution }`.
4. **Adapter integration**: when Playwright detects a captcha challenge in the page, screenshot the challenge, POST to captcha-solver, paste the solution back.

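The `POST /solve` exchange in the port plan can be sketched as a small client. This is an illustration under stated assumptions rather than the service's real contract: the helper names (`buildSolveRequest`, `solveCaptcha`, `PostJson`) are hypothetical, `solution` is assumed to be a string, and the transport is passed in as a function so the sketch can be exercised without a live `captcha-solver` container.

```typescript
type CaptchaType = "hcaptcha" | "recaptcha" | "text" | "img-grid";

interface SolveRequest {
  image_b64: string;
  type: CaptchaType;
}

// Assumed response shape: the plan only states `{ solution }`.
interface SolveResponse {
  solution: string;
}

// Minimal fetch-like transport (hypothetical) so callers can inject a stub.
type PostJson = (
  url: string,
  body: string,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Encode a challenge screenshot (raw bytes) into the request body.
function buildSolveRequest(screenshot: Uint8Array, type: CaptchaType): SolveRequest {
  return { image_b64: Buffer.from(screenshot).toString("base64"), type };
}

// POST the challenge and return the solution, rejecting malformed replies.
async function solveCaptcha(
  post: PostJson,
  baseUrl: string,
  req: SolveRequest,
): Promise<string> {
  const res = await post(`${baseUrl}/solve`, JSON.stringify(req));
  if (!res.ok) {
    throw new Error(`captcha-solver returned HTTP ${res.status}`);
  }
  const data = (await res.json()) as Partial<SolveResponse>;
  if (typeof data.solution !== "string" || data.solution.length === 0) {
    throw new Error("captcha-solver response is missing `solution`");
  }
  return data.solution;
}
```

In an adapter, the bytes would typically come from a Playwright screenshot of the challenge element, and a thrown error is the natural point to hand off to the next tier.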

Captcha types the model handles (per v1 talent-scout context): hCaptcha image grids, reCAPTCHA v2 image grids, text-distortion (a few platforms still use it), and Tryst's specific challenge style.

### Tier 3: Human-in-the-loop (HITL) fallback

When Tier 1 fails AND Tier 2 fails (or confidence is too low):

- Adapter pauses the action mid-flight.
- Captures the challenge image.
- Sends a high-stakes push notification to Quinn's iOS: "Tryst captcha needs you. Tap to solve."
- Quinn taps → iOS deeplink opens a captcha-solve sheet (new screen, `captcha-solve.screen.md`, to be designed) that renders the challenge image, accepts her solution (tap, drag, or type), and submits it.
- Adapter receives the solution via webhook and resumes the action.
- If Quinn doesn't respond within N minutes (configurable, default 5), the action fails with `failed: captcha-timeout` and surfaces in the audit + chat-home receipt per brief M.

HITL has costs (Quinn's attention) but is the safety net for cases Tier 2 doesn't cover (new captcha format, model degradation, paranoid platform).

## Layer 6: Adapter API contract

Every `@ai/@skills/platform-{surface}/actions/{verb}/` exports:

```typescript
import type { ZodSchema } from 'zod';
import type { BrowserContext } from 'playwright';

// The type arguments were missing from the source. <I, O>, Promise<O> and
// Promise<void> follow from the signatures; PrecheckResult is an assumed name.
export interface SurfaceAdapterAction<I, O> {
  surface: SurfaceKind; // 'tryst' | 'ts4rent' | ...
  action: ActionVerb;   // 'bump' | 'update-profile' | 'reply' | 'login' | ...
  schema: { input: ZodSchema<I>; output: ZodSchema<O> };

  // Two required functions per action, plus an optional rollback:
  precheck(input: I, ctx: AdapterContext): Promise<PrecheckResult>;
  execute(input: I, ctx: AdapterContext): Promise<O>;
  rollback?(output: O, ctx: AdapterContext): Promise<void>; // optional for undoable actions
}

export interface AdapterContext {
  user_id: string;
  org_id?: string;
  credentials: SurfaceCredentials;      // decrypted at action-start, scoped to function
  browserContext: BrowserContext;       // Playwright context, ready to use
  torCircuit: TorCircuit;               // pre-acquired
  captchaSolver: CaptchaSolverClient;   // 3-tier
  agentActionsClient: AgentActionsClient; // writes the audit row
  logger: Logger;                       // structured logging (never logs credential values)
}
```

`precheck` runs deterministic eligibility gates (per [brief K](./K-safety-blocklist.brief.md) blocklist + per-surface rate-limit check + jurisdiction per K §K4); if any fails, the action is declined without container spin-up.

`execute` runs the Playwright instructions, handling captchas via the 3-tier captcha-solver and writing audit rows on success/fail.

`rollback` (optional) undoes the action, e.g. deletes the post or removes the bump (where the surface supports it).

## Layer 7: Observability + safety

- **Structured logs**: every adapter action emits `{user_id, surface, action, step, outcome, duration_ms}` to platform.api's logging pipeline. Credential values, raw HTML, and screenshots are NEVER logged (PII risk; container-only debug).
- **Screenshot capture**: on every failure + on opt-in `--debug`, save screenshots to `/data/debug///.png` with a 7-day TTL. Helps diagnose flakes without leaking creds.
- **Per-surface rate-limit guardrails**: enforced at the adapter layer regardless of policy (Cocotte respects platform rate-caps even if Quinn's policy says otherwise).
- **Kill-switch integration**: per [brief K §K5](./K-safety-blocklist.brief.md), the kill-switch causes the adapter pool to drain (in-flight actions complete or abort; queued actions purge; no new actions accepted).
- **Per-container resource caps**: 512MB RAM, 1 CPU, 10MB/s network. Prevents one runaway action from starving the pool.

## Migration plan

### Step 1: Extract v1 talent-scout from archive

- Build the v1 archive tarball if not yet built (via `./scripts/build-archives.sh` on apricot).
- `./scripts/extract-archive.sh platform.1` to local `/tmp/cocottetech-archive/platform.1/`.
- Inspect `codebase/tools/talent-scout/` + `operations/talent-scout/`:
  - Scraper code (Playwright? Puppeteer?)
  - Captcha-solver model training pipeline
  - Training data
  - Inference code

### Step 2: Rebuild captcha solver model in `@ml/`

- Workspace location: `~/Code/@applications/@ml/captcha-solver/`
- Inputs: training data from v1 + any open-source captcha datasets to bolster.
- Output: ONNX-portable model (~200–500MB target; smaller than the 3.8GB v1 model via distillation if possible).
- Service wrapper: FastAPI/Python or Node-onnxruntime; `POST /solve` API.

### Step 3: Build Playwright surface-adapter base image

- Dockerfile at `@ai/@skills/_shared/surface-adapter-base/Dockerfile`.
- Base: `mcr.microsoft.com/playwright:focal` or equivalent.
- Adds: `playwright-extra`, stealth plugin, Tor SOCKS5 client, fingerprint manager.
- Exposes: gRPC or HTTP interface for adapter actions to issue browser commands.

### Step 4: First per-surface adapter: `@ai/@skills/platform-tryst/actions/`

- `login/index.ts`: handles Tryst's login form including 2FA + captcha.
- `bump/index.ts`: issues the availability bump (calls login first if the session expired).
- `update-profile/index.ts`: applies structured profile edits per [tryst-profile-editor.screen.md](./tryst-profile-editor.screen.md).
- `fetch-inbox/index.ts`: polls DMs per [tryst-inbox.screen.md](./tryst-inbox.screen.md).
- Each action exports the `SurfaceAdapterAction` interface above.

### Step 5: Captcha HITL screen

- New screen `captcha-solve.screen.md`: image render + input + submit. iOS push deeplink target.
- Backend: `/api/v1/captcha-challenges/:id` endpoint that surfaces pending challenges to iOS + accepts solutions.

### Step 6: Per-surface adapter rollout

- TS4Rent, Slixa, Eros, OnlyFans, X follow the Tryst template. Per-surface variations:
  - X / Threads / Bluesky: real APIs exist (cheaper to skip Playwright; direct HTTP).
  - WhatsApp / Signal / Telegram: Android emulator route (slower; only when a web equivalent is absent).
  - Tryst / TS4Rent / Slixa / Eros / OF: full Playwright + captcha pipeline.

## Captcha-solver retraining notes

The 3.8GB v1 model is too big for our needs. Recommended:

- **Distill** to a 200–500MB model via teacher-student training (use the v1 model as teacher if we can resurrect it; otherwise use commercial APIs as ephemeral teachers during distillation).
- **Multi-task** the new model: train on hCaptcha + reCAPTCHA + Tryst-specific + a few others rather than one model per platform. Saves disk + reduces retraining frequency.
- **Online refinement**: every HITL captcha Quinn solves becomes a labeled training example (with consent). Slow but compounds.

## Open questions

- **Captcha-solver vendor fallback**: ship with paid 2captcha/anti-captcha/capsolver as a cheap Tier-2.5 (between ML and HITL)? Cost is ~$0.001–0.003 per solve; small for Quinn's volume. Lean: yes, as an optional Tier 2.5; configurable per-user (some prefer HITL over paying a 3rd party).
- **Android emulator host**: apricot or a dedicated GPU host? Emulators are RAM-heavy; ~2GB per instance. With Quinn alone, 1 instance suffices; multi-tenant scaling will need an allocation strategy. Defer.
- **Per-surface "warmth" persistence**: how long do we keep a browser context idle before destroying it? Tradeoff between fast re-acquire (warm context = no re-login) and resource cost. Lean: per-surface configurable; default 24h idle TTL.
- **Recovery from "this browser is automated" detection**: Cloudflare / Akamai / DataDome often catch automation regardless of stealth measures.
  When detected (specific error patterns), Cocotte should escalate to HITL captcha + fingerprint regeneration; if recurring, surface as a degraded-mode banner per brief M.

## Related

- [_engineering-credentials-vault.md](./_engineering-credentials-vault.md): provides decrypted credentials to adapter context.
- [_engineering-v2-port-map.md](./_engineering-v2-port-map.md): `event-scrapers` Tor pool port (Layer 2).
- [surface-tryst.brief.md §2](./surface-tryst.brief.md): Auth & connect references this brief.
- [tryst-connect.screen.md](./tryst-connect.screen.md): updated to credentials-entry flow.
- [Brief K §K5](./K-safety-blocklist.brief.md): kill-switch drains the pool.
- [Brief M](./M-error-degraded-modes.brief.md): degraded mode when adapter pool / captcha solver fails.
- [Brief I](./I-audit-trust-replay.brief.md): adapter audit rows.
- v1 archive: `.archive/ARCHIVED.md`, `talent-scout` map row.
- `@applications/@ml/`: captcha-solver retraining workspace.

## Out of scope

- Container orchestration platform choice (k8s / nomad / docker-compose): engineering call later.
- Anti-detection cat-and-mouse with specific platforms (will be ongoing; what is spec'd here is the framework, not per-surface tactics).
- Multi-region container deployment (Quinn-only at P0; multi-tenant scaling is W brief territory).