_engineering-surface-adapter-container — Container-based surface automation
Genre: engineering annex (non-UX). The architecture for how Cocotte actually operates on external surfaces (Tryst, TS4Rent, Slixa, Eros, OF, X, …). Supersedes the cookie-paste model in earlier drafts of tryst-connect.screen.md and surface-tryst.brief.md §2.
Why a container model
Earlier drafts (2026-05-18, before the user's correction) proposed cookie-paste: Quinn extracts session cookies from her Safari, pastes into Cocotte, Cocotte replays them. That model has fatal problems:
- High friction. Quinn has to open DevTools and copy a string, per surface and per re-auth (every ~30 days). Quinn's stated #1 time-sink is the 2–3h Tryst bump (tier-dependent per surface-tryst §canonical-facts); adding a manual re-auth ritual every 30 days pushes in the wrong direction.
- No 2FA support. Cookies-only is useless on surfaces requiring fresh 2FA per login.
- No captcha handling. Cookie replay bypasses login entirely, but rate-limit / IP-flag triggers can still demand a mid-session captcha, which Cocotte has no way to solve.
- Fragile to fingerprint changes. Tryst's bot detection compares browser fingerprint; if Cocotte's User-Agent / Accept-Language / etc. don't match Quinn's, the session is revoked. Cookie-paste doesn't carry fingerprint.
The right model: per-surface ephemeral container that performs the full login dance with stored credentials, persists session state, handles captchas in 3 tiers, and is fingerprint-stable across runs.
Architecture
┌──────────────────────────────────────────────────────────────┐
│ Quinn's iOS device (CocotteAI)                               │
│   "bump Tryst" → ai-copilot routes to bookings-tryst         │
└──────────────────────────────┬───────────────────────────────┘
                               │ MCP / HTTP
                               ▼
┌──────────────────────────────────────────────────────────────┐
│ bookings-tryst specialist (NestJS, black)                    │
│   Reads policy from platform.api, calls adapter action       │
└──────────────────────────────┬───────────────────────────────┘
                               │ HTTP
                               ▼
┌──────────────────────────────────────────────────────────────┐
│ @features/bookings-tryst/adapter/bump (NestJS)               │
│   1. Lookup credentials from vault (per credentials-vault)   │
│   2. Acquire/reuse browser session in container pool         │
│   3. Issue Playwright instruction (click bump, verify)       │
│   4. Capture result + write agent_actions row                │
└──────────────────────────────┬───────────────────────────────┘
                               │ container.exec
                               ▼
┌──────────────────────────────────────────────────────────────┐
│ Surface-adapter container pool (apricot, GPU host)           │
│   • Playwright headless Chromium image (primary)             │
│   • Android emulator image (secondary, for app-only)         │
│   • Per-(user_id, surface) browser context: persistent       │
│     storage of cookies/localStorage/IndexedDB                │
│   • Tor circuit pool for IP rotation                         │
│   • Fingerprint manager (stable per Quinn)                   │
│   • Captcha-solver service (3 tiers below)                   │
└──────────────────────────────────────────────────────────────┘
Layer 1 — Container runtime
Primary: Playwright headless Chromium, Docker image, one container per (user_id, surface) — or pool of N containers sharded by surface. Containers are long-lived (browser context persisted to volume) but disposable (restart on crash; one user's container crash never affects another's).
Secondary: Android emulator (e.g. redroid/redroid), reserved for surfaces with NO web equivalent — Signal, Wickr, Threema, maybe WhatsApp business flows. Heavier (~1GB RAM/instance vs ~300MB Playwright); only spawned for those surfaces.
Pool sizing: per-user N (default 3) Playwright containers, lazily warmed; one Android instance only when needed. Lives on apricot (has GPU + RAM headroom; black runs the auth-critical NestJS services).
Per-surface browser context: each (user_id, surface) tuple gets its own Chromium context with persistent storage at /data/contexts/<user_id>/<surface>/. Cookies, localStorage, IndexedDB, service workers all survive container restarts. This is the session-persistence layer — Cocotte doesn't re-login on every bump.
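As a minimal sketch of the path convention above, the helper below maps a (user_id, surface) tuple to its context directory. The function name `contextDir` and the identifier rule (letters, digits, `_`, `-`) are assumptions for illustration, not part of the spec; the root path is the one stated above.

```typescript
// Root of the persistent context volume, as stated in the text above.
const CONTEXT_ROOT = "/data/contexts";

// Assumed identifier rule: letters, digits, '_' and '-' only, so a crafted
// id such as "../other-user" can never escape the root directory.
const SAFE_SEGMENT = /^[A-Za-z0-9_-]+$/;

// Hypothetical helper: returns the storage directory for one browser context.
function contextDir(userId: string, surface: string): string {
  for (const segment of [userId, surface]) {
    if (!SAFE_SEGMENT.test(segment)) {
      throw new Error(`unsafe path segment: ${JSON.stringify(segment)}`);
    }
  }
  return `${CONTEXT_ROOT}/${userId}/${surface}/`;
}
```

Rejecting unsafe segments up front keeps one user's context from ever resolving into another user's directory, which matches the isolation goal of per-tuple contexts.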
Layer 2 — Tor circuit pool (IP rotation)
Port from v2's event-scrapers / tour-scout (per brief J):
- HAProxy on black.lan:3131 fronts 20 Tor circuits.
- Each adapter request acquires a circuit from the pool; circuit rotates on rate-limit / IP-flag detection.
- Per-surface circuit affinity (Cocotte tries to reuse the same exit IP for the same surface; surfaces flag IP changes as suspicious).
Hostname allowlist: per-surface config restricts which domains the container can reach (Tryst container can only call tryst.link + Tor + captcha-solver — no public internet escape). Defense-in-depth against credential exfiltration if a container is somehow compromised.
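The allowlist check can be sketched as a pure function over hostnames. This is an illustration under assumptions: `ALLOWED_HOSTS`, `isHostAllowed`, and the internal service names `captcha-solver` and `tor-proxy` are hypothetical; only tryst.link comes from the text. The caller is assumed to have already extracted the hostname from the request URL.

```typescript
// Hypothetical per-surface allowlist. Only "tryst.link" is from the text;
// the two internal service hostnames are placeholders.
const ALLOWED_HOSTS: Record<string, readonly string[]> = {
  tryst: ["tryst.link", "captcha-solver", "tor-proxy"],
};

// A host is allowed if it equals an entry or is a subdomain of one.
// Unknown surfaces get an empty list, so everything is denied by default.
function isHostAllowed(surface: string, hostname: string): boolean {
  const host = hostname.toLowerCase().replace(/\.$/, ""); // drop trailing dot
  const allowed = ALLOWED_HOSTS[surface] ?? [];
  return allowed.some((entry) => host === entry || host.endsWith(`.${entry}`));
}
```

Matching on `.${entry}` rather than a bare suffix means a lookalike such as `eviltryst.link` is not treated as a subdomain of `tryst.link`.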
Layer 3 — Browser fingerprint manager
Per-(user_id, surface) stable fingerprint:
- User-Agent (matches Quinn's actual primary browser — Safari macOS, derived once at first connect).
- Accept-Language: en-US,en;q=0.9 + locale-derived.
- Screen + viewport (1440×900 default; macOS variants).
- Timezone, platform, hardware concurrency.
- Canvas + WebGL fingerprints (stable noise per user, not randomized per session — surface detectors flag fingerprint flux).
- Navigator props (plugins, mimeTypes, devicePixelRatio).
Library: playwright-extra + stealth plugin as starting point; per-surface overrides for known fingerprint gotchas.
Layer 4 — Credentials injection (dual-mode)
Per _engineering-credentials-vault.md, credentials can be stored under one of two auth_mode values:
Mode A — auth_mode='cookie' (cookie-paste path)
- Vault row carries a cookie_blob_enc field (encrypted session cookie value).
- Adapter action at start: decrypt cookie → load into Playwright BrowserContext.addCookies(...) → context is ready to navigate already-authenticated.
- No login dance. No captcha exposure at session-establish time.
- Recovery: when adapter detects 401/403 mid-action, action fails with session-expired; specialist degrades; Quinn must re-paste via tryst-connect.screen.md cookie mode.
- Best for: fast initial onboarding, captcha-solver-bootstrap-pending periods.
Mode B — auth_mode='credentials' (full credentials path)
- Vault row carries username, password_enc, optional totp_secret_enc.
- Adapter action at start: check existing browser context cookies; if valid, proceed as Mode A. If expired, trigger the login flow: navigate to the surface's sign-in URL, fill the form, handle 2FA via auto-generated TOTP (from totp_secret), handle email-OTP via mail-sync inbox interception (per brief P), handle captcha via the 3-tier solver (Layer 5).
- After successful login, captured cookies are persisted to the browser context volume; subsequent actions reuse the session without re-login until expiry.
- Best for: long-haul autonomous operation.
Mode resolution at action-time
- Adapter checks browser-context cookies first (both modes use them after first connect).
- If cookies valid → proceed; mode doesn't matter for this action.
- If cookies expired:
  - Mode B: trigger auto-login.
  - Mode A: fail with session-expired; degrade to user-recoverable.
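The resolution rules above reduce to a small decision function. The sketch below is illustrative only: the names `AuthMode`, `SessionPlan`, and `resolveSession` are hypothetical, and the cookie-validity check itself is assumed to happen elsewhere and arrive as a boolean.

```typescript
// The two auth_mode values from the credentials vault.
type AuthMode = "cookie" | "credentials";

// Hypothetical result type: what the adapter should do before the action.
type SessionPlan =
  | { kind: "proceed" }
  | { kind: "auto-login" }
  | { kind: "fail"; reason: "session-expired" };

function resolveSession(authMode: AuthMode, cookiesValid: boolean): SessionPlan {
  // Valid cookies: the mode does not matter for this action.
  if (cookiesValid) return { kind: "proceed" };
  // Expired cookies, Mode B: stored credentials allow an automatic login.
  if (authMode === "credentials") return { kind: "auto-login" };
  // Expired cookies, Mode A: only the user can recover by re-pasting.
  return { kind: "fail", reason: "session-expired" };
}
```

Keeping this as a pure function makes the three outcomes easy to unit-test without a browser context.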
Common invariants (both modes)
- Credentials live in adapter process memory ONLY — never written to container disk, never logged.
- Cookie blobs likewise: decrypted only at injection time, GC'd after addCookies().
- agent_actions rows include auth_mode for audit visibility but never the credential values themselves.
Layer 5 — Captcha solver (3 tiers)
This is the load-bearing piece. Tryst, OF, X, and most directories occasionally surface captchas — Cocotte needs all three tiers.
Tier 1 — anti-detection (avoid trigger)
- Stable fingerprint per Layer 3.
- Human-like timing: pre-action mouse move (page.mouse.move(...)), 200–800ms delays before clicks, scroll-jitter.
- Avoid requestAnimationFrame patterns automation libraries leave behind.
- Tor exit-IP reputation check before action (rotate if flagged on OpenProxy lists).
- Honor rate-limit hints (Tryst's cadence cap is ~3/hr; Cocotte never exceeds even if Quinn's policy allows higher).
Tier 2 — ML captcha solver (port from v1 talent-scout)
v1's talent-scout had a 3.8GB custom-trained model for solving the captchas Tryst specifically used. Per the archive map (.archive/ARCHIVED.md):
talent-scout (tryst scraper) | platform.1/codebase/tools/talent-scout/ + platform.1/operations/talent-scout/ | Provider intel scraper (excluding the 3.8G captcha-solver model)
talent-scout/captcha-solver | rebuild via @applications/@ml/ if needed
Port plan:
- Extract v1 archive (apricot once reachable, or build the tarball locally) to get the scraper code + the model's training data + the inference code.
- Retrain the model in @ml/ workspace using the original training data (the 3.8GB weights are not in archive; the training pipeline + data should be).
- Wrap as a service: captcha-solver:8080 container with POST /solve { image_b64, type: "hcaptcha"|"recaptcha"|"text"|"img-grid" } → { solution }.
- Adapter integration: when Playwright detects a captcha challenge in the page, screenshot the challenge, POST to captcha-solver, paste solution back.
Captcha types the model handles (per v1 talent-scout context): hCaptcha image grids, reCAPTCHA v2 image grids, text-distortion (a few platforms still use), Tryst's specific challenge style.
Tier 3 — Human-in-the-loop (HITL) fallback
When Tier 1 fails AND Tier 2 fails (or confidence is too low):
- Adapter pauses the action mid-flight.
- Captures the challenge image.
- Sends a high-stakes push notification to Quinn's iOS: "Tryst captcha needs you. Tap to solve."
- Quinn taps → iOS deeplink opens a captcha-solve sheet (new screen, captcha-solve.screen.md, to be designed) that renders the challenge image, accepts her solution (tap, drag, or type), and submits.
- Adapter receives the solution via webhook, resumes the action.
- If Quinn doesn't respond within N minutes (configurable, default 5), action fails with failed: captcha-timeout and surfaces in audit + chat-home receipt per brief M.
HITL has costs (Quinn's attention) but is the safety net for cases Tier 2 doesn't cover (new captcha format, model degradation, paranoid platform).
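The pause/resume/timeout behaviour of the HITL steps can be modelled as a pure function over timestamps. This is a sketch under assumptions: the `HitlChallenge` shape and the `hitlOutcome` name are hypothetical, and only the 5-minute default and the `failed: captcha-timeout` outcome come from the text.

```typescript
type HitlOutcome = "waiting" | "resume" | "failed: captcha-timeout";

// Hypothetical record of one paused action awaiting a human solution.
interface HitlChallenge {
  sentAtMs: number;        // when the push notification went out
  solvedAtMs?: number;     // set when the solution webhook arrives
  timeoutMinutes?: number; // configurable; defaults to 5 per the text
}

function hitlOutcome(c: HitlChallenge, nowMs: number): HitlOutcome {
  const deadlineMs = c.sentAtMs + (c.timeoutMinutes ?? 5) * 60_000;
  // A solution that arrived before the deadline resumes the action.
  if (c.solvedAtMs !== undefined && c.solvedAtMs <= deadlineMs) return "resume";
  // Otherwise keep waiting until the deadline passes, then fail.
  return nowMs > deadlineMs ? "failed: captcha-timeout" : "waiting";
}
```

A solution that arrives after the deadline is treated as a timeout here; whether late solutions should still be accepted is a design choice the text leaves open.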
Layer 6 — Adapter API contract
Every @cocottetech/@platform/codebase/@features/{bookings,content}-{surface}/adapter/{verb}/ exports:
import type { BrowserContext } from 'playwright';
import type { ZodSchema } from 'zod';
// SurfaceKind, ActionVerb, PrecheckResult, SurfaceCredentials, TorCircuit,
// CaptchaSolverClient, AgentActionsClient and Logger are assumed to be
// project types defined elsewhere.

export interface SurfaceAdapterAction<I, O> {
  surface: SurfaceKind;   // 'tryst' | 'ts4rent' | ...
  action: ActionVerb;     // 'bump' | 'update-profile' | 'reply' | 'login' | ...
  schema: { input: ZodSchema<I>; output: ZodSchema<O> };

  // Two required functions per action, plus an optional rollback:
  precheck(input: I, ctx: AdapterContext): Promise<PrecheckResult>;
  execute(input: I, ctx: AdapterContext): Promise<O>;
  rollback?(output: O, ctx: AdapterContext): Promise<void>; // optional, for undoable actions
}

export interface AdapterContext {
  user_id: string;
  org_id?: string;
  credentials: SurfaceCredentials;        // decrypted at action-start, scoped to function
  browserContext: BrowserContext;         // Playwright context, ready to use
  torCircuit: TorCircuit;                 // pre-acquired
  captchaSolver: CaptchaSolverClient;     // 3-tier
  agentActionsClient: AgentActionsClient; // writes the audit row
  logger: Logger;                         // structured logging (never logs credential values)
}
precheck runs deterministic eligibility gates (per brief K blocklist + per-surface rate-limit check + jurisdiction per K §K4); if any fails, action is declined without container spin-up.
execute runs the Playwright instructions, handling captchas via the 3-tier captcha-solver, writing audit rows on success/fail.
rollback (optional) undoes the action — e.g. delete the post, remove the bump (where the surface supports it).
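One of the precheck gates, the per-surface rate-limit check, might look like the sketch below. It assumes a sliding window over recent action timestamps; `Precheck`, `rateLimitGate`, and `effectiveCap` are hypothetical names, and `Precheck` is a simplified stand-in for the PrecheckResult type referenced above.

```typescript
// Simplified stand-in for PrecheckResult (the real shape is not specified here).
type Precheck = { ok: true } | { ok: false; reason: string };

// Assumed input: timestamps (ms) of this user's recent actions on the surface.
function rateLimitGate(
  recentActionsMs: readonly number[],
  nowMs: number,
  maxPerWindow: number,
  windowMs: number,
): Precheck {
  // Count only the actions that still fall inside the sliding window.
  const inWindow = recentActionsMs.filter((t) => nowMs - t < windowMs).length;
  return inWindow < maxPerWindow
    ? { ok: true }
    : { ok: false, reason: `rate-limit: ${inWindow}/${maxPerWindow} in window` };
}

// The platform cap always wins over a more permissive user policy.
function effectiveCap(platformCap: number, policyCap: number): number {
  return Math.min(platformCap, policyCap);
}
```

Because the gate is deterministic and needs no browser, a declined action costs nothing in container resources, which is the point of running it before spin-up.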
Layer 7 — Observability + safety
- Structured logs: every adapter action emits {user_id, surface, action, step, outcome, duration_ms} to platform.api's logging pipeline. Credential values, raw HTML, and screenshots are NEVER logged (PII risk; container-only debug).
- Screenshot capture: on every failure + on opt-in --debug, save screenshots to /data/debug/<user_id>/<surface>/<timestamp>.png with 7-day TTL. Helps diagnose flakes without leaking creds.
- Per-surface rate-limit guardrails: enforced at adapter layer regardless of policy (Cocotte respects platform rate-caps even if Quinn's policy says otherwise).
- Kill-switch integration: per brief K §K5, kill-switch causes adapter pool to drain (in-flight actions complete or abort; queued actions purge; no new actions accepted).
- Per-container resource caps: 512MB RAM, 1 CPU, 10MB/s network. Prevents one runaway action from starving the pool.
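One way to enforce the "never logged" rule for structured logs is to build each log event from an explicit field allowlist rather than by filtering out known-bad keys. The sketch below assumes that approach; `AdapterLogEvent` mirrors the field list above, while `toLogEvent` is a hypothetical helper name.

```typescript
// Field names follow the structured-log shape listed above.
interface AdapterLogEvent {
  user_id: string;
  surface: string;
  action: string;
  step: string;
  outcome: string;
  duration_ms: number;
}

// Copies only the allowlisted fields, so stray keys such as `password`,
// `html` or `screenshot` are dropped even if a caller passes them by mistake.
function toLogEvent(raw: AdapterLogEvent & Record<string, unknown>): AdapterLogEvent {
  const { user_id, surface, action, step, outcome, duration_ms } = raw;
  return { user_id, surface, action, step, outcome, duration_ms };
}
```

An allowlist fails safe: a newly introduced sensitive field stays out of the logs until someone deliberately adds it to the event shape.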
Migration plan
Step 1 — Extract v1 talent-scout from archive
- Build v1 archive tarball if not yet built (via ./scripts/build-archives.sh on apricot).
- ./scripts/extract-archive.sh platform.1 to local /tmp/cocottetech-archive/platform.1/.
- Inspect codebase/tools/talent-scout/ + operations/talent-scout/:
  - Scraper code (Playwright? Puppeteer?)
  - Captcha-solver model training pipeline
  - Training data
  - Inference code
Step 2 — Rebuild captcha solver model in @ml/
- Workspace location: ~/Code/@applications/@ml/captcha-solver/
- Inputs: training data from v1 + any open-source captcha datasets to bolster.
- Output: ONNX-portable model (~200–500MB target; smaller than 3.8GB v1 model via distillation if possible).
- Service wrapper: FastAPI/Python or Node-onnxruntime; POST /solve API.
Step 3 — Build Playwright surface-adapter base image
- Dockerfile at @ai/@skills/_shared/surface-adapter-base/Dockerfile.
- Base: mcr.microsoft.com/playwright:focal or equivalent.
- Adds: playwright-extra, stealth plugin, Tor SOCKS5 client, fingerprint manager.
- Exposes: gRPC or HTTP interface for adapter actions to issue browser commands.
Step 4 — First per-surface adapter: @cocottetech/@platform/codebase/@features/bookings-tryst/adapter/
- login/index.ts: handles Tryst's login form including 2FA + captcha.
- bump/index.ts: issues the availability bump (calls login first if session expired).
- update-profile/index.ts: applies structured profile edits per tryst-profile-editor.screen.md.
- fetch-inbox/index.ts: polls DMs per tryst-inbox.screen.md.
- Each action exports the SurfaceAdapterAction interface above.
Step 5 — Captcha HITL screen
- New screen captcha-solve.screen.md: image render + input + submit. iOS push deeplink target.
- Backend: /api/v1/captcha-challenges/:id endpoint that surfaces pending challenges to iOS + accepts solutions.
Step 6 — Per-surface adapter rollout
- TS4Rent, Slixa, Eros, OnlyFans, X follow the Tryst template. Per-surface variations:
  - X / Threads / Bluesky: real APIs exist (cheaper to skip Playwright; direct HTTP).
  - WhatsApp / Signal / Telegram: Android emulator route (slower; only when web-equivalent absent).
  - Tryst / TS4Rent / Slixa / Eros / OF: full Playwright + captcha pipeline.
Captcha-solver retraining notes
The 3.8GB v1 model is too big for our needs. Recommended:
- Distill to a 200–500MB model via teacher-student training (use the v1 model as teacher if we can resurrect it; otherwise use commercial APIs as ephemeral teachers during distillation).
- Multi-task the new model — train on hCaptcha + reCAPTCHA + Tryst-specific + a few others rather than per-platform-per-model. Saves disk + reduces retraining frequency.
- Online refinement: every HITL captcha Quinn solves becomes a labeled training example (with consent). Slow but compounds.
Open questions
- Captcha-solver vendor fallback: ship with paid 2captcha/anti-captcha/capsolver as a cheap Tier-2.5 (between ML and HITL)? Cost is ~$0.001–0.003 per solve; small for Quinn's volume. Lean: yes, as an optional Tier-2.5 step between the ML solver and HITL; configurable per-user (some prefer HITL over paying a 3rd party).
- Android emulator host: apricot or a dedicated GPU host? Emulators are RAM-heavy; ~2GB per instance. With Quinn alone, 1 instance suffices; multi-tenant scaling will need allocation strategy. Defer.
- Per-surface "warmth" persistence: how long do we keep a browser context idle before destroying it? Tradeoff between fast re-acquire (warm context = no re-login) and resource cost. Lean: per-surface configurable; default 24h idle TTL.
- Recovery from "this browser is automated" detection: Cloudflare / Akamai / DataDome often catch automation regardless of stealth measures. When detected (specific error patterns), Cocotte should escalate to HITL captcha + fingerprint regeneration; if recurring, surface as a degraded-mode banner per brief M.
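For the "warmth" persistence question in the list above, the leaning (per-surface configurable, 24h default idle TTL) could be expressed as the eviction check below. This is a sketch only: `shouldEvictContext` is a hypothetical name, and the override entry for `x` is an illustrative value, not a decided setting.

```typescript
// 24h default idle TTL, per the leaning stated in the open questions.
const DEFAULT_IDLE_TTL_MS = 24 * 60 * 60 * 1000;

// Hypothetical per-surface overrides; the key and value are illustrative.
const IDLE_TTL_OVERRIDES_MS: Record<string, number> = {
  x: 6 * 60 * 60 * 1000,
};

// True once a context has been idle for at least its surface's TTL.
function shouldEvictContext(surface: string, lastUsedMs: number, nowMs: number): boolean {
  const ttlMs = IDLE_TTL_OVERRIDES_MS[surface] ?? DEFAULT_IDLE_TTL_MS;
  return nowMs - lastUsedMs >= ttlMs;
}
```

A periodic sweep over the pool could call this per context; evicting a context only frees the running browser, since the persisted storage on the volume is what avoids a re-login.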
Related
- _engineering-credentials-vault.md: provides decrypted credentials to adapter context.
- _engineering-v2-port-map.md: event-scrapers Tor pool port (Layer 2).
- surface-tryst.brief.md §2: Auth & connect references this brief.
- tryst-connect.screen.md: updated to credentials-entry flow.
- Brief K §K5: kill-switch drains the pool.
- Brief M: degraded mode when adapter pool / captcha solver fails.
- Brief I: adapter audit rows.
- v1 archive: .archive/ARCHIVED.md, talent-scout map row.
- @applications/@ml/: captcha-solver retraining workspace.
Out of scope
- Container orchestration platform choice (k8s / nomad / docker-compose) — engineering call later.
- Anti-detection cat-and-mouse with specific platforms (will be ongoing; spec'd here is the framework, not per-surface tactics).
- Multi-region container deployment (Quinn-only at P0; multi-tenant scaling is W brief territory).