prospector/docs/features/draft-engine.md
Natalie 2df18b5358
Some checks failed
CI / verify (push) Failing after 40s
docs(prospector): retire model-boss for @prospector/ai-harness; add DRAFT mode + alignment gate
- Replace every model-boss/coordinator reference with the @prospector/ai-harness
  story (direct vLLM client + classify/draft/judge/orchestrate task registry +
  on-demand GPU lifecycle + CoT-workflow runner + cost meter) across ai-first-v4,
  draft-engine, model-eval-pipeline, and PROSPECTOR.md; GPU_INFERENCE_URL is the
  canonical inference contract. Note ai-harness is promotable to a shared @ct
  package (onlyfans carries a parallel src/engine/classifier.ts).
- Fix migration collision: ai-first-v4 actor-attribution renumbered 0007 -> 0016
  (0007_tasks.sql exists; tree at 0013).
- Add the three missing pieces from the plan: a formal DRAFT runner mode distinct
  from PAUSE + DRAFT->GO graduation (new control-modes.md); a runtime per-draft
  alignment gate (deterministic facts/policy + GPU judge; spec_conflict/
  policy_conflict holds) in draft-engine's pipeline; and the facts/mission config
  schema (src/specs/, 0014_specs.sql) in ai-system-plan §5 + draft-engine.
- Index control-modes.md and the ai-harness rename in features/README.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 11:06:30 -04:00

67 lines
8.1 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Draft engine — OSS-on-GPU + CoT workflow builder
## Why
Outbound (and rich classification) must run on **OSS *uncensored* LLMs** — never hosted Claude/OpenAI, which refuse adult-services copy. The models are tuned to Quinn's voice and the inbound/outbound task. **The GPU is ON-DEMAND, not a standing droplet** (no always-on GPU). Lifecycle is **queue-driven, not strictly per-tick:** provision on demand → warm the model → drain the work; **keep it warm while the queue stays deep** (a big classify/draft backlog amortizes the provision cost — don't thrash provisioning) → **release when the queue drains / goes idle** past a short grace window. So: hold across ticks when busy, never hold when idle. **Reuse LPv2's existing on-demand DO GPU work** (`lilith-platform.live/scripts/provision-raw-gpu-droplet.sh` + its vLLM/OpenAI-compatible inference) behind the **`@prospector/ai-harness`** package — prospector's self-hosted inference layer: a **direct vLLM client** + a **task registry** for `classify` / `draft` / `judge` / `orchestrate`, the on-demand **GPU lifecycle**, the **prompt / CoT-workflow runner**, and the **cost meter**. ai-harness talks to the droplet's inference endpoint directly over `GPU_INFERENCE_URL` (the canonical contract — see step 2 below); it does **not** depend on the retired **model-boss** coordinator.
> _ai-harness is prospector-local today but **promotable to a shared `@ct` package** — `@applications/onlyfans` already carries a parallel `src/engine/classifier.ts` of the same inference-layer shape, so the direct-vLLM client + task registry + GPU lifecycle + cost meter want to live once and be consumed by both._
## Engine identifier convention
`prospector_settings.draft_engine` names the active engine:
- `do-gpu-<modelname>_<version_build>` — a specific OSS model + build on a DO GPU droplet, reached over its inference HTTP endpoint. Example: `do-gpu-mistral-nemo-12b-uncensored_2026-06-29.1`. The `<version_build>` pins the exact tuned weights so a decision is always traceable to the model that made it.
- `template` — the MVP/fallback "static" engine: copy rendered from the 🌹 pastebin note, **no LLM** (zero-dependency, always-safe baseline; what ships first).
The runner records the engine id on every draft/decision (audit trail) so the trial can attribute behavior to a model build, and corrections can be bucketed per build for tuning.
## From pastebin → CoT workflow builder
Today (MVP) the pastebin is a flat lookup: `templateKey (①…⓱) → fixed string`. That's the `template` engine. It is being **converted into a CoT (chain-of-thought) workflow builder**:
- A **workflow** is a versioned, ordered chain the OSS model runs to *generate* the reply for a situation (archetype × state × templateKey), instead of emitting canned text:
```
workflow {
id, name, version,
appliesTo: { archetypes[], states[], templateKeys[] },
context: [ pastebin canon snippets, Quinn voice rules, safety rules ], // the 🌹 note becomes injected context, not the output
steps: [ { think: "read the thread + atoms; what does he actually want?" },
{ think: "pick the move per Quinn's rules; check Gate-2 constraints" },
{ draft: "write Quinn's reply in her voice, ≤N chars, no service/price in writing" } ],
engine: "do-gpu-<model>_<build>"
}
```
- The **builder** lets the operator/coworker author + version these workflows (and A/B them per build). Corrections (`prospect_corrections`) feed back as few-shot/tuning data keyed to the workflow + engine build.
- Pastebin canon (voice, lines, rules) moves from "the output" to "context the workflow injects" — single source of truth for voice, now consumed by the model rather than sent verbatim.
## Pipeline placement (unchanged contract)
The runner stays the same; only the body-generation step swaps:
```
inbound → classify → decideNextAction(templateKey) → Gate-2 → mode
→ [DRAFT ENGINE] render body:
template engine → pastebin[templateKey] (MVP)
do-gpu engine → run CoT workflow(appliesTo) on the model (target)
→ [ALIGNMENT GATE] validate body vs current facts/policy (new)
→ dispatch to macsync (GO) · stage pending_review (DRAFT) · hold
```
Fail-safe is preserved: if the engine yields no usable body (model down, no matching workflow), the candidate **holds** — never a placeholder send. Mode selects the terminal step — auto-dispatch (GO) vs stage for operator review (DRAFT) vs hold (PAUSE); see [`control-modes.md`](./control-modes.md).
### Runtime alignment gate (per-draft, new)
Eval ([`model-eval-pipeline.md`](./model-eval-pipeline.md)) and the eval-gated build flip prove a **build** is safe *on average* on a held-out set. They do **not** prove that *this* freshly generated body didn't drift — a warm, passing model can still quote a stale rate, put service-for-money in writing, propose a call on a `text_only` thread, or name a city not on the current trip. The alignment gate is the per-draft check that closes that gap, inserted **between generate and dispatch** in `advanceDraft`:
- **Deterministic pass** — `src/engine/draft-align.ts` (pure + co-located test): `validateDraft(body, facts) → { ok, violations }`. Checks price contradicting `rate`, service/price-in-writing (the existing `service_in_writing` / `pay_after` posture promoted to an **outbound** check), channel proposals vs `text_only`, a location not in facts, `never_offer` phrases, plus the voice hard-rules already in `gpu/prompts.ts`.
- **GPU judge pass** (additive) — `judgeDraftAlignment` runs ai-harness's `judge` task on the same droplet; **null when the GPU is down** (same fail-soft as `draftReply`), so it only ever *adds* holds, never blocks on model availability.
- **Outcome** — a clean draft proceeds (dispatch under GO, stage `pending_review` under DRAFT). A violation **holds** with a structured `HoldReason`: `spec_conflict` (facts) or `policy_conflict` (policy), extending the `HoldReason` union from [`ai-first-v4.md`](./ai-first-v4.md) Phase 2.2. Bounded **revise**: the do-gpu engine may re-draft ≤2× with the violations injected; on continued failure it downgrades to the safe `template` body or HOLD. A contradicting draft is **never** presented or sent.
### Facts / mission config — the alignment oracle
The alignment gate validates against a structured **facts** config, and the *same* config is injected into the generator prompt — **one source, two consumers** (generator input + gate oracle). Split by rate-of-change (see [`ai-system-plan.md`](./ai-system-plan.md) §1): **facts** (rate, min duration, incall/outcall + location/window, channel flags e.g. `text_only`, OnlyFans handle, the `never_offer` list) change rarely → a stable config table in a new `src/specs/` module (mirroring `src/markets/`), migration `0014_specs.sql`; **mission** (today's goal/slots/market) is the daily runtime dial (Mission Control, `ai-system-plan.md` gap #1) and can start as a settings field. The generator reads it to *steer*; the alignment gate reads it to *verify the body didn't drift from it*.
## Build path
1. **MVP (done):** `template` engine = pastebin static render; fail-safe holds.
2. **GPU client:** a `GpuDraftEngine` that POSTs rendered CoT prompts (the whole batch) to `do-gpu-<model>_<build>`'s inference endpoint (env `GPU_INFERENCE_URL`), behind an **on-demand GPU lifecycle manager**: provision when work arrives, keep warm while the queue stays deep, release after a short idle grace. Selected when `draft_engine` matches `do-gpu-*`.
3. **Workflow store + builder:** persist workflows (own DB, versioned) + author/edit surface (panel or config) + bind to engine builds; wire corrections as per-build tuning data.
4. **Tuning loop:** export `prospect_corrections` per engine build → fine-tune/eval → publish next `do-gpu-<model>_<build>` → flip `draft_engine`.
## Open scope (confirm before building step 3)
- Workflow authoring: DB-backed builder UI in the panel, vs config files in-repo, vs both?
- Granularity: one workflow per templateKey, or per (archetype × state)?
- Does classification also move to a CoT workflow on the GPU, or stay fast-rules + atom-extraction?