Master plan tying the AI engine + mission layer + model + UI/UX + tuning
levers. Answers: layering (trained-stable vs mission-volatile -> mission
changes need NO retrain), the model (27B/52GB BF16 or ~16GB Q4, on the nyc2
volume+HF, CoT-thinking, generator accurate / classifier needs clean-data
LoRA), GPU-vs-CPU (GPU for batch+teacher, distill small for CPU+Q4 production),
the UI gaps (Mission Control, teach-loop wiring, tuning/interpretability
surface, eval dashboard, GPU buttons -- atop the existing ModelView/VoiceView/
AutopilotView/orchestrator scaffolds), and the levers (mission/canon/prompt/
temp/taxonomy/LoRA + the CoT trace as the explainability output).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add the third prospector interface: an AI-facing streaming orchestrator MCP
that sits above the raw mcp-prospector adapter and alongside the operator PWA.
It ticks the system, reports status, detects items needing a human, raises
dedup'd nudges, and drives the GO/PAUSE/AWAY kill-switch — but never sends
(every send stays behind Gate-2 + the human_owned floor).
- @packages/mcp-orchestrator: package.json + tsconfig + src/{index,client,
nudges,types}.ts. Six tools (orchestrate_tick, status_report, list_nudges,
ack_nudge, set_mode, drive_loop) wired to the real REST surface; detection/
loop logic left as documented PENDING stubs. Typechecks + builds clean.
- docs/features/ai-orchestrator.md: responsibilities, tool surface, streaming/
nudge loop, nudge rules mapped to verified live endpoints, reuse + safety,
open decisions.
- register the package in the root workspaces.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Bake-off harness in src/eval/ with Claude as offline labeler/judge/advisor
(never in the serving loop). Per-role scoring (classifier F1, generator
refusal+voice+policy+85% gate, orchestrator tool-call), replay harness to
fix Executor cycle-1's no-batch-replay blocker, researched candidate
roster (de-refused instruct base + Quinn-voice LoRA over heavy RP
fine-tunes). Reuses outcomes.jsonl/gold-turnpairs/RUNNER-POLICY.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Thesis: enrich 10K+ history into per-turn {read, move, outcome, source}
records before training. Map the two producers (matcher=classify+retrieve,
agent=generate) to data routing; distill agent wins into the matcher
library as the cost/quality shortcut. LoRA per role on a transient
training droplet; multi-LoRA serving on the single inference droplet;
eval-gated build flip. Classifier first, generator second, orchestrator
never trained.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Distinguish three AI roles across two tiers: the orchestrator/chat agent
(Tier A, control-surface CLIENT, user-facing, presence-warmed) vs the
classifier + message-generator (Tier B, pipeline components the app CALLS,
queue-warmed). Plane-3 autonomy agent = same orchestrator, event-driven
entry point. Fix warm-up triggers to be role-specific.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Presence-driven auto-warm (confirm toast), live uptime cost meter,
pause=teardown to stop billing, GPU policy moved to settings config.
Corrects the cost model: DO bills for droplet existence, not inference.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three control planes (operation/observation/autonomy) + governance, OSS
model end to end. Sequences parity+observability -> governance -> autonomy.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The full coworker-replacement app is built in web/ (served by this repo's
NestJS backend), not the platform my/ surface. Document the six views and the
/prospector/* endpoints each uses, the shared-audit/human_owned/Life-opt-in
guarantees, and dev/build commands. Keep the platform my/ SSO vision as-is.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace manual doctl DB creation with the declarative IaC path: terraform apply
creates people+prospector DBs+users on the managed cluster; pull DB passwords +
host from terraform output into the service .envs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Probe found no local PG: droplet pgbouncer fronts a DO Managed cluster
(private-lilith-store-pg, holds live quinn). people+prospector are new DBs on
that cluster (additive); services connect direct to :25060 over SSL. Node 20
already installed on lime.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
black homelan is gone; prod target is the DO backend droplet (lilith-store-backend,
209.38.51.98 / wg 10.9.0.5) where mac-sync-server already runs. Fix black:2546x
DB-host refs in comments/migrations. GPU is on-demand + queue-driven: hold warm
while backlog is deep, release on idle grace (not strictly per-tick).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Document the draft engine direction: OSS uncensored models on DO GPU droplets
(reuse LPv2 provisioning, no model-boss), engine id 'do-gpu-<model>_<build>', and
pastebin → CoT workflow builder (versioned reasoning chains the model runs;
pastebin canon as injected context; corrections as per-build tuning data). Rename
the MVP static-render engine value 'pastebin' -> 'template' (pastebin is now
context, not the engine).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Restructure to match the @mac-sync sister-app convention + operator direction:
- Backend service -> repo root (root package = the app); engine is CORE source,
now src/engine/ (not a separate package); imports rewritten to relative.
- MCP server -> @packages/mcp-prospector/ (agent interface; thin REST wrapper so
the coworker can trial this backend and fall back to legacy quinn-prospector).
- web/ stays a top-level surface.
- draft_engine default 'claude:sonnet' -> 'pastebin': the whole point is to run
OFF hosted Claude (which refuses adult-services copy) on OSS-uncensored LLMs on
raw GPU droplets; generative target is 'gpu:<model>'. Reuse LPv2's existing DO
GPU provisioning, not model-boss.
- docs/features/mcp.md: how the MCP works + the coworker graceful-switch protocol.
- .gitignore: ignore Swift .build/.
Verified: tsc clean, 101 tests (92 engine + 9 runner), app boots from root,
mcp-prospector builds + boots, smoke tests green (scam held, settings=pastebin).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>