7 commits

Author SHA1 Message Date
Natalie
15e7348413 feat(box): OCR extraction + GPU (OpenAI-compatible) rating backend, env-selectable
Wire the on-box (Claude-API-less) path decided with the operator: EXTRACT_BACKEND=ocr
sends each screenshot to the on-box mrnumber-ocr service (raw text, no per-shot
structuring); build_rating_profile uses an OpenAI-compatible LLM on a DO GPU droplet
(RATING_LLM_URL) which extracts the reports from the raw OCR text AND produces the
multi-axis verdict. Reports are folded back into the history so the people-signal +
counts + safety flags reflect them; safety detection also scans the raw OCR lines so an
LE term forces cop_flag even before structuring. vision/Claude stays the plum-dev
default. +5 tests incl. full OCR→GPU→cop_flag flow. 33/33.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 00:39:06 -04:00
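
The env-selectable backend described in this commit could be resolved roughly as follows. This is a minimal sketch, not the repository's code: `EXTRACT_BACKEND` and `RATING_LLM_URL` come from the commit message, while the function name `select_backends`, the literal value `"vision"` for the default, and the rule that the OCR path requires `RATING_LLM_URL` are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional, Tuple

# Assumed set of accepted values; only "ocr" is named explicitly in the commit.
KNOWN_EXTRACT_BACKENDS = ("vision", "ocr")


def select_backends(env: Mapping[str, str] = os.environ) -> Tuple[str, Optional[str]]:
    """Return (extract_backend, rating_llm_url) from environment variables."""
    extract = env.get("EXTRACT_BACKEND", "vision").strip().lower()
    if extract not in KNOWN_EXTRACT_BACKENDS:
        raise ValueError(f"unknown EXTRACT_BACKEND: {extract!r}")

    rating_url = env.get("RATING_LLM_URL") or None
    # Assumption: the OCR path has no per-shot structuring, so it needs the
    # OpenAI-compatible rating LLM to extract reports from the raw text.
    if extract == "ocr" and rating_url is None:
        raise ValueError("EXTRACT_BACKEND=ocr requires RATING_LLM_URL")
    return extract, rating_url
```

Passing the environment as a mapping keeps the selection logic testable without touching the real process environment.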
Natalie
60becf25ba feat(record): migrate verdict recording from quinn.api to the cocotte people service
Drop all old-Quinn coupling from the client path: instead of POSTing to
${QUINN_MY_URL}/api/clients/{id}/screening, record the verdict as a
`screening_mrnumber` person signal in the cocotte people service (persons DB),
keyed by the phone number (channel 'sms'; person auto-upserted). Verdict maps to
the bare valueText consumers switch on — law-enforcement critical flag → `cop_flag`,
else denied/approved/error; pending/not_found → unset (read as not_screened). Rich
record rides valueJsonb. `--client-id` → optional `--ref` (correlation id in
sourceHandle); env QUINN_MY_* → PEOPLE_*. 28/28 unit tests (wire body asserted).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 13:37:58 -04:00
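
The verdict-to-valueText mapping described in this commit can be sketched as below. The function name `value_text` and the shape of a safety flag (a dict with `category` and `severity` keys) are hypothetical; the output values `cop_flag`, `denied`, `approved`, `error` and the "unset" behaviour for `pending`/`not_found` follow the commit message. Returning `None` for any other unrecognized result is an assumption.

```python
from typing import Optional


def value_text(result: str, safety_flags: list) -> Optional[str]:
    """Map a screening result to the bare valueText consumers switch on."""
    # A critical law-enforcement flag wins over whatever the result says.
    for flag in safety_flags:
        if flag.get("category") == "law_enforcement" and flag.get("severity") == "critical":
            return "cop_flag"

    if result in ("denied", "approved", "error"):
        return result

    # pending / not_found (and, by assumption, anything unrecognized):
    # leave the signal unset so consumers read it as not_screened.
    return None
```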
Natalie
da45d1e60e feat(safety): promote critical caller signals (LE/violence/robbery/coercion) to top-level flags
The rating LLM folded the single most dangerous signal — a law-enforcement
sting ("Es policía") — into row 7 of a flat 14-item red_flags list and could
under-weight it. Add a deterministic, accent/language-folded taxonomy that
detects critical-safety categories straight from the report text, promotes them
to a top-level `safety_flags` array (icon + severity + evidence), surfaces them
ABOVE the rating in the log + wire body + MCP json, and applies a hard 'denied'
override that does not depend on the model scoring the safety axis correctly.
Folding matches Spanish "Es policía"; word boundaries avoid 'problem'/'copy'
false positives. +5 unit tests; verified live against the box (6315304426).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 13:17:39 -04:00
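
The deterministic detection described in this commit (accent folding plus word-boundary matching, with a hard override) might look something like the sketch below. The term lists, the category keys, and the names `fold`, `detect_safety_flags`, and `apply_override` are illustrative assumptions; the real taxonomy is not shown in the commit message.

```python
import re
import unicodedata

# Hypothetical taxonomy: category -> already-folded terms.
CRITICAL_TERMS = {
    "law_enforcement": ["policia", "police", "cop"],
    "violence": ["violent", "assault"],
    "robbery": ["rob", "robbed", "robbery"],
    "coercion": ["coerced", "blackmail"],
}


def fold(text: str) -> str:
    """Lowercase and strip accents so 'policía' and 'POLICIA' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def detect_safety_flags(reports: list) -> list:
    """Return one flag per critical category, with the first matching report as evidence."""
    flags = []
    for category, terms in CRITICAL_TERMS.items():
        # \b keeps 'cop' from matching 'copy' and 'rob' from matching 'problem'.
        pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, terms)) + r")\b")
        for report in reports:
            if pattern.search(fold(report)):
                flags.append({"category": category, "severity": "critical", "evidence": report})
                break
    return flags


def apply_override(model_result: str, flags: list) -> str:
    """Force 'denied' when any critical flag exists, whatever the model said."""
    return "denied" if flags else model_result
```

Because the override reads only the flags, it holds even when the model under-weights the safety axis.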
Natalie
f0c15a5ba7 refactor: thin onto shared redroid packages; box moves to @redroid app
- client/mr_lookup.py: MrNumberEmulator subclasses RedroidDevice; adb base, vision
  harness, and screening recorder now come from redroid_client (lilith-redroid-client).
  Only Mr. Number nav + rating profile remain. 19 tests green.
- mcp/index.ts: thin call to @lilith/redroid-mcp factory (logger.ts removed — shared).
- Remove cloud/ + deploy/deploy-droplet.sh: the redroid box is now owned by the
  @redroid app. install.sh pip-installs lilith-redroid-client from cocotte-forge PyPI.
- Manifest/CLAUDE: box ownership → @redroid; plum per-app console tunnel stays here.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 15:05:33 -04:00
Natalie
dd6b63b4c0 fix(nav): clean-search reset + correct-caller verification (no stale data)
A fresh lookup was silently re-reading the PREVIOUS caller's page when the app
was left on an open detail/list. Two guards:
- go_to_search(): press BACK until the search field is present (relaunch as last
  resort) so every lookup starts from a clean search screen
- open_report_detail()/detail_state(): confirm we're on a report detail reached
  after that fresh search; abort with result='error' rather than rate a stale page
  (the number isn't always printed on the page — 'Personal Line' with no digits —
  so we trust 'detail page after clean search', and match digits when shown)

Verified live: 631-530-4426 now returns its OWN 13 reports (incl. a Spanish
'Es policía' law-enforcement flag) → 4/100 F denied, instead of echoing 516's data.
20 tests pass (added stale-abort guard test).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 11:49:52 -04:00
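
The two guards in this commit can be illustrated with a small sketch. It is not the code in mr_lookup.py: the `Device` protocol, the function names `reset_to_search` and `is_trusted_detail`, the `max_backs` limit, and the last-10-digits comparison are all assumptions chosen to make the idea concrete.

```python
import re
from typing import Optional, Protocol


class Device(Protocol):
    """Hypothetical device interface used only by this sketch."""
    def search_field_present(self) -> bool: ...
    def press_back(self) -> None: ...
    def relaunch(self) -> None: ...


def reset_to_search(device: Device, max_backs: int = 5) -> bool:
    """Press BACK until the search field shows; relaunch as a last resort."""
    for _ in range(max_backs):
        if device.search_field_present():
            return True
        device.press_back()
    if device.search_field_present():
        return True
    device.relaunch()
    return device.search_field_present()


def digits(text: str) -> str:
    return re.sub(r"\D", "", text)


def is_trusted_detail(on_detail: bool, after_clean_search: bool,
                      shown_number: Optional[str], wanted: str) -> bool:
    """Trust a detail page only if reached after a clean search; match digits when shown."""
    if not (on_detail and after_clean_search):
        return False
    if shown_number is None:  # e.g. 'Personal Line' with no digits printed
        return True
    # Assumption: compare the last 10 digits so a leading country code is ignored.
    return digits(shown_number)[-10:] == digits(wanted)[-10:]
```

When `is_trusted_detail` returns False, the caller would abort with result='error' instead of rating the page.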
Natalie
263cc18aa1 feat(rating): full-history capture + multi-axis SDK rating profile
Replace the brittle keyword verdict with an LLM-consolidated rating profile per
caller, and capture the COMPLETE report history instead of the first screen.

- open_report_detail(): land on the caller detail page (taps the Recent-lookups
  row when the number was searched before) — fixes the 0-reports regression
- expand_all_reports() + capture_full_history(): tap "View all N", scroll-capture
  every page until the UI dump stops changing; merge_reports() dedupes across pages
- build_rating_profile() (batch SDK, sonnet): 0-100 score + A–F grade + per-axis
  sub-scores (reliability/payment/respect/safety) + signals + nuanced_notes.
  Domain nuance: deposit mentions weight POSITIVE; law-enforcement forces denied
- result_from_profile(): honors recommendation, score fallback, hard safety override
- decide_result(): kept as deterministic fallback, fixed to never approve over a
  model 'denied' / red flag and to match punctuation variants (no-show == no show)
- save_history(): persist full consolidated history + profile per caller
- tests: 18/18 (mapping, dedupe, safety override, full flow); DESIGN.md updated

Verified live against the redroid droplet (45.55.191.82): 15166687821 → 3 reports
consolidated → 18/100 grade F → denied, with multi-axis breakdown.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 10:10:56 -04:00
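
The scroll-capture and dedupe steps in this commit can be sketched as follows. The names `capture_pages`, `merge_pages`, and `normalize`, the callback-based device access, and the `max_pages` safety limit are assumptions for illustration; the repository's own capture_full_history() and merge_reports() may work differently.

```python
import re
from typing import Callable, List


def normalize(text: str) -> str:
    """Lowercase, turn punctuation into spaces, collapse whitespace ('No-show' -> 'no show')."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def capture_pages(dump_screen: Callable[[], List[str]],
                  scroll: Callable[[], None],
                  max_pages: int = 50) -> List[List[str]]:
    """Capture pages of report text until the UI dump stops changing."""
    pages: List[List[str]] = []
    previous = None
    for _ in range(max_pages):
        current = dump_screen()
        if current == previous:  # nothing new after the last scroll
            break
        pages.append(current)
        previous = current
        scroll()
    return pages


def merge_pages(pages: List[List[str]]) -> List[str]:
    """Merge pages in order, keeping the first copy of each report."""
    seen = set()
    merged = []
    for page in pages:
        for report in page:
            key = normalize(report)
            if key and key not in seen:
                seen.add(key)
                merged.append(report)
    return merged
```

Normalizing before comparison is also one way to treat punctuation variants such as "no-show" and "no show" as the same text.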
Natalie
992680657c feat: extract Mr. Number screening into standalone supporting app
Mirror the mac-sync / net-tools pattern: own repo, HTTP-only coupling to the
platform. Two tiers — plum client (mr_lookup.py + console-tray) and the DO
redroid droplet (45.55.191.82) — plus a plum-local stdio MCP wrapping
mr_lookup.py --json.

- client/: lookup + vision + record, host-free unit tests (12/12), console tray
- mcp/: finished index.ts (mr_number_lookup tool), typechecks + boots
- cloud/: adb-keyboard droplet server; terraform reference (canonical IaC in uvlava)
- deploy/: install.sh (plum) + deploy-droplet.sh
- docs/archive/: first-attempt redroid post-mortem (the "complete failure" was
  attempt #1 on the stock-kernel box; this droplet is its working successor)

Platform retains the screening data model, prospect gate, and trigger queue;
this app couples only via POST /admin/screening/check (service token).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 09:06:51 -04:00