Wire the on-box (Claude-API-less) path decided with the operator: EXTRACT_BACKEND=ocr
sends each screenshot to the on-box mrnumber-ocr service (raw text, no per-shot
structuring); build_rating_profile uses an OpenAI-compatible LLM on a DO GPU droplet
(RATING_LLM_URL) which extracts the reports from the raw OCR text AND produces the
multi-axis verdict. Reports are folded back into the history so the people-signal +
counts + safety flags reflect them; safety detection also scans the raw OCR lines so an
LE (law-enforcement) term forces cop_flag even before structuring. vision/Claude stays the plum-dev
default. +5 tests incl. full OCR→GPU→cop_flag flow. 33/33.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Drop all old-Quinn coupling from the client path: instead of POSTing to
${QUINN_MY_URL}/api/clients/{id}/screening, record the verdict as a
`screening_mrnumber` person signal in the cocotte people service (persons DB),
keyed by the phone number (channel 'sms'; person auto-upserted). Verdict maps to
the bare valueText consumers switch on — law-enforcement critical flag → `cop_flag`,
else denied/approved/error; pending/not_found → unset (read as not_screened). Rich
record rides valueJsonb. `--client-id` → optional `--ref` (correlation id in
sourceHandle); env QUINN_MY_* → PEOPLE_*. 28/28 unit tests (wire body asserted).
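
A minimal sketch of the verdict-to-valueText mapping described above. The function name and the `category` key on each safety flag are assumptions made for illustration; only the output values (`cop_flag`, denied/approved/error, unset) come from this change.

```python
from typing import Optional


def verdict_value_text(result: str, safety_flags: list[dict]) -> Optional[str]:
    """Return the bare valueText for a lookup, or None to leave the signal unset."""
    # A critical law-enforcement flag wins over whatever the lookup result says.
    # The "category"/"severity" key names and values here are hypothetical.
    for flag in safety_flags:
        if flag.get("category") == "law_enforcement" and flag.get("severity") == "critical":
            return "cop_flag"
    if result in ("denied", "approved", "error"):
        return result
    # pending / not_found: leave unset, which consumers read as not_screened.
    return None
```

Returning `None` instead of a literal `not_screened` string keeps "never screened" and "screened but inconclusive" indistinguishable to consumers, which matches the unset behaviour described above.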
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The rating LLM folded the single most dangerous signal — a law-enforcement
sting ("Es policía") — into row 7 of a flat 14-item red_flags list and could
under-weight it. Add a deterministic, accent/language-folded taxonomy that
detects critical-safety categories straight from the report text, promotes them
to a top-level `safety_flags` array (icon + severity + evidence), surfaces them
ABOVE the rating in the log + wire body + MCP json, and applies a hard 'denied'
override that does not depend on the model scoring the safety axis correctly.
Folding matches Spanish "Es policía"; word boundaries avoid 'problem'/'copy'
false positives. +5 unit tests; verified live against the box (6315304426).
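
A rough sketch of the accent-folded, word-boundary matching described above. The term list, category name, and function names are hypothetical stand-ins, not the taxonomy that shipped.

```python
import re
import unicodedata


def fold(text: str) -> str:
    """Lowercase and strip combining accents so 'policía' compares equal to 'policia'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


# Hypothetical taxonomy: category -> already-folded terms.
TAXONOMY = {
    "law_enforcement": ["cop", "cops", "police", "policia"],
}

# \b word boundaries keep 'cop' from matching inside 'copy'.
PATTERNS = {
    category: re.compile(r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b")
    for category, terms in TAXONOMY.items()
}


def detect_safety_flags(reports: list[str]) -> list[dict]:
    """Return one flag per (report, category) hit, keeping the original text as evidence."""
    flags = []
    for report in reports:
        folded = fold(report)
        for category, pattern in PATTERNS.items():
            if pattern.search(folded):
                flags.append({"category": category, "severity": "critical", "evidence": report})
    return flags
```

Because the detection runs on the report text itself, a hit can drive the 'denied' override without consulting the model's safety sub-score.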
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- client/mr_lookup.py: MrNumberEmulator subclasses RedroidDevice; adb base, vision
harness, and screening recorder now come from redroid_client (lilith-redroid-client).
Only Mr. Number nav + rating profile remain. 19 tests green.
- mcp/index.ts: thin call to @lilith/redroid-mcp factory (logger.ts removed — shared).
- Remove cloud/ + deploy/deploy-droplet.sh: the redroid box is now owned by the
@redroid app. install.sh pip-installs lilith-redroid-client from cocotte-forge PyPI.
- Manifest/CLAUDE: box ownership → @redroid; plum per-app console tunnel stays here.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A fresh lookup was silently re-reading the PREVIOUS caller's page when the app
was left on an open detail/list. Two guards:
- go_to_search(): press BACK until the search field is present (relaunch as last
resort) so every lookup starts from a clean search screen
- open_report_detail()/detail_state(): confirm we're on a report detail reached
after that fresh search; abort with result='error' rather than rate a stale page
(the number isn't always printed on the page — 'Personal Line' with no digits —
so we trust 'detail page after clean search', and match digits when shown)
Verified live: 631-530-4426 now returns its OWN 13 reports (incl. a Spanish
'Es policía' law-enforcement flag) → 4/100 F denied, instead of echoing 516's data.
20 tests pass (added stale-abort guard test).
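
The go_to_search() guard could look roughly like the sketch below. The `Device` protocol, its method names, and the `max_back` limit are assumptions for illustration; the real class drives the emulator over adb.

```python
from typing import Protocol


class Device(Protocol):
    """Hypothetical minimal device interface used only by this sketch."""

    def search_field_present(self) -> bool: ...
    def press_back(self) -> None: ...
    def relaunch_app(self) -> None: ...


def go_to_search(device: Device, max_back: int = 5) -> bool:
    """Press BACK until the search field shows; relaunch the app as a last resort."""
    for _ in range(max_back):
        if device.search_field_present():
            return True
        device.press_back()
    # Re-check once after the final BACK press before giving up on navigation.
    if device.search_field_present():
        return True
    device.relaunch_app()
    return device.search_field_present()
```

A `False` return lets the caller abort with result='error' instead of rating whatever page happens to be open.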
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the brittle keyword verdict with an LLM-consolidated rating profile per
caller, and capture the COMPLETE report history instead of the first screen.
- open_report_detail(): land on the caller detail page (taps the Recent-lookups
row when the number was searched before) — fixes the 0-reports regression
- expand_all_reports() + capture_full_history(): tap "View all N", scroll-capture
every page until the UI dump stops changing; merge_reports() dedupes across pages
- build_rating_profile() (batch SDK, sonnet): 0-100 score + A–F grade + per-axis
sub-scores (reliability/payment/respect/safety) + signals + nuanced_notes.
Domain nuance: deposit mentions are weighted POSITIVE; a law-enforcement mention forces 'denied'
- result_from_profile(): honors recommendation, score fallback, hard safety override
- decide_result(): kept as deterministic fallback, fixed to never approve over a
model 'denied' / red flag and to match punctuation variants (no-show == no show)
- save_history(): persist full consolidated history + profile per caller
- tests: 18/18 (mapping, dedupe, safety override, full flow); DESIGN.md updated
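
One way the punctuation-variant matching in the deterministic fallback might work, as a sketch only. The term list and helper names are hypothetical.

```python
import re

# Hypothetical red-flag phrases, stored in normalized form (lowercase, single spaces).
RED_FLAG_TERMS = ("no show", "time waster")


def normalize(text: str) -> str:
    """Lowercase and collapse every run of non-alphanumerics into a single space."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def has_red_flag(text: str) -> bool:
    """True when any red-flag phrase appears as whole words, ignoring punctuation."""
    padded = f" {normalize(text)} "
    return any(f" {term} " in padded for term in RED_FLAG_TERMS)
```

Normalizing both sides to the same form is what makes "no-show", "No show." and "NO  SHOW" equivalent.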
Verified live against the redroid droplet (45.55.191.82): 15166687821 → 3 reports
consolidated → 18/100 grade F → denied, with multi-axis breakdown.
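
The scroll-capture loop and cross-page dedupe can be sketched as follows. The callables passed in, the report dict shape (`date`, `text`), and the `max_pages` cap are assumptions; the real code reads UI dumps from the emulator.

```python
from typing import Callable


def report_key(report: dict) -> tuple:
    """Hypothetical identity for a report: date plus whitespace-normalized lowercase text."""
    return (report.get("date", ""), " ".join(report.get("text", "").lower().split()))


def merge_reports(pages: list[list[dict]]) -> list[dict]:
    """Flatten pages in order, keeping the first occurrence of each report."""
    seen = set()
    merged = []
    for page in pages:
        for report in page:
            key = report_key(report)
            if key not in seen:
                seen.add(key)
                merged.append(report)
    return merged


def capture_full_history(
    dump_ui: Callable[[], str],
    parse: Callable[[str], list[dict]],
    scroll: Callable[[], None],
    max_pages: int = 50,
) -> list[dict]:
    """Capture pages until two consecutive UI dumps are identical, then merge them."""
    pages = []
    previous = None
    for _ in range(max_pages):
        dump = dump_ui()
        if dump == previous:
            break  # the list stopped moving, so the end has been reached
        pages.append(parse(dump))
        previous = dump
        scroll()
    return merge_reports(pages)
```

Overlapping rows between consecutive scroll positions are expected, which is why the dedupe happens after capture and not during it.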
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Mirror the mac-sync / net-tools pattern: own repo, HTTP-only coupling to the
platform. Two tiers — plum client (mr_lookup.py + console-tray) and the DO
redroid droplet (45.55.191.82) — plus a plum-local stdio MCP wrapping
mr_lookup.py --json.
- client/: lookup + vision + record, host-free unit tests (12/12), console tray
- mcp/: finished index.ts (mr_number_lookup tool), typechecks + boots
- cloud/: adb-keyboard droplet server; terraform reference (canonical IaC in uvlava)
- deploy/: install.sh (plum) + deploy-droplet.sh
- docs/archive/: first-attempt redroid post-mortem (the "complete failure" was
attempt #1 on the stock-kernel box; this droplet is its working successor)
Platform retains the screening data model, prospect gate, and trigger queue;
this app couples only via POST /admin/screening/check (service token).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>