Strong gate (operator-authorized, local-only, fail-soft): saved AddressBook
contacts are existing relationships/friends/vendors, so they are excluded from
the cold-prospect corpus (the matcher's 'unknown numbers only' rule). Removes
189/1631 (~12%) known contacts vs the first-message-date proxy's 68 (~4%).
Combined cold_prospect_handles = new-contact AND not-saved -> 1390 candidates
(~85%); the semantic not-a-prospect classes in the re-sweep clean up the
remaining unsaved existing clients and banter.
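
Rough sketch of the combined gate, for reviewers. This is not the code in the
repo: saved_handles_fail_soft, combine_gates, and the toy handles are
hypothetical names and data, and "fail-soft" is read here as "if the address
book cannot be read, skip this gate instead of aborting the sweep".

```python
from typing import Callable, Iterable, Set


def saved_handles_fail_soft(loader: Callable[[], Iterable[str]]) -> Set[str]:
    """Return saved handles, or an empty set if the loader fails (assumed fail-soft behavior)."""
    try:
        return set(loader())
    except Exception:
        # Gate is unavailable: exclude nothing rather than crash the sweep.
        return set()


def combine_gates(new_contact: Iterable[str], saved: Iterable[str]) -> Set[str]:
    """new-contact AND not-saved."""
    return set(new_contact) - set(saved)


# Toy data, made up for illustration only.
all_handles = {"+15550001", "+15550002", "+15550003", "+15550004"}
pre_existing = {"+15550001"}  # first message predates the work era
new_contact = all_handles - pre_existing

saved = saved_handles_fail_soft(lambda: ["+15550001", "+15550002"])
cold = combine_gates(new_contact, saved)
```

With the toy data, cold keeps only the two handles that are both new and
unsaved; if the loader raises, the result falls back to new_contact unchanged.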
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Data->model lane (mine). cold_prospect_handles(): handles whose first-ever
message is in the work era (Nov 1+) = new contacts, not pre-existing
relationships. sweep.py gets COLD_ONLY (default on). Honest scope: this cheap
CPU layer removes only ~4% (68/1631, the pre-work-era relationships); the bulk
of contamination (existing clients/friends first seen in the work era) needs
the stronger gates:
the AddressBook known/unknown signal (operator OK) + the semantic
not-a-prospect classes in the re-sweep. This is layer 1 of that stack.
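
Minimal sketch of the first-message filter, assuming messages arrive as
(handle, timestamp) pairs. cold_prospect_handles_sketch is a stand-in, not the
real cold_prospect_handles(); the year 2000 in the cutoff is a placeholder
because the message only says "Nov 1".

```python
from datetime import datetime
from typing import Dict, Iterable, Set, Tuple


def cold_prospect_handles_sketch(
    messages: Iterable[Tuple[str, datetime]],
    work_era_start: datetime,
) -> Set[str]:
    """Handles whose earliest message is on or after work_era_start."""
    first_seen: Dict[str, datetime] = {}
    for handle, ts in messages:
        # Track the earliest timestamp per handle; input order does not matter.
        if handle not in first_seen or ts < first_seen[handle]:
            first_seen[handle] = ts
    return {h for h, ts in first_seen.items() if ts >= work_era_start}


# Placeholder cutoff: only the month and day come from the commit message.
CUTOFF = datetime(2000, 11, 1)

toy_messages = [
    ("a", datetime(2000, 10, 15)),  # pre-existing relationship
    ("a", datetime(2000, 11, 5)),
    ("b", datetime(2000, 11, 2)),   # new contact
    ("c", datetime(2000, 12, 1)),
    ("c", datetime(2000, 11, 20)),  # out of order on purpose
]
new_contacts = cold_prospect_handles_sketch(toy_messages, CUTOFF)
```

Handle "a" is dropped because its first-ever message predates the cutoff, even
though it also wrote during the work era.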
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Real convos aren't clean alternating turns: ~38% of message-runs are bursts
(one sender, up to 132 in a row), and 5 group chats mix senders under
is_from_me=0. New lib.py collapses bursts into turns, excludes group chats
(keeps chat.style=45, i.e. 1:1 chats, only), and yields CLIENT->QUINN decision
points with a per-conversation cap (so verbose threads don't flood the set).
Corrected
corpus: 1623 1:1 work-era conversations, 16095 decision points (8129 at
max_per_handle=20). sweep.py now uses lib + WORKERS for vertical scaling.
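
Sketch of burst collapsing and capped decision points, under assumptions: a
message is an (is_from_me, text) pair in chronological order, is_from_me=True
is taken to be QUINN, and the cap keeps the first N points (the real lib.py
may join, select, or sample differently). Function names are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple

Message = Tuple[bool, str]  # (is_from_me, text)


def collapse_bursts(messages: List[Message]) -> List[Message]:
    """Merge consecutive messages from the same sender into a single turn."""
    turns = []
    for sender, run in groupby(messages, key=lambda m: m[0]):
        turns.append((sender, "\n".join(text for _, text in run)))
    return turns


def decision_points(turns: List[Message], max_per_handle: int = 20):
    """Return (context, reply) pairs where a client turn is followed by our reply."""
    points = []
    for i in range(len(turns) - 1):
        client_spoke = not turns[i][0]
        we_replied = turns[i + 1][0]
        if client_spoke and we_replied:
            # Context is everything up to and including the client turn.
            points.append((turns[: i + 1], turns[i + 1][1]))
            if len(points) >= max_per_handle:
                break  # assumed cap policy: keep the first N per conversation
    return points


toy = [
    (False, "hi"), (False, "you there?"),  # client burst
    (True, "yes"),
    (False, "price?"),
    (True, "$10"), (True, "per unit"),     # our burst
]
turns = collapse_bursts(toy)
points = decision_points(turns)
capped = decision_points(turns, max_per_handle=1)
```

Six raw messages collapse to four alternating turns and two decision points;
with max_per_handle=1 only the first is kept.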
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>