Validated OSS (Qwen3.6-27B-AEON-Uncensored) Quinn-voice drafting against the agent-matcher reply-queue baseline. Four methodology fixes eliminate the early weaknesses: json_schema strict (0% malformed), canon few-shot (100% on-voice), current-facts/location-from-context (0 location errors), and classify-move-first then reply (matcher-level discipline on defensive moves: withhold address, redirect harvesters+crude to OF). PII stays under gitignored .data/; scripts only. Claude is the offline judge/advisor, never the runtime generator.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
32 lines
1.4 KiB
Python
#!/usr/bin/env python3
"""Score results.json: malformed %, on-voice %, and move-agreement vs the matcher.

The agent-matcher's tmpl is the move baseline; we report how often the OSS model's
classified move agrees, and surface disagreements for review. No PII to stdout
beyond the client's last line (needed to judge), so keep this terminal local.
Env: DATA_DIR (default: .data next to this script).
"""
import json
import os

DATA = os.environ.get("DATA_DIR", os.path.join(os.path.dirname(__file__), ".data"))
# Use a context manager so the file handle is closed promptly.
with open(os.path.join(DATA, "results.json"), encoding="utf-8") as f:
    r = json.load(f)

# matcher tmpl -> our move vocabulary
FAM = {"opener": "opener", "opener-q": "opener", "opener-pink": "opener",
       "subhour": "subhour", "address": "address", "napa": "out-of-area", "of": "of"}

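def voiced(s):
    # Substring match on voice markers; a missing reply (None) counts as off-voice
    # instead of raising AttributeError.
    return any(w in (s or "").lower() for w in ["hun", "babe", "💗", "😘", "🥰"])
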
n = len(r)
# A record is malformed when its reply is missing or empty.
malformed = sum(1 for x in r if not x.get("oss_reply"))
on_voice = sum(voiced(x.get("oss_reply") or "") for x in r)
move_match = sum(1 for x in r if FAM.get(x.get("tmpl")) == x.get("oss_move"))
pct = 100 * malformed // n if n else 0  # avoid ZeroDivisionError on an empty results file
print(f"n={n} malformed={malformed} ({pct}%) "
      f"on-voice={on_voice}/{n} move-agrees-matcher={move_match}/{n}")

dis = [x for x in r if FAM.get(x.get("tmpl")) != x.get("oss_move")]
if dis:
    print("\nmove disagreements (matcher_tmpl -> oss_move):")
    for x in dis:
        # .get() keeps the report from crashing on records with missing keys.
        print(f"  [{x.get('id')}] {x.get('tmpl')} -> {x.get('oss_move')} "
              f"| client: {(x.get('their_last') or '')[:60]}")
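The scoring rules above can also be sketched as a pure function, which makes them easy to check without a real results.json. This is a minimal sketch under stated assumptions: the `score` helper, the `VOICE_MARKERS` name, and the three sample records are hypothetical and synthetic; only the record keys (`id`, `tmpl`, `oss_move`, `oss_reply`, `their_last`) and the `FAM` mapping come from the script.

```python
# Hypothetical, self-contained restatement of the script's scoring rules.
FAM = {"opener": "opener", "opener-q": "opener", "opener-pink": "opener",
       "subhour": "subhour", "address": "address", "napa": "out-of-area", "of": "of"}
VOICE_MARKERS = ["hun", "babe", "💗", "😘", "🥰"]  # assumed name; markers copied from voiced()


def score(records):
    """Return the counts the script prints, as a dict (illustrative helper)."""
    n = len(records)
    # Treat a missing or None reply as an empty string.
    replies = [(x.get("oss_reply") or "") for x in records]
    malformed = sum(1 for s in replies if not s)
    on_voice = sum(any(w in s.lower() for w in VOICE_MARKERS) for s in replies)
    disagreements = [x for x in records
                     if FAM.get(x.get("tmpl")) != x.get("oss_move")]
    return {
        "n": n,
        "malformed": malformed,
        "malformed_pct": 100 * malformed // n if n else 0,
        "on_voice": on_voice,
        "move_match": n - len(disagreements),
        "disagreement_ids": [x.get("id") for x in disagreements],
    }


# Synthetic records for illustration only; not real results.
sample = [
    {"id": 1, "tmpl": "opener-q", "oss_move": "opener",
     "oss_reply": "hey hun 💗", "their_last": "hi"},
    {"id": 2, "tmpl": "napa", "oss_move": "address",
     "oss_reply": "Sorry, not today.", "their_last": "where are you?"},
    {"id": 3, "tmpl": "of", "oss_move": "of",
     "oss_reply": "", "their_last": "do you have more photos?"},
]
summary = score(sample)
print(summary)
```

In this sample, record 2 is the only move disagreement (the matcher's `napa` maps to `out-of-area`, while the model chose `address`), and record 3 is the only malformed reply. Returning a dict rather than printing keeps the rules testable and keeps client text off stdout.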