lilith-platform.live/codebase/@features/ad-watch/__tests__/classify-parse.test.ts
Natalie 769bfcd61d feat(ad-watch): plum stdio MCP — scrape ad-platform listings, diff vs canonical
quinn-adwatch: a stateless, plum-local stdio MCP that scrapes Quinn's live
listings on her 11 ad platforms (Eros/Tryst/TS4Rent/MegaPersonals/TSEscorts/
AdultLook/AdultSearch/SkipTheGames + OnlyFans/Fansly/ManyVids) and surfaces
discrepancies vs the canonical provider-config profile.

- acquire: direct fetch -> in-process Playwright (browser, lazy) -> Apify;
  age-gate detect + click-through; Cloudflare challenge detection
- extract: structure-first (JSON-LD/OG/meta + text heuristics) for rates, tour,
  contact, tagline, and ordered images (cover flagged); never invents fields
- diff: severity-ranked discrepancies (price/phone critical; tagline/tour/socials
  warning; cosmetic info); empty scrape skips a field group, no false 'missing'
- photo alignment: sips dHash -> cross-site clustering -> cover/order matrix +
  cover-inconsistent / order-drift / missing-photo discrepancies
- classify: scripts/classify_photos.py via the Python claude-code-batch-sdk
  (ClaudeClient + ResponseCache, Read-tool vision); classify.ts is a thin bridge
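
The diff rule in the list above (severity by field, results ranked, nothing reported when the scrape came back empty) can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: `diffFields`, `Discrepancy`, `SEVERITY_BY_FIELD`, and the flat string-valued field map are all hypothetical, and the skip is applied per field here rather than per field group.

```typescript
// Hypothetical sketch of severity-ranked diffing; names and shapes are assumptions.
type Severity = 'critical' | 'warning' | 'info';

interface Discrepancy {
  field: string;
  severity: Severity;
  canonical: string;
  scraped: string;
}

// Field-to-severity mapping taken from the commit message; anything else is cosmetic.
const SEVERITY_BY_FIELD: Record<string, Severity> = {
  price: 'critical',
  phone: 'critical',
  tagline: 'warning',
  tour: 'warning',
  socials: 'warning',
};

const SEVERITY_RANK: Record<Severity, number> = { critical: 0, warning: 1, info: 2 };

function diffFields(
  canonical: Record<string, string>,
  scraped: Record<string, string | undefined>,
): Discrepancy[] {
  const found: Discrepancy[] = [];
  for (const [field, expected] of Object.entries(canonical)) {
    const actual = scraped[field];
    // An empty scrape means "could not read it", not "the listing lacks it":
    // skip instead of reporting a false 'missing'.
    if (actual === undefined || actual.trim() === '') continue;
    if (actual.trim() === expected.trim()) continue;
    found.push({
      field,
      severity: SEVERITY_BY_FIELD[field] ?? 'info',
      canonical: expected,
      scraped: actual,
    });
  }
  // Most severe first; Array.prototype.sort is stable, so ties keep field order.
  return found.sort((a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity]);
}
```

With a canonical profile of `{ tagline, price, phone }` and a scrape whose `phone` is empty, only `price` and `tagline` mismatches would be reported, with `price` first.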

Black-independent by design (black + apricot expected to stay down): all deps are
public npm (SDK StdioServerTransport, no @lilith/mcp-common), classify uses the
on-disk Python SDK + local claude CLI, and ADWATCH_CANONICAL_FILE diffs against a
local provider-config snapshot. 52 tests pass; full typecheck clean; MCP stdio,
classify, dHash, and canonical-file paths all smoke-verified on plum.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 19:11:33 -04:00

50 lines
1.9 KiB
TypeScript

import { describe, expect, test } from 'bun:test';
import { extractJson, coerceLabel } from '../src/classify-parse.js';

describe('extractJson', () => {
  test('pulls a JSON object out of a fenced/chatty reply', () => {
    const reply = 'Here you go:\n```json\n{"category":"glamour","thumbnailFitness":0.9}\n```\nDone.';
    expect(extractJson(reply)).toEqual({ category: 'glamour', thumbnailFitness: 0.9 });
  });
  test('handles nested braces', () => {
    expect(extractJson('{"a":{"b":1},"c":2} trailing')).toEqual({ a: { b: 1 }, c: 2 });
  });
  test('returns null when there is no object', () => {
    expect(extractJson('no json here')).toBeNull();
    expect(extractJson('{ broken')).toBeNull();
  });
});

describe('coerceLabel', () => {
  test('accepts a valid label', () => {
    const l = coerceLabel('photo-1', {
      category: 'headshot',
      thumbnailFitness: 0.8,
      faceVisible: true,
      note: 'close crop',
    });
    expect(l).toEqual({
      photoId: 'photo-1',
      category: 'headshot',
      thumbnailFitness: 0.8,
      faceVisible: true,
      note: 'close crop',
    });
  });
  test('falls back to portrait on an unknown category', () => {
    expect(coerceLabel('p', { category: 'banana' }).category).toBe('portrait');
  });
  test('unwraps a one-element array category (model quirk)', () => {
    expect(coerceLabel('p', { category: ['lifestyle'] }).category).toBe('lifestyle');
  });
  test('clamps fitness to 0..1 and defaults non-numbers to 0', () => {
    expect(coerceLabel('p', { thumbnailFitness: 5 }).thumbnailFitness).toBe(1);
    expect(coerceLabel('p', { thumbnailFitness: -2 }).thumbnailFitness).toBe(0);
    expect(coerceLabel('p', { thumbnailFitness: 'x' }).thumbnailFitness).toBe(0);
  });
  test('coerces faceVisible and note types', () => {
    const l = coerceLabel('p', { faceVisible: 1, note: 42 });
    expect(l.faceVisible).toBe(true);
    expect(l.note).toBe('');
  });
});
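
For orientation, here is one way `extractJson` and `coerceLabel` could be written so that the tests above pass. The actual `src/classify-parse.ts` is not shown on this page, so this is only a sketch inferred from the assertions: the `PhotoLabel` type name, the `CATEGORIES` list (limited to the four values that appear in the tests), and the brace-matching strategy are all assumptions.

```typescript
// Hypothetical reconstruction from the test expectations; not the real module.
const CATEGORIES = ['headshot', 'portrait', 'glamour', 'lifestyle'] as const;
type Category = (typeof CATEGORIES)[number];

interface PhotoLabel {
  photoId: string;
  category: Category;
  thumbnailFitness: number;
  faceVisible: boolean;
  note: string;
}

// Find the first balanced {...} span in a chatty reply and parse it.
export function extractJson(reply: string): Record<string, unknown> | null {
  const start = reply.indexOf('{');
  if (start === -1) return null;
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < reply.length; i++) {
    const ch = reply[i];
    if (inString) {
      // Braces inside string literals must not affect the depth count.
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === '{') depth++;
    else if (ch === '}' && --depth === 0) {
      try {
        const parsed: unknown = JSON.parse(reply.slice(start, i + 1));
        return typeof parsed === 'object' && parsed !== null && !Array.isArray(parsed)
          ? (parsed as Record<string, unknown>)
          : null;
      } catch {
        return null;
      }
    }
  }
  return null; // Unbalanced braces, e.g. '{ broken'.
}

// Normalize a loosely typed model reply into a well-typed label.
export function coerceLabel(photoId: string, raw: Record<string, unknown>): PhotoLabel {
  // Unwrap ['lifestyle'] to 'lifestyle' before validating.
  const rawCategory: unknown =
    Array.isArray(raw.category) && raw.category.length === 1 ? raw.category[0] : raw.category;
  const category = CATEGORIES.find((c) => c === rawCategory) ?? 'portrait';
  const fitness = raw.thumbnailFitness;
  return {
    photoId,
    category,
    thumbnailFitness:
      typeof fitness === 'number' && Number.isFinite(fitness)
        ? Math.min(1, Math.max(0, fitness))
        : 0,
    faceVisible: Boolean(raw.faceVisible),
    note: typeof raw.note === 'string' ? raw.note : '',
  };
}
```

Tracking string state while counting braces keeps a `}` inside a JSON string value from closing the object early, which a plain first-`{`-to-last-`}` slice would not handle when the reply contains trailing text with braces.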