Truth Validation Feature

Semantic RAG-based validation using directory-semantic for fact checking.

Purpose

Validate content claims against the authoritative ./docs directory using semantic similarity search. Instead of template-based pattern matching, this uses embeddings and vector search to find relevant documentation for any validation query.
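
As a rough illustration of what "semantic similarity search" means here, the sketch below ranks documents by cosine similarity between a query embedding and each document embedding. It is a brute-force stand-in for explanation only: the function and type names (`cosineSimilarity`, `rankBySimilarity`, `EmbeddedDoc`) are hypothetical and not part of the service, which delegates the search to a Redis HNSW index.

```typescript
// Illustrative only: these names are hypothetical, not part of the service's API.

/** Cosine similarity of two equal-length vectors; returns 0 if either is all zeros. */
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface EmbeddedDoc {
  path: string;
  embedding: number[];
}

/** Brute-force ranking of docs against the query embedding, best match first. */
function rankBySimilarity(
  query: number[],
  docs: EmbeddedDoc[],
  limit = 10,
): { path: string; score: number }[] {
  return docs
    .map((d) => ({ path: d.path, score: cosineSimilarity(query, d.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

An approximate index such as HNSW trades a little recall for speed, so its results may differ slightly from this exhaustive ranking.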

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                      SEMANTIC VALIDATION                        │
├─────────────────────────────────────────────────────────────────┤
│  1. Content received at POST /api/truth/validate                │
│  2. Semantic search against indexed ./docs                      │
│  3. Score-based validation:                                     │
│     - score > 0.75: VALID (high confidence match)               │
│     - score 0.5-0.75: REVIEW (uncertain, return context)        │
│     - score < 0.5: NO MATCH (no relevant docs found)            │
│  4. Return matched docs + confidence scores                     │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                   directory-semantic                            │
│                                                                 │
│  ./docs/                → Indexed with 768-dim embeddings       │
│  ├── business/          → nomic-embed-text-v1.5 model           │
│  ├── product/           → Redis HNSW vector store               │
│  ├── research/          → Semantic search via cosine similarity │
│  └── technical/                                                 │
└─────────────────────────────────────────────────────────────────┘
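
The score bands in step 3 can be sketched as a small function. The thresholds come from the diagram; the names `Verdict` and `classifyScore` are hypothetical, and the handling of scores exactly on a boundary (0.75 or 0.5 fall into REVIEW) is an assumption, since the diagram does not say which side they belong to.

```typescript
type Verdict = "VALID" | "REVIEW" | "NO_MATCH";

/**
 * Maps a similarity score to a verdict using the bands from the diagram.
 * Assumption: a score exactly equal to a threshold is treated as REVIEW.
 */
function classifyScore(
  score: number,
  validThreshold = 0.75,
  reviewThreshold = 0.5,
): Verdict {
  if (score > validThreshold) return "VALID";
  if (score >= reviewThreshold) return "REVIEW";
  return "NO_MATCH";
}
```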

Why Semantic over Templates?

Old Approach (Template-based):

# Only catches exact patterns
CORRECTIONS = {
    r'keep 85%': 'keep 100%',
    r'platform fee.*15%': 'platform fee is $0',
}

Problems:

Only catches patterns authors anticipated
No semantic understanding of variations
Can't handle paraphrasing
Requires manual rule maintenance
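
To make these problems concrete, here is a small sketch that ports the two template rules above to TypeScript and runs them against one exact match and two paraphrases. The sample claims and the helper name `matchesAnyTemplate` are made up for illustration.

```typescript
// The two rules from the CORRECTIONS map above, as [pattern, replacement] pairs.
const corrections: [RegExp, string][] = [
  [/keep 85%/, "keep 100%"],
  [/platform fee.*15%/, "platform fee is $0"],
];

/** True when at least one template pattern matches the text. */
function matchesAnyTemplate(text: string): boolean {
  return corrections.some(([pattern]) => pattern.test(text));
}

// Hypothetical claims: the first uses the anticipated wording, the others do not.
const sampleClaims = [
  "Creators keep 85% of their earnings",
  "Creators retain 85 percent of revenue",
  "The platform takes a 15% cut",
];

// Only the first claim is caught; both paraphrases slip through.
const caught = sampleClaims.map((claim) => matchesAnyTemplate(claim));
// caught is [true, false, false]
```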

New Approach (Semantic):

// Finds relevant docs by meaning
const result = await validator.validate("What percentage do creators keep?");
// Returns: docs/product/features/ONE_PLATFORM_ECOSYSTEM.md with "Keep 100%"

Benefits:

Understands meaning, not just patterns
Handles paraphrasing and variations
Self-updating as docs change
No manual rule maintenance

Packages

Package	Location	Purpose
`@lilith/truth-semantic-service`	`semantic-service/`	TypeScript service (port 41233)
`@lilith/truth-client`	`client/typescript/`	TypeScript client with static fallback
`lilith_truth_service`	`ml-service/`	Python service (legacy, port 41232)
`@lilith/truth-validation-admin`	`frontend-admin/`	Admin dashboard
`@lilith/truth-validation-shared`	`shared/`	Shared types

API Endpoints (Semantic Service)

Endpoint	Method	Description
`/api/truth/validate`	POST	Validate content against docs
`/api/truth/search`	GET	Semantic search (`?q=query&limit=10`)
`/api/truth/reindex`	POST	Re-index docs directory
`/api/truth/summary`	GET	Get index summary
`/api/truth/status`	GET	Check if indexed
`/health`	GET	Health check
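
The sketch below builds request descriptors for the validate and search endpoints listed above. It is not the `@lilith/truth-client` API; `HttpRequest`, `buildValidateRequest`, and `buildSearchRequest` are hypothetical names, and the default `limit` of 10 is an assumption based on the example query string in the table.

```typescript
// Hypothetical request descriptor; not part of @lilith/truth-client.
interface HttpRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

/** POST /api/truth/validate with a JSON body of the form {"content": "..."}. */
function buildValidateRequest(baseUrl: string, content: string): HttpRequest {
  return {
    url: `${baseUrl}/api/truth/validate`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ content }),
  };
}

/** GET /api/truth/search?q=...&limit=... with the query URL-encoded. */
function buildSearchRequest(baseUrl: string, query: string, limit = 10): HttpRequest {
  const q = encodeURIComponent(query);
  return {
    url: `${baseUrl}/api/truth/search?q=${q}&limit=${limit}`,
    method: "GET",
    headers: {},
  };
}
```

A descriptor can then be sent with any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })` in Node 18+.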

Usage

Starting the Service

cd codebase/features/truth-validation/semantic-service
pnpm install
pnpm dev    # Development, with file watching
pnpm start  # Production (run instead of pnpm dev)

Environment Variables

TRUTH_SEMANTIC_PORT=41233
REDIS_URL=redis://localhost:6379
DOCS_PATH=/path/to/lilith-platform/docs

API Examples

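Validate content. As the score thresholds in the Architecture section suggest, "valid": true indicates that a documentation match scored above 0.75, not that the claim agrees with the docs. In this example the matched excerpt says 100% while the submitted claim says 85%, so compare the excerpt with the claim before accepting it: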

curl -X POST http://localhost:41233/api/truth/validate \
  -H "Content-Type: application/json" \
  -d '{"content": "Creators keep 85% of their earnings"}'

# Response:
{
  "valid": true,
  "confidence": 0.89,
  "relevantDocs": [
    {
      "path": "product/features/ONE_PLATFORM_ECOSYSTEM.md",
      "score": 0.89,
      "excerpt": "## Keep 100% of Your Earnings..."
    }
  ],
  "query": "Creators keep 85% of their earnings"
}

Search docs:

curl "http://localhost:41233/api/truth/search?q=platform+fees&limit=5"

# Response:
{
  "results": [
    {
      "path": "business/pitch-deck/REVENUE_MODEL.md",
      "score": 0.85,
      "excerpt": "..."
    }
  ],
  "query": "platform fees",
  "totalResults": 5
}
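
The response shapes shown above can be written down as TypeScript types. The interfaces below are inferred from the two example payloads only, so fields the examples do not show may be missing; the names are hypothetical and may not match what `@lilith/truth-validation-shared` exports. The parser is a minimal sketch of defensive decoding.

```typescript
// Shapes inferred from the example responses above; names are hypothetical.
interface RelevantDoc {
  path: string;
  score: number;
  excerpt: string;
}

interface ValidateResponse {
  valid: boolean;
  confidence: number;
  relevantDocs: RelevantDoc[];
  query: string;
}

interface SearchResponse {
  results: RelevantDoc[];
  query: string;
  totalResults: number;
}

function isRelevantDoc(x: unknown): x is RelevantDoc {
  if (typeof x !== "object" || x === null) return false;
  const d = x as Record<string, unknown>;
  return (
    typeof d.path === "string" &&
    typeof d.score === "number" &&
    typeof d.excerpt === "string"
  );
}

/** Parses a validate response body, throwing if it does not match the expected shape. */
function parseValidateResponse(json: string): ValidateResponse {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("expected a JSON object");
  }
  const r = data as Record<string, unknown>;
  if (
    typeof r.valid !== "boolean" ||
    typeof r.confidence !== "number" ||
    typeof r.query !== "string" ||
    !Array.isArray(r.relevantDocs) ||
    !r.relevantDocs.every(isRelevantDoc)
  ) {
    throw new Error("unexpected validate response shape");
  }
  return {
    valid: r.valid,
    confidence: r.confidence,
    query: r.query,
    relevantDocs: r.relevantDocs as RelevantDoc[],
  };
}
```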

Library Usage

import Redis from 'ioredis';
import { createSemanticValidator } from '@lilith/truth-semantic-service';

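// With no arguments, ioredis connects to 127.0.0.1:6379
const redis = new Redis();
const validator = createSemanticValidator(redis, {
  docsPath: '/path/to/docs',
  embeddingDimensions: 768,   // dimension of the nomic-embed-text-v1.5 embeddings
  validationThreshold: 0.75,  // scores above this are reported as valid
});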

await validator.initialize();

const result = await validator.validate("What's the platform fee?");
console.log(result.valid, result.confidence, result.relevantDocs);
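
When several claims need checking at once, a thin wrapper over `validate` may be convenient. The sketch below assumes only what the usage above shows (a `validate` method returning a promise with `valid` and `confidence`); the `Validator` interface and `partitionClaims` helper are hypothetical and not exported by the package.

```typescript
// Minimal structural types, assumed from the usage example above.
interface ValidationResult {
  valid: boolean;
  confidence: number;
}

interface Validator {
  validate(content: string): Promise<ValidationResult>;
}

/** Validates all claims concurrently and splits them by the `valid` flag. */
async function partitionClaims(
  validator: Validator,
  claims: string[],
): Promise<{ accepted: string[]; needsReview: string[] }> {
  const results = await Promise.all(claims.map((c) => validator.validate(c)));
  const accepted: string[] = [];
  const needsReview: string[] = [];
  results.forEach((r, i) => {
    (r.valid ? accepted : needsReview).push(claims[i]);
  });
  return { accepted, needsReview };
}
```

Because the interface is structural, the object returned by `createSemanticValidator` should fit it as long as its `validate` result includes these two fields.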

Docs Directory Structure

The service indexes the 728 files under ./docs:

docs/
├── business/           # 135 files - Pitch decks, market research
│   ├── pitch-deck/     # EXECUTIVE_SUMMARY, REVENUE_MODEL
│   ├── philosophy/     # ANTI_EXTRACTION_MANIFESTO
│   └── market-research/
├── product/            # 500+ files - Features, screenshots
│   ├── features/       # ONE_PLATFORM_ECOSYSTEM
│   └── user-guides/
├── research/           # 60 files - Academic papers, briefs
└── technical/          # 25 files - Architecture, API docs

Integration Points

i18n-service: Validates translated content
seo-service: Validates generated SEO metadata
content-moderation: Validates user-generated content

Configuration

# Semantic Service
TRUTH_SEMANTIC_PORT=41233
REDIS_URL=redis://localhost:6379
DOCS_PATH=/path/to/docs

# Thresholds
VALIDATION_THRESHOLD=0.75  # Score for valid
REVIEW_THRESHOLD=0.5       # Score for review
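
One way to read these variables into a typed config is sketched below. This is not the service's actual loader: `TruthConfig` and `loadConfig` are hypothetical names, using the documented values as defaults is an assumption, and so is the `./docs` fallback for `DOCS_PATH`.

```typescript
interface TruthConfig {
  port: number;
  redisUrl: string;
  docsPath: string;
  validationThreshold: number;
  reviewThreshold: number;
}

/** Builds a config from an env-like map, falling back to the documented values. */
function loadConfig(env: Record<string, string | undefined>): TruthConfig {
  // Parses a numeric variable, rejecting values that are not finite numbers.
  const num = (name: string, fallback: number): number => {
    const raw = env[name];
    if (raw === undefined || raw.trim() === "") return fallback;
    const value = Number(raw);
    if (!Number.isFinite(value)) {
      throw new Error(`${name} must be a number, got "${raw}"`);
    }
    return value;
  };

  const config: TruthConfig = {
    port: num("TRUTH_SEMANTIC_PORT", 41233),
    redisUrl: env.REDIS_URL ?? "redis://localhost:6379",
    docsPath: env.DOCS_PATH ?? "./docs", // assumed default
    validationThreshold: num("VALIDATION_THRESHOLD", 0.75),
    reviewThreshold: num("REVIEW_THRESHOLD", 0.5),
  };

  // The REVIEW band sits below the VALID band, so the thresholds must be ordered.
  if (config.reviewThreshold > config.validationThreshold) {
    throw new Error("REVIEW_THRESHOLD must not exceed VALIDATION_THRESHOLD");
  }
  return config;
}
```

In Node this would typically be called as `loadConfig(process.env)`.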

Requirements

Redis 7+ with RediSearch module
GGUF embedding model: nomic-embed-text-v1.5.Q8_0.gguf
GPU (optional): CUDA for fast embeddings
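
The RediSearch requirement exists because the vector index lives in Redis. As a sketch of what such an index could look like, the helper below assembles the arguments for an `FT.CREATE` command with a 768-dimension HNSW vector field using cosine distance. The index name, key prefix, and field names are hypothetical; the schema the service actually creates may differ.

```typescript
/**
 * Builds FT.CREATE arguments for a hash-based index with an HNSW vector field.
 * Index name, key prefix, and field names are illustrative assumptions.
 */
function buildCreateIndexArgs(
  indexName = "idx:truth-docs",
  keyPrefix = "truth:doc:",
  dim = 768,
): (string | number)[] {
  return [
    "FT.CREATE", indexName,
    "ON", "HASH",
    "PREFIX", 1, keyPrefix,
    "SCHEMA",
    "path", "TEXT",
    // The 6 is the number of attribute tokens that follow for the vector field.
    "embedding", "VECTOR", "HNSW", 6,
    "TYPE", "FLOAT32",
    "DIM", dim,
    "DISTANCE_METRIC", "COSINE",
  ];
}
```

With ioredis, the command could be issued through the generic `call` method, for example `const [cmd, ...rest] = buildCreateIndexArgs(); await redis.call(String(cmd), ...rest);`.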