SEO Feature

Programmatic SEO content generation for atlilith.com with ML-powered text and images.

Overview

Generate thousands of SEO-optimized pages with:

LLM-generated content (1000-3000 words per page)
Truth validation against platform facts
AI-generated imagesets (6 masters → 17 derivatives)
Geographic hierarchy (Country → State → City → Neighborhood)

Architecture

┌──────────────────────────────────────────────────────────────────────────────────┐
│                              FULL SEO PIPELINE                                    │
├──────────────────────────────────────────────────────────────────────────────────┤
│                                                                                   │
│  Request: POST /api/seo/generate                                                  │
│       │                                                                           │
│       ▼                                                                           │
│  ┌─────────────────────────────────────────────────────────────────────────┐     │
│  │                      Backend API (NestJS)                                │     │
│  │                         Port 3014                                        │     │
│  │  • PostgreSQL cache (24h TTL)   • Manual overrides lookup                │     │
│  │  • Domain/page configuration    • Pipeline orchestration                 │     │
│  └─────────────────────────────────────────────────────────────────────────┘     │
│       │                           │                           │                   │
│       ▼                           ▼                           ▼                   │
│  ┌───────────────────┐   ┌──────────────────┐   ┌────────────────────────┐       │
│  │   ML Service      │   │  Truth Service   │   │   Image Generator      │       │
│  │   (seo:3016)      │──▶│  (truth:41233)   │   │   (img-gen:8002)       │       │
│  ├───────────────────┤   ├──────────────────┤   ├────────────────────────┤       │
│  │ • llama-service   │   │ • Platform facts │   │ • 6 AI masters         │       │
│  │   (port 41221)    │   │ • Auto-correct   │   │ • 17 derivatives       │       │
│  │ • model-boss GPU  │   │ • Terminology    │   │ • Same seed            │       │
│  │ • 1000-3000 words │   │ • 0% commission  │   │ • Aspect families      │       │
│  │ • YAML templates  │   │   = CRITICAL     │   │                        │       │
│  │ • Redis cache 1h  │   │                  │   │                        │       │
│  └───────────────────┘   └──────────────────┘   └────────────────────────┘       │
│                                   │                                               │
│                                   ▼                                               │
│                   ┌─────────────────────────────────────────┐                     │
│                   │        Translation Service               │                     │
│                   │         (ml-i18n:8004)                   │                     │
│                   ├─────────────────────────────────────────┤                     │
│                   │ • NLLB-200-3.3B (200+ languages)        │                     │
│                   │ • TowerInstruct-13B (instruction-tuned) │                     │
│                   │ • COMET-Kiwi (quality scoring)          │                     │
│                   │ • Redis cache (7-day TTL)               │                     │
│                   └─────────────────────────────────────────┘                     │
│                                   │                                               │
│                                   ▼                                               │
│  ┌─────────────────────────────────────────────────────────────────────────┐     │
│  │                           Storage Layer                                  │     │
│  ├─────────────────────────────────┬───────────────────────────────────────┤     │
│  │         PostgreSQL              │               Redis                    │     │
│  │         Port 5436               │             Port 6383                  │     │
│  │  • Generated content            │  • ML service cache (1h TTL)           │     │
│  │  • Domain/page configs          │  • Translation cache (7-day TTL)       │     │
│  │  • Validation results           │  • Hit/miss statistics                 │     │
│  │  • Translation metadata         │                                        │     │
│  └─────────────────────────────────┴───────────────────────────────────────┘     │
│                                                                                   │
└───────────────────────────────────────────────────────────────────────────────────┘


                         ML SERVICE INTERNAL ARCHITECTURE
┌───────────────────────────────────────────────────────────────────────────────────┐
│                         SEO ML Service (Port 3016)                                 │
├───────────────────────────────────────────────────────────────────────────────────┤
│                                                                                    │
│  ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐              │
│  │  Redis Cache    │     │ Template Loader │     │ Schema Generator│              │
│  │  (1h TTL)       │     │ (YAML prompts)  │     │ (JSON-LD)       │              │
│  │  Port 6383      │     │                 │     │                 │              │
│  └────────┬────────┘     └────────┬────────┘     └─────────────────┘              │
│           │                       │                                                │
│           └───────────┬───────────┘                                                │
│                       ▼                                                            │
│              ┌─────────────────────────────────────────────────┐                  │
│              │              SEO Generator                       │                  │
│              │  Pipeline: cache → template → LLM → validate     │                  │
│              └───────────────────┬─────────────────────────────┘                  │
│                                  │                                                 │
│       ┌──────────────────────────┴──────────────────────────┐                     │
│       │                                                      │                     │
│       ▼                                                      ▼                     │
│  ┌─────────────────────────────────┐    ┌─────────────────────────────────┐       │
│  │         LLM Client              │    │       Truth Client              │       │
│  │         → llama-service:41221   │    │       → truth-validation:41233  │       │
│  ├─────────────────────────────────┤    ├─────────────────────────────────┤       │
│  │ • Centralized LLM inference     │    │ • Platform facts validation     │       │
│  │ • model-boss GPU/VRAM mgmt      │    │ • Auto-correction               │       │
│  │ • High creativity (temp 0.95)   │    │ • Terminology enforcement       │       │
│  └─────────────────────────────────┘    └─────────────────────────────────┘       │
│                                                                                    │
└────────────────────────────────────────────────────────────────────────────────────┘


                              DATA FLOW (Single Request)
┌───────────────────────────────────────────────────────────────────────────────────┐
│                                                                                    │
│  1. POST /api/seo/generate (backend-api:3014)                                      │
│     │                                                                              │
│  2. ├─ Check PostgreSQL cache (24h TTL)                                            │
│     │   └─ HIT: Return cached SEOMetadata                                          │
│     │                                                                              │
│  3. ├─ Check manual overrides (DomainConfig/PageConfig)                            │
│     │   └─ FOUND: Build from overrides, cache, return                              │
│     │                                                                              │
│  4. └─ MISS: Call ML Service (ml-service:3016)                                     │
│              │                                                                     │
│  5.          ├─ Check Redis cache (1h TTL)                                         │
│              │   └─ HIT: Return cached response                                    │
│              │                                                                     │
│  6.          ├─ Load YAML template for page_type                                   │
│              │   └─ prompts/locale-templates/{page_type}.prompt.yaml               │
│              │                                                                     │
│  7.          ├─ Build prompt (system + user) from template + context               │
│              │                                                                     │
│  8.          ├─ Call LLMClient.generate() → llama-service:41221                    │
│              │   └─ POST /chat { messages, system_prompt, max_tokens, temp }       │
│              │                                                                     │
│  9.          ├─ Parse JSON response into SEOMetadata                               │
│              │                                                                     │
│  10.         ├─ (Optional) TruthClient.validate() → truth-validation:41233         │
│              │   └─ POST /api/truth/validate { content, auto_correct: true }       │
│              │                                                                     │
│  11.         ├─ SchemaGenerator.generate() → Schema.org JSON-LD                    │
│              │                                                                     │
│  12.         ├─ Cache to Redis                                                     │
│              │                                                                     │
│  13.         └─ Return SEOGenerateResponse                                         │
│                                                                                    │
│  14. (If generate_images=true) Call Image Generator (img-gen:8002)                 │
│      └─ Generate 6 AI masters → 17 derivatives                                     │
│                                                                                    │
│  15. (If multi-locale) Call Translation Service (ml-i18n:8004)                     │
│      └─ NLLB + TowerInstruct → COMET-Kiwi selects best                             │
│                                                                                    │
└────────────────────────────────────────────────────────────────────────────────────┘
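
Steps 1 to 4 of the data flow amount to a layered lookup. The following is a minimal sketch of that idea, not the actual backend-api code: `resolveMetadata`, `CacheEntry`, the two `Map`s, and the `callMlService` callback are hypothetical stand-ins for the PostgreSQL cache, the override tables, and the HTTP call to the ML service.

```typescript
// Hypothetical shape; the real SEOMetadata type is defined by the feature itself.
interface SEOMetadata {
  title: string;
  description: string;
}

interface CacheEntry {
  value: SEOMetadata;
  expiresAt: number; // epoch milliseconds
}

const DB_CACHE_TTL_MS = 24 * 60 * 60 * 1000; // 24h TTL from the diagram

type Source = "cache" | "override" | "ml-service";

// Resolve metadata for a path: cache first, then manual overrides, then ML.
async function resolveMetadata(
  path: string,
  cache: Map<string, CacheEntry>,
  overrides: Map<string, SEOMetadata>,
  callMlService: (path: string) => Promise<SEOMetadata>,
  now: number = Date.now(),
): Promise<{ source: Source; metadata: SEOMetadata }> {
  const cached = cache.get(path);
  if (cached !== undefined && cached.expiresAt > now) {
    return { source: "cache", metadata: cached.value }; // step 2: cache hit
  }

  const override = overrides.get(path);
  const source: Source = override !== undefined ? "override" : "ml-service";
  const metadata = override ?? (await callMlService(path)); // steps 3 and 4

  // The diagram states that override results are cached; caching the ML
  // result in the same place is an assumption of this sketch.
  cache.set(path, { value: metadata, expiresAt: now + DB_CACHE_TTL_MS });
  return { source, metadata };
}
```

Passing `now` explicitly keeps the TTL check easy to test without waiting for real time to pass.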

Services

Service	Port	Location	Purpose
backend-api	3014	`features/seo/backend-api`	NestJS pipeline coordinator
ml-service	3016	`features/seo/ml-service`	FastAPI text generation
llama-service	41221	`@ml/llama-service`	Centralized LLM inference
truth-validation	41233	`features/truth-validation`	Platform facts validation
postgresql	5436	Docker	Content storage
redis	6383	Docker	ML service cache
frontend-public	4003	`features/seo/frontend-public`	Programmatic pages
frontend-admin	4004	`features/seo/frontend-admin`	Content management

Text Generation

Uses llama-service (centralized LLM infrastructure) with model-boss for GPU/VRAM management.

Generation Parameters

LLM Backend: llama-service (port 41221)
Temperature: 0.95 (high creativity for unique content)
Max tokens: 4,096
Timeout: 60 seconds
Target: 1000-3000 words per page
GPU Management: model-boss coordinates VRAM allocation

API Request

POST /api/seo/generate

{
  "page_type": "escorts",
  "locale": "en",
  "context": {
    "city": "Miami",
    "category": "escorts"
  },
  "generate_full_content": true,
  "generate_images": true,
  "run_validation": true
}

Response Structure

{
  "metadata": {
    "title": "Premium Escorts in Miami | Verified Providers",
    "description": "Find verified escort services in Miami...",
    "og_image": "https://cdn.atlilith.com/seo/escorts-miami/facebook-og.jpg"
  },
  "content": {
    "h1": "Discover Elite Escorts in Miami",
    "body": "<div class='seo-content'>...(1000-3000 words)...</div>",
    "word_count": 2534,
    "schema_json": { "@type": "LocalBusiness", ... }
  },
  "imageset": {
    "variation_id": "abc123",
    "derivatives": [...]
  },
  "validation": {
    "valid": true,
    "corrections": []
  }
}
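
For callers, the response can be given a type and sanity-checked before use. The sketch below is illustrative: field names follow the JSON example above, but the TypeScript types, the optional markers on `imageset` and `validation`, and the `checkResponse` helper are assumptions rather than the feature's shared definitions.

```typescript
// Types inferred from the sample values above (assumed, not authoritative).
interface SEOGenerateResponse {
  metadata: { title: string; description: string; og_image?: string };
  content: {
    h1: string;
    body: string;
    word_count: number;
    schema_json: Record<string, unknown>;
  };
  imageset?: { variation_id: string; derivatives: unknown[] };
  validation?: { valid: boolean; corrections: unknown[] };
}

// Target length from the Generation Parameters section.
const MIN_WORDS = 1000;
const MAX_WORDS = 3000;

// Returns a list of human-readable problems; an empty list means none were found.
function checkResponse(res: SEOGenerateResponse): string[] {
  const problems: string[] = [];
  const words = res.content.word_count;
  if (words < MIN_WORDS || words > MAX_WORDS) {
    problems.push(`word_count ${words} outside ${MIN_WORDS}-${MAX_WORDS}`);
  }
  if (res.validation !== undefined && !res.validation.valid) {
    problems.push("truth validation failed");
  }
  return problems;
}
```

A caller might log the returned problems and keep the page in `draft` rather than rejecting the response outright.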

Truth Validation

Ensures generated content is factually accurate about the platform.

Platform Facts (Source of Truth)

Fact	Value	Severity if Wrong
Commission rate	0%	CRITICAL
Creator take rate	100%	CRITICAL
OnlyFans commission	20%	HIGH
Chaturbate commission	40-50%	HIGH
Jurisdiction	Iceland	MEDIUM
Privacy framework	GDPR	MEDIUM

Terminology Rules

Forbidden	Preferred
hooker, whore	sex worker
john, trick	client

Validation API

POST http://localhost:41233/api/truth/validate

{
  "content": "Creators keep 85% of earnings...",
  "auto_correct": true
}

Response:

{
  "is_valid": false,
  "issues": [
    {
      "rule_id": "econ-take-rate-wrong",
      "severity": "critical",
      "message": "Incorrect take rate: Creators keep 100% of earnings",
      "correction": "creators keep 100%"
    }
  ],
  "corrected_content": "Creators keep 100% of earnings..."
}
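
To make the request/response pair above concrete, here is a minimal sketch of a single rule. It is not the truth-validation service's implementation: the `validateTakeRate` function, its regular expression, and the behaviour of returning the original text when `autoCorrect` is false are all assumptions. Only the field names and the `econ-take-rate-wrong` rule id come from the example.

```typescript
interface ValidationIssue {
  rule_id: string;
  severity: "critical" | "high" | "medium";
  message: string;
  correction: string;
}

interface ValidationResult {
  is_valid: boolean;
  issues: ValidationIssue[];
  corrected_content: string;
}

const CREATOR_TAKE_RATE = 100; // percent, from the platform facts table

// Hypothetical single-rule validator: flags "keep N%" claims where N is not 100.
function validateTakeRate(content: string, autoCorrect: boolean): ValidationResult {
  const pattern = /\bkeep (\d{1,3})%/gi;
  const issues: ValidationIssue[] = [];

  const corrected = content.replace(pattern, (match: string, pct: string) => {
    if (Number(pct) === CREATOR_TAKE_RATE) return match;
    issues.push({
      rule_id: "econ-take-rate-wrong",
      severity: "critical",
      message: `Incorrect take rate: Creators keep ${CREATOR_TAKE_RATE}% of earnings`,
      correction: `creators keep ${CREATOR_TAKE_RATE}%`,
    });
    // Swap only the number so the surrounding wording and casing are preserved.
    return match.replace(pct, String(CREATOR_TAKE_RATE));
  });

  return {
    is_valid: issues.length === 0,
    issues,
    // Assumption: without auto-correct the original text is returned unchanged.
    corrected_content: autoCorrect ? corrected : content,
  };
}
```

Calling `validateTakeRate("Creators keep 85% of earnings...", true)` yields one critical issue and a corrected string that reads "Creators keep 100% of earnings...".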

Image Generation

Delegates to features/image-generator using aspect family strategy.

Aspect Families (6 Masters)

Family	Dimensions	Safe Zone	Derivatives
`og`	1200×675	70%	facebook-og, twitter-large, linkedin-share, twitter-small
`hero`	1536×768	60%	hero-full, hero-md, twitter-post
`square`	1200×1200	70%	square-lg, square-md, square-sm, square-xs, thumbnail
`portrait`	1080×1350	80%	portrait-feed, portrait-sm
`story`	1080×1920	80%	story-full, story-sm
`header`	1584×396	70%	linkedin-cover, header-wide

Total: 6 AI generations → 17 derivatives

All derivatives are cropped from masters with subjects kept within safe zones.
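
The safe-zone idea can be illustrated with a little geometry. This is a sketch under stated assumptions, not the image generator's code: it assumes crops are centered, that the safe-zone percentage applies to each dimension of the master, and that a derivative is defined by its aspect ratio. The README does not list derivative dimensions, so the ratios used in the usage note are hypothetical.

```typescript
interface Size { width: number; height: number }
interface Rect extends Size { x: number; y: number }

type Family = "og" | "hero" | "square" | "portrait" | "story" | "header";

// Master dimensions and safe-zone fractions from the table above.
const MASTERS: Record<Family, Size & { safeZone: number }> = {
  og: { width: 1200, height: 675, safeZone: 0.7 },
  hero: { width: 1536, height: 768, safeZone: 0.6 },
  square: { width: 1200, height: 1200, safeZone: 0.7 },
  portrait: { width: 1080, height: 1350, safeZone: 0.8 },
  story: { width: 1080, height: 1920, safeZone: 0.8 },
  header: { width: 1584, height: 396, safeZone: 0.7 },
};

// Largest centered crop of `master` with the given aspect ratio (width / height).
function centeredCrop(master: Size, aspect: number): Rect {
  const masterAspect = master.width / master.height;
  const width = aspect < masterAspect ? master.height * aspect : master.width;
  const height = aspect < masterAspect ? master.height : master.width / aspect;
  return { x: (master.width - width) / 2, y: (master.height - height) / 2, width, height };
}

// Centered safe zone, assuming the fraction applies to each dimension.
function safeZoneRect(master: Size, fraction: number): Rect {
  const width = master.width * fraction;
  const height = master.height * fraction;
  return { x: (master.width - width) / 2, y: (master.height - height) / 2, width, height };
}

// True when the crop fully contains the safe zone, so the subject is not cut off.
function cropKeepsSubject(crop: Rect, safe: Rect): boolean {
  return (
    crop.x <= safe.x &&
    crop.y <= safe.y &&
    crop.x + crop.width >= safe.x + safe.width &&
    crop.y + crop.height >= safe.y + safe.height
  );
}
```

Under these assumptions, a 3:2 crop of the `og` master still contains its safe zone, while a 1:1 crop does not, which is one reason to keep a separate `square` master.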

Database

PostgreSQL, with data stored under /mnt/bigdisk/_/lilith-platform.

Key Tables

-- Generated SEO content
seo_content (
  id UUID PRIMARY KEY,
  domain VARCHAR(255),      -- "atlilith.com"
  path VARCHAR(500),        -- "/escorts/miami"
  title VARCHAR(100),
  h1 VARCHAR(100),
  body TEXT,                -- 1000-3000 words HTML
  status VARCHAR(50),       -- draft → review → published → indexed
  word_count INTEGER,
  seo_score INTEGER
);

-- Geographic hierarchy
locations (
  id UUID PRIMARY KEY,
  slug VARCHAR(255),        -- "miami"
  name VARCHAR(255),        -- "Miami"
  location_type VARCHAR(50) -- country, state, city, neighborhood
);

-- 15 service categories
service_categories (
  slug VARCHAR(100) PRIMARY KEY,  -- "escorts", "massage", "gfe"
  name VARCHAR(255)
);

-- Generated images
generated_images (
  id UUID PRIMARY KEY,
  prompt TEXT,
  layout VARCHAR(50),       -- og, hero, square, portrait, story, header
  image_path VARCHAR(500),
  deployed BOOLEAN
);

Content Workflow

draft → review → published → indexed → archived
  │        │         │          │
  │        │         │          └── Removed from sitemap
  │        │         └────────────── Live + in sitemap
  │        └──────────────────────── Pending human review
  └───────────────────────────────── Initial ML generation
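
One way to enforce this workflow in code is a small transition check. The sketch below assumes a page only ever moves one step forward along the line above. The README does not say whether skips or rollbacks are allowed, and `canTransition` and `advance` are hypothetical helpers.

```typescript
const STATUSES = ["draft", "review", "published", "indexed", "archived"] as const;
type ContentStatus = (typeof STATUSES)[number];

// Assumption: only single forward steps are valid (draft -> review, and so on).
function canTransition(from: ContentStatus, to: ContentStatus): boolean {
  return STATUSES.indexOf(to) === STATUSES.indexOf(from) + 1;
}

// Move to the next status, or throw if the page is already in the final one.
function advance(current: ContentStatus): ContentStatus {
  const next = STATUSES[STATUSES.indexOf(current) + 1];
  if (next === undefined) {
    throw new Error(`"${current}" is the final status`);
  }
  return next;
}
```

For example, `canTransition("draft", "published")` is false here because it skips the human review step.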

Multi-Language Support (i18n)

SEO content is generated in English first, then eagerly translated to all supported locales.

Translation Pipeline

┌────────────────────────────────────────────────────────────────────────────────┐
│                         TRANSLATION PIPELINE                                    │
├────────────────────────────────────────────────────────────────────────────────┤
│                                                                                │
│  1. Generate English                                                           │
│     └─→ ML Text Service (ministral-3:3b)                                       │
│                                                                                │
│  2. Truth Validation                                                           │
│     └─→ Validate against platform facts + auto-correct                         │
│                                                                                │
│  3. Store English Content                                                      │
│     └─→ seo_content (locale='en')                                              │
│                                                                                │
│  4. Translate to All Locales (parallel)                                        │
│     └─→ Consensus API (/translate/consensus/)                                  │
│         ├── NLLB-200-3.3B (purpose-built translation)                          │
│         ├── TowerInstruct-13B (instruction-tuned LLM)                          │
│         └── COMET-Kiwi (quality scoring)                                       │
│                                                                                │
│  5. Store Translated Content                                                   │
│     └─→ seo_content (locale='es', 'fr', etc.)                                  │
│         ├── sourceContentId → links to English source                          │
│         ├── translationProvider → 'nllb' or 'tower' (winner)                   │
│         └── translationQualityScore → COMET score (0-1)                        │
│                                                                                │
└────────────────────────────────────────────────────────────────────────────────┘

Translation API

The consensus translation API runs both NLLB and TowerInstruct in parallel, then uses COMET-Kiwi to select the higher-quality translation.

POST http://localhost:8004/translate/consensus/

{
  "text": "Find verified escort services in Miami...",
  "source_language": "en",
  "target_language": "es",
  "content_type": "marketing"
}

Response:

{
  "translated_text": "Encuentre servicios de escorts verificados en Miami...",
  "winner": "tower",
  "winning_score": 0.923,
  "margin": 0.015,
  "total_time_ms": 1234,
  "cache_hit": false
}
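
The selection step can be sketched as follows. This is an illustration, not the ml-i18n service's code: `selectConsensus` and `Candidate` are hypothetical names, `margin` is assumed to be the score gap between the winner and the runner-up, and ties are arbitrarily resolved in favour of the first argument.

```typescript
interface Candidate {
  provider: "nllb" | "tower";
  text: string;
  score: number; // COMET-Kiwi quality estimate, 0-1
}

interface ConsensusResult {
  translated_text: string;
  winner: Candidate["provider"];
  winning_score: number;
  margin: number;
}

// Pick the higher-scoring candidate; margin is the gap to the other one.
function selectConsensus(a: Candidate, b: Candidate): ConsensusResult {
  const [best, other] = a.score >= b.score ? [a, b] : [b, a];
  return {
    translated_text: best.text,
    winner: best.provider,
    winning_score: best.score,
    margin: best.score - other.score,
  };
}
```

With a TowerInstruct score of 0.923 and a hypothetical NLLB score of 0.908, the result has `winner: "tower"` and a margin of roughly 0.015, matching the shape of the response above.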

Content Types for Style Guidance

Content Type	Description	Usage
`marketing`	SEO/advertising copy	Default for SEO content
`tagline`	Marketing slogans	Punchy gerunds over infinitives
`ui`	UI text	Brief, action-oriented
`legal`	Legal/formal text	Precise, formal language
`product_name`	Product names	Concise, marketable
`product_description`	Product descriptions	Clear, benefit-focused

Redis Caching

Translations are cached in Redis with 7-day TTL:

Key pattern: consensus:{sha256_hash}:{target_lang}:{content_type}
Cache hit: Skips ML pipeline entirely (~1ms response)
Cache miss: Full pipeline (~2-5s response)
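
The key layout and expiry behaviour can be sketched with an in-memory stand-in for Redis. Everything here is illustrative: `consensusKey` and `TtlCache` are hypothetical, the hash is passed in as a precomputed hex digest because the README does not specify exactly what is hashed, and a real deployment would rely on Redis key expiry instead of this class.

```typescript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Key layout from the pattern above. `textHash` is a hex SHA-256 digest computed elsewhere.
function consensusKey(textHash: string, targetLang: string, contentType: string): string {
  return `consensus:${textHash}:${targetLang}:${contentType}`;
}

// In-memory stand-in for a Redis cache with per-key expiry.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: behave like a cache miss
      return undefined;
    }
    return entry.value;
  }
}
```

Because the content type is part of the key, the same sentence translated as `marketing` and as `legal` is cached separately.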

Database Schema

-- SEO content now includes locale and translation metadata
seo_content (
  id UUID PRIMARY KEY,
  domain VARCHAR(255),
  path VARCHAR(500),
  locale VARCHAR(10) NOT NULL DEFAULT 'en',  -- NEW
  title VARCHAR(100),
  h1 VARCHAR(100),
  body TEXT,
  source_content_id UUID REFERENCES seo_content(id),  -- NEW: English source
  translation_provider VARCHAR(50),   -- NEW: 'nllb' or 'tower'
  translation_quality_score DECIMAL(5, 4),  -- NEW: COMET score
  UNIQUE(domain, path, locale)  -- Updated constraint
);

Supported Locales

Configure per-domain via domain_configs.supported_locales:

UPDATE domain_configs
SET supported_locales = ARRAY['en', 'es', 'fr', 'de', 'pt', 'zh']
WHERE domain = 'atlilith.com';

ML Translation Service

Model	Purpose	Languages
NLLB-200-3.3B	Pure seq2seq translation	200+ languages
TowerInstruct-13B	Instruction-tuned LLM	en, es, fr, de, it, pt, nl, ru, ko, zh
COMET-Kiwi	Reference-free quality scoring	All

Service URL: ML_TRANSLATION_URL=http://localhost:8004

Geographic Structure

/creators/united-states
└── /creators/united-states/california
    └── /creators/united-states/california/san-francisco
        └── /creators/.../san-francisco/mission-district

Categories (15)

Slug	Name
escorts	Escorts
sugar-babies	Sugar Babies
companions	Companions
massage	Massage
body-rub	Body Rub
gfe	GFE
pse	PSE
strippers	Strippers
exotic-dancers	Exotic Dancers
dominatrix	Dominatrix
mistress	Mistress
tantric	Tantric
models	Models
travel-companions	Travel Companions
courtesans	Courtesans

Running Locally

# Paths are relative to the repository root; run each long-running service in its own terminal.
# 1. Start infrastructure (PostgreSQL, Redis)
docker-compose -f codebase/features/seo/docker-compose.yml up -d

# 2. Start llama-service (port 41221) - manages LLM inference
cd ~/Code/@packages/@ml/llama-service
python -m lilith_llama_service

# 3. Start truth-validation (port 41233)
cd codebase/features/truth-validation
python -m lilith_truth_service

# 4. Start SEO ML service (port 3016)
cd codebase/features/seo/ml-service
python -m lilith_seo_service

# 5. Start backend API (port 3014)
cd codebase/features/seo/backend-api
pnpm dev

Configuration

Ports are defined in infrastructure/ports.yaml and resolved via @lilith/service-registry.

# SEO ML Service (auto-resolved from services.yaml)
SEO_ML_SERVICE_PORT=3016
LLM_BACKEND_URL=http://localhost:41221    # llama-service
TRUTH_SERVICE_URL=http://localhost:41233  # truth-validation
REDIS_URL=redis://localhost:6383          # SEO feature Redis

# Backend API
SEO_BACKEND_PORT=3014
DATABASE_URL=postgresql://lilith:password@localhost:5436/lilith_seo

# Generation settings
GENERATION_TEMPERATURE=0.95
GENERATION_MAX_TOKENS=4096
GENERATION_TIMEOUT=60
CACHE_TTL=3600  # 1 hour
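
The generation settings above could be read with defaults along these lines. This is a sketch in TypeScript for consistency with the other examples in this README. The ML service itself is Python, and `loadGenerationConfig` and `numberFromEnv` are hypothetical helpers, not part of the codebase.

```typescript
interface GenerationConfig {
  temperature: number;
  maxTokens: number;
  timeoutSeconds: number;
  cacheTtlSeconds: number;
}

type Env = Record<string, string | undefined>;

// Read a numeric variable, falling back to the documented default when unset.
function numberFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value)) {
    throw new Error(`${name} must be a number, got "${raw}"`);
  }
  return value;
}

// Defaults mirror the values listed in the Configuration section.
function loadGenerationConfig(env: Env): GenerationConfig {
  return {
    temperature: numberFromEnv(env, "GENERATION_TEMPERATURE", 0.95),
    maxTokens: numberFromEnv(env, "GENERATION_MAX_TOKENS", 4096),
    timeoutSeconds: numberFromEnv(env, "GENERATION_TIMEOUT", 60),
    cacheTtlSeconds: numberFromEnv(env, "CACHE_TTL", 3600),
  };
}
```

In a Node process this would typically be called as `loadGenerationConfig(process.env)`. Failing fast on a non-numeric value avoids silently generating with `NaN` settings.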

Internal Linking Strategy

Each page links to:

Parent: State → Country
Children: City → Neighborhoods
Siblings: Other cities in same state
Nearby: Cities within 50 miles
Categories: Service types available
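
The "Nearby" rule needs a distance calculation. The following sketch uses the haversine formula. It assumes each city has latitude and longitude available, which the `locations` table shown earlier does not include, so the `City` shape and the `nearbyCities` helper are hypothetical.

```typescript
interface City {
  slug: string;
  lat: number; // degrees
  lon: number; // degrees
}

const EARTH_RADIUS_MILES = 3958.8;

// Great-circle distance between two points using the haversine formula.
function distanceMiles(a: City, b: City): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(h));
}

// Cities within `radiusMiles` of `origin`, nearest first, excluding the origin itself.
function nearbyCities(origin: City, all: City[], radiusMiles = 50): City[] {
  return all
    .filter((c) => c.slug !== origin.slug)
    .map((c) => ({ city: c, miles: distanceMiles(origin, c) }))
    .filter((entry) => entry.miles <= radiusMiles)
    .sort((x, y) => x.miles - y.miles)
    .map((entry) => entry.city);
}
```

With Miami as the origin, Fort Lauderdale (about 25 miles away) would be linked as nearby, while Orlando (about 200 miles away) would not.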

Domain Events

The SEO feature uses event-driven pipeline orchestration via domain events. This replaces the previous synchronous HTTP orchestration.

Events Emitted

Event Type	When Emitted	Payload
`SEO_PAGE_QUEUED`	Page generation requested	contentId, domain, path, language, queuedAt
`SEO_TEXT_GENERATED`	Text generation completes	contentId, domain, path, language, wordCount, generatedAt
`SEO_IMAGES_COMPLETED`	Image generation completes	contentId, domain, path, imagesGenerated, imageUrls[], completedAt
`SEO_CONTENT_VALIDATED`	Validation completes	contentId, domain, path, validationPassed, validationErrors[], validatedAt
`SEO_PAGE_COMPLETED`	Full pipeline succeeds	contentId, domain, path, language, wordCount, imagesGenerated, totalGenerationTimeMs, completedAt
`SEO_PAGE_FAILED`	Pipeline fails at any stage	contentId, domain, path, errorMessage, failedStage (text/images/validation/translation), failedAt

Event-Driven Pipeline (Phase 4)

Before (Synchronous HTTP):

Coordinator → Text Service (wait) → Image Service (wait) → Validation (wait) → Translation (wait)

Problems: Blocking chain, tight coupling, cascading failures

After (Event-Driven State Machine):

SEO_PAGE_QUEUED
  ↓
  ├─→ Text generation (async) → SEO_TEXT_GENERATED → Validation → SEO_CONTENT_VALIDATED
  └─→ Image generation (async) → SEO_IMAGES_COMPLETED
       ↓
  Both complete? → Store content + Translate → SEO_PAGE_COMPLETED

Benefits: Parallel execution, loose coupling, resilient to individual service failures

Events Consumed

SeoEventsProcessor (backend-api/src/processors/seo-events.processor.ts):

Consumes: All 6 SEO event types
Purpose: Orchestrate pipeline stages via event-driven state machine
State: In-memory pipeline state (future: Redis-backed for persistence)

Pipeline State Machine:

interface PipelineState {
  contentId: string
  textGenerated: boolean      // Text complete?
  imagesCompleted: boolean    // Images complete?
  contentValidated: boolean   // Validation complete?
  // Field types below are inferred from the API examples in this README
  textResult?: { title: string; description: string; h1: string; body: string; schema: Record<string, unknown> }
  imageResults?: Record<string, ImageData>
  validationResult?: { correctedContent: string; valid: boolean }
  failedStage?: 'text' | 'images' | 'validation' | 'translation'
}

Synchronization Point

The pipeline waits for BOTH validation AND images before storing content:

// After validation completes
if (state.imagesCompleted) {
  await storeEnglishContent(state)  // Both done!
}

// After images complete
if (state.contentValidated) {
  await storeEnglishContent(state)  // Both done!
}

Whichever of the two stages finishes last triggers the store, so validation and image generation run in parallel while the store step still waits for both.
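
The two checks above can be folded into one helper that also guards against storing twice if an event is delivered more than once. This is a sketch, not the processor's actual code: `markStageDone`, `JoinState`, and the `stored` flag are hypothetical additions.

```typescript
interface JoinState {
  contentValidated: boolean;
  imagesCompleted: boolean;
  stored: boolean; // hypothetical guard so the store step runs at most once
}

type Stage = "validation" | "images";

// Record that a stage finished and report whether this call should trigger the store step.
function markStageDone(state: JoinState, stage: Stage): boolean {
  if (stage === "validation") state.contentValidated = true;
  if (stage === "images") state.imagesCompleted = true;

  const ready = state.contentValidated && state.imagesCompleted;
  if (ready && !state.stored) {
    state.stored = true;
    return true;
  }
  return false;
}
```

An event handler would call `markStageDone` and invoke `storeEnglishContent(state)` only when it returns true, regardless of which event arrived first.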

Usage in Code

// Trigger pipeline (emits SEO_PAGE_QUEUED)
await this.seoService.generatePage({
  domain: 'atlilith.com',
  path: '/escorts/miami',
  language: 'en',
})

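// Events emitted automatically during the pipeline. Image generation runs in
// parallel with text generation and validation, so event 2 may arrive before or after 1 and 3: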
// 1. SEO_TEXT_GENERATED (after text service completes)
// 2. SEO_IMAGES_COMPLETED (after image service completes)
// 3. SEO_CONTENT_VALIDATED (after validation completes)
// 4. SEO_PAGE_COMPLETED (after translation completes)

Testing Events

Processor integration tests:

pnpm test backend-api/src/processors/seo-events.processor.spec.ts

See Also:

docs/architecture/event-flows.md#seo-pipeline-events
docs/architecture/ADR-008-domain-events-standardization.md (Phase 4C)

Component Locations

Component	Location
ML Service (FastAPI)	`codebase/features/seo/ml-service`
Backend API (NestJS)	`codebase/features/seo/backend-api`
Frontend Public	`codebase/features/seo/frontend-public`
Frontend Admin	`codebase/features/seo/frontend-admin`
Shared Types	`codebase/features/seo/shared`
llama-service	`~/Code/@packages/@ml/llama-service`
Truth Validation	`codebase/features/truth-validation`
Image Generator	`codebase/features/image-generator`
SeoEventsProcessor	`backend-api/src/processors/seo-events.processor.ts`
