SEO Feature

Programmatic SEO content generation for atlilith.com with ML-powered text and images.

Overview

Generate thousands of SEO-optimized pages with:

  • LLM-generated content (1000-3000 words per page)
  • Truth validation against platform facts
  • AI-generated imagesets (6 masters → 17 derivatives)
  • Geographic hierarchy (Country → State → City → Neighborhood)

Architecture

┌──────────────────────────────────────────────────────────────────────────────────┐
│                              FULL SEO PIPELINE                                    │
├──────────────────────────────────────────────────────────────────────────────────┤
│                                                                                   │
│  Request: POST /api/seo/generate                                                  │
│       │                                                                           │
│       ▼                                                                           │
│  ┌─────────────────────────────────────────────────────────────────────────┐     │
│  │                      Backend API (NestJS)                                │     │
│  │                         Port 3014                                        │     │
│  │  • PostgreSQL cache (24h TTL)   • Manual overrides lookup                │     │
│  │  • Domain/page configuration    • Pipeline orchestration                 │     │
│  └─────────────────────────────────────────────────────────────────────────┘     │
│       │                           │                           │                   │
│       ▼                           ▼                           ▼                   │
│  ┌───────────────────┐   ┌──────────────────┐   ┌────────────────────────┐       │
│  │   ML Service      │   │  Truth Service   │   │   Image Generator      │       │
│  │   (seo:3016)      │──▶│  (truth:41233)   │   │   (img-gen:8002)       │       │
│  ├───────────────────┤   ├──────────────────┤   ├────────────────────────┤       │
│  │ • llama-service   │   │ • Platform facts │   │ • 6 AI masters         │       │
│  │   (port 41221)    │   │ • Auto-correct   │   │ • 17 derivatives       │       │
│  │ • model-boss GPU  │   │ • Terminology    │   │ • Same seed            │       │
│  │ • 1000-3000 words │   │ • 0% commission  │   │ • Aspect families      │       │
│  │ • YAML templates  │   │   = CRITICAL     │   │                        │       │
│  │ • Redis cache 1h  │   │                  │   │                        │       │
│  └───────────────────┘   └──────────────────┘   └────────────────────────┘       │
│                                   │                                               │
│                                   ▼                                               │
│                   ┌─────────────────────────────────────────┐                     │
│                   │        Translation Service               │                     │
│                   │         (ml-i18n:8004)                   │                     │
│                   ├─────────────────────────────────────────┤                     │
│                   │ • NLLB-200-3.3B (200+ languages)        │                     │
│                   │ • TowerInstruct-13B (instruction-tuned) │                     │
│                   │ • COMET-Kiwi (quality scoring)          │                     │
│                   │ • Redis cache (7-day TTL)               │                     │
│                   └─────────────────────────────────────────┘                     │
│                                   │                                               │
│                                   ▼                                               │
│  ┌─────────────────────────────────────────────────────────────────────────┐     │
│  │                           Storage Layer                                  │     │
│  ├─────────────────────────────────┬───────────────────────────────────────┤     │
│  │         PostgreSQL              │               Redis                    │     │
│  │         Port 5436               │             Port 6383                  │     │
│  │  • Generated content            │  • ML service cache (1h TTL)           │     │
│  │  • Domain/page configs          │  • Translation cache (7-day TTL)       │     │
│  │  • Validation results           │  • Hit/miss statistics                 │     │
│  │  • Translation metadata         │                                        │     │
│  └─────────────────────────────────┴───────────────────────────────────────┘     │
│                                                                                   │
└───────────────────────────────────────────────────────────────────────────────────┘


                         ML SERVICE INTERNAL ARCHITECTURE
┌───────────────────────────────────────────────────────────────────────────────────┐
│                         SEO ML Service (Port 3016)                                 │
├───────────────────────────────────────────────────────────────────────────────────┤
│                                                                                    │
│  ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐              │
│  │  Redis Cache    │     │ Template Loader │     │ Schema Generator│              │
│  │  (1h TTL)       │     │ (YAML prompts)  │     │ (JSON-LD)       │              │
│  │  Port 6383      │     │                 │     │                 │              │
│  └────────┬────────┘     └────────┬────────┘     └─────────────────┘              │
│           │                       │                                                │
│           └───────────┬───────────┘                                                │
│                       ▼                                                            │
│              ┌─────────────────────────────────────────────────┐                  │
│              │              SEO Generator                       │                  │
│              │  Pipeline: cache → template → LLM → validate     │                  │
│              └───────────────────┬─────────────────────────────┘                  │
│                                  │                                                 │
│       ┌──────────────────────────┴──────────────────────────┐                     │
│       │                                                      │                     │
│       ▼                                                      ▼                     │
│  ┌─────────────────────────────────┐    ┌─────────────────────────────────┐       │
│  │         LLM Client              │    │       Truth Client              │       │
│  │         → llama-service:41221   │    │ → knowledge-verification:41233  │       │
│  ├─────────────────────────────────┤    ├─────────────────────────────────┤       │
│  │ • Centralized LLM inference     │    │ • Platform facts validation     │       │
│  │ • model-boss GPU/VRAM mgmt      │    │ • Auto-correction               │       │
│  │ • High creativity (temp 0.95)   │    │ • Terminology enforcement       │       │
│  └─────────────────────────────────┘    └─────────────────────────────────┘       │
│                                                                                    │
└────────────────────────────────────────────────────────────────────────────────────┘


                              DATA FLOW (Single Request)
┌───────────────────────────────────────────────────────────────────────────────────┐
│                                                                                    │
│  1. POST /api/seo/generate (backend-api:3014)                                      │
│     │                                                                              │
│  2. ├─ Check PostgreSQL cache (24h TTL)                                            │
│     │   └─ HIT: Return cached SEOMetadata                                          │
│     │                                                                              │
│  3. ├─ Check manual overrides (DomainConfig/PageConfig)                            │
│     │   └─ FOUND: Build from overrides, cache, return                              │
│     │                                                                              │
│  4. └─ MISS: Call ML Service (ml-service:3016)                                     │
│              │                                                                     │
│  5.          ├─ Check Redis cache (1h TTL)                                         │
│              │   └─ HIT: Return cached response                                    │
│              │                                                                     │
│  6.          ├─ Load YAML template for page_type                                   │
│              │   └─ prompts/locale-templates/{page_type}.prompt.yaml               │
│              │                                                                     │
│  7.          ├─ Build prompt (system + user) from template + context               │
│              │                                                                     │
│  8.          ├─ Call LLMClient.generate() → llama-service:41221                    │
│              │   └─ POST /chat { messages, system_prompt, max_tokens, temp }       │
│              │                                                                     │
│  9.          ├─ Parse JSON response into SEOMetadata                               │
│              │                                                                     │
│  10.         ├─ (Optional) TruthClient.validate() → knowledge-verification:41233   │
│              │   └─ POST /api/truth/validate { content, auto_correct: true }       │
│              │                                                                     │
│  11.         ├─ SchemaGenerator.generate() → Schema.org JSON-LD                    │
│              │                                                                     │
│  12.         ├─ Cache to Redis                                                     │
│              │                                                                     │
│  13.         └─ Return SEOGenerateResponse                                         │
│                                                                                    │
│  14. (If generate_images=true) Call Image Generator (img-gen:8002)                 │
│      └─ Generate 6 AI masters → 17 derivatives                                     │
│                                                                                    │
│  15. (If multi-locale) Call Translation Service (ml-i18n:8004)                     │
│      └─ NLLB + TowerInstruct → COMET-Kiwi selects best                             │
│                                                                                    │
└────────────────────────────────────────────────────────────────────────────────────┘
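The lookup order in steps 2 to 4 amounts to a fallthrough over three sources. The following is a minimal sketch of that order only, not the backend's actual code: `resolveSeo`, `SeoSourceSketch`, and the in-memory stand-ins are hypothetical, and the write-back to the cache after an override or ML hit is omitted.

```typescript
// Hypothetical sketch of the lookup order in steps 2-4; none of these names
// come from the codebase, and the sources are in-memory stand-ins.
type SeoMetadataSketch = { title: string; description: string };
type SourceLabel = "postgres-cache" | "manual-override" | "ml-service";

interface SeoSourceSketch {
  label: SourceLabel;
  lookup: (path: string) => SeoMetadataSketch | undefined;
}

// Walk the sources in order and return the first hit together with its origin.
function resolveSeo(
  path: string,
  sources: SeoSourceSketch[],
): { from: SourceLabel; metadata: SeoMetadataSketch } | undefined {
  for (const source of sources) {
    const metadata = source.lookup(path);
    if (metadata !== undefined) {
      return { from: source.label, metadata };
    }
  }
  return undefined;
}

const demoSources: SeoSourceSketch[] = [
  { label: "postgres-cache", lookup: () => undefined }, // 24h cache miss
  { label: "manual-override", lookup: () => undefined }, // no DomainConfig/PageConfig override
  {
    label: "ml-service",
    lookup: (path) => ({ title: `Generated for ${path}`, description: "placeholder" }),
  },
];

const resolved = resolveSeo("/escorts/miami", demoSources);
```

With both earlier sources missing, `resolved.from` is `"ml-service"`; putting a hit in either earlier source short-circuits the ML call.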

Services

| Service | Port | Location | Purpose |
|---|---|---|---|
| backend-api | 3014 | features/seo/backend-api | NestJS pipeline coordinator |
| ml-service | 3016 | features/seo/ml-service | FastAPI text generation |
| llama-service | 41221 | @ml/llama-service | Centralized LLM inference |
| knowledge-verification | 41233 | features/knowledge-verification | Platform facts validation |
| postgresql | 5436 | Docker | Content storage |
| redis | 6383 | Docker | ML service cache |
| frontend-public | 4003 | features/seo/frontend-public | Programmatic pages |
| frontend-admin | 4004 | features/seo/frontend-admin | Content management |

Text Generation

Uses llama-service (centralized LLM infrastructure) with model-boss for GPU/VRAM management.

Generation Parameters

  • LLM Backend: llama-service (port 41221)
  • Temperature: 0.95 (high creativity for unique content)
  • Max tokens: 4,096
  • Timeout: 60 seconds
  • Target: 1000-3000 words per page
  • GPU Management: model-boss coordinates VRAM allocation
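As an illustration of how these parameters could map onto the `/chat` payload shown in the data flow, here is a minimal sketch. The field names follow the `{ messages, system_prompt, max_tokens, temp }` shorthand from that diagram; their exact names and types in llama-service are assumptions, and `buildChatRequest` is a hypothetical helper.

```typescript
// Hypothetical request builder. Field names mirror the shorthand in the data
// flow diagram; the real llama-service schema may differ (e.g. "temperature").
interface ChatRequestSketch {
  messages: { role: "user"; content: string }[];
  system_prompt: string;
  max_tokens: number;
  temp: number;
}

// Values taken from the Generation Parameters list above.
const GENERATION_DEFAULTS = {
  temperature: 0.95,
  maxTokens: 4096,
  timeoutMs: 60000, // 60 seconds; would be handed to the HTTP client, not sent in the body
} as const;

function buildChatRequest(systemPrompt: string, userPrompt: string): ChatRequestSketch {
  return {
    messages: [{ role: "user", content: userPrompt }],
    system_prompt: systemPrompt,
    max_tokens: GENERATION_DEFAULTS.maxTokens,
    temp: GENERATION_DEFAULTS.temperature,
  };
}

const exampleChatRequest = buildChatRequest(
  "You write SEO copy for atlilith.com.",
  "Write the page for escorts in Miami.",
);
```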

API Request

POST /api/seo/generate

{
  "page_type": "escorts",
  "locale": "en",
  "context": {
    "city": "Miami",
    "category": "escorts"
  },
  "generate_full_content": true,
  "generate_images": true,
  "run_validation": true
}

Response Structure

{
  "metadata": {
    "title": "Premium Escorts in Miami | Verified Providers",
    "description": "Find verified escort services in Miami...",
    "og_image": "https://cdn.atlilith.com/seo/escorts-miami/facebook-og.jpg"
  },
  "content": {
    "h1": "Discover Elite Escorts in Miami",
    "body": "<div class='seo-content'>...(1000-3000 words)...</div>",
    "word_count": 2534,
    "schema_json": { "@type": "LocalBusiness", ... }
  },
  "imageset": {
    "variation_id": "abc123",
    "derivatives": [...]
  },
  "validation": {
    "valid": true,
    "corrections": []
  }
}
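For consumers of this endpoint, the response can be given a rough type. The sketch below is inferred from the example above only: which fields are optional is an assumption, and `meetsTargets` is a hypothetical helper that checks the 1000-3000 word target and the validation flag.

```typescript
// Types inferred from the example response; optionality is an assumption.
interface SeoGenerateResponseSketch {
  metadata: { title: string; description: string; og_image?: string };
  content: {
    h1: string;
    body: string;
    word_count: number;
    schema_json: Record<string, unknown>;
  };
  imageset?: { variation_id: string; derivatives: unknown[] };
  validation?: { valid: boolean; corrections: unknown[] };
}

// True when the body is within the 1000-3000 word target and validation,
// if it ran, reported the content as valid.
function meetsTargets(res: SeoGenerateResponseSketch): boolean {
  const inRange = res.content.word_count >= 1000 && res.content.word_count <= 3000;
  const validated = res.validation === undefined || res.validation.valid;
  return inRange && validated;
}

const exampleResponse: SeoGenerateResponseSketch = {
  metadata: {
    title: "Premium Escorts in Miami | Verified Providers",
    description: "Find verified escort services in Miami...",
  },
  content: {
    h1: "Discover Elite Escorts in Miami",
    body: "<div class='seo-content'>...</div>",
    word_count: 2534,
    schema_json: { "@type": "LocalBusiness" },
  },
  validation: { valid: true, corrections: [] },
};

const exampleMeetsTargets = meetsTargets(exampleResponse);
```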

Truth Validation

Ensures generated content is factually accurate about the platform.

Platform Facts (Source of Truth)

| Fact | Value | Severity if Wrong |
|---|---|---|
| Commission rate | 0% | CRITICAL |
| Creator take rate | 100% | CRITICAL |
| OnlyFans commission | 20% | HIGH |
| Chaturbate commission | 40-50% | HIGH |
| Jurisdiction | Iceland | MEDIUM |
| Privacy framework | GDPR | MEDIUM |

Terminology Rules

| Forbidden | Preferred |
|---|---|
| hooker, whore | sex worker |
| john, trick | client |

Validation API

POST http://localhost:41233/api/truth/validate

{
  "content": "Creators keep 85% of earnings...",
  "auto_correct": true
}

Response:

{
  "is_valid": false,
  "issues": [
    {
      "rule_id": "econ-take-rate-wrong",
      "severity": "critical",
      "message": "Incorrect take rate: Creators keep 100% of earnings",
      "correction": "creators keep 100%"
    }
  ],
  "corrected_content": "Creators keep 100% of earnings..."
}
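To make the request/response pair concrete, the take-rate rule could look roughly like the sketch below. This is an illustration, not the truth service's implementation: the regular expression and `checkTakeRate` are hypothetical, and only the rule ID, severity, message, and correction strings are taken from the example response.

```typescript
// Shapes copied from the example response above.
interface TruthIssueSketch {
  rule_id: string;
  severity: "critical" | "high" | "medium";
  message: string;
  correction: string;
}

interface TruthResultSketch {
  is_valid: boolean;
  issues: TruthIssueSketch[];
  corrected_content: string;
}

// Hypothetical pattern: matches phrases such as "Creators keep 85%".
const TAKE_RATE_PATTERN = /(creators keep )(\d+(?:\.\d+)?)%/gi;

// Flags any stated creator take rate other than 100% and rewrites it.
function checkTakeRate(content: string): TruthResultSketch {
  const issues: TruthIssueSketch[] = [];
  const corrected_content = content.replace(
    TAKE_RATE_PATTERN,
    (match: string, prefix: string, rate: string) => {
      if (Number(rate) === 100) return match; // already correct, leave untouched
      issues.push({
        rule_id: "econ-take-rate-wrong",
        severity: "critical",
        message: "Incorrect take rate: Creators keep 100% of earnings",
        correction: "creators keep 100%",
      });
      return `${prefix}100%`;
    },
  );
  return { is_valid: issues.length === 0, issues, corrected_content };
}

const takeRateResult = checkTakeRate("Creators keep 85% of earnings...");
```

A real rule set would need to cover paraphrases ("take home", "retain") that a single pattern like this misses.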

Image Generation

Delegates to features/image-generator, using an aspect-family strategy.

Aspect Families (6 Masters)

| Family | Dimensions | Safe Zone | Derivatives |
|---|---|---|---|
| og | 1200×675 | 70% | facebook-og, twitter-large, linkedin-share, twitter-small |
| hero | 1536×768 | 60% | hero-full, hero-md, twitter-post |
| square | 1200×1200 | 70% | square-lg, square-md, square-sm, square-xs, thumbnail |
| portrait | 1080×1350 | 80% | portrait-feed, portrait-sm |
| story | 1080×1920 | 80% | story-full, story-sm |
| header | 1584×396 | 70% | linkedin-cover, header-wide |

Total: 6 AI generations → 17 derivatives

All derivatives are cropped from masters with subjects kept within safe zones.
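One way to read the safe-zone percentages is as a centered rectangle covering that share of each dimension of the master. That interpretation is an assumption (the image generator may define safe zones differently), and `safeZoneRect` below is a hypothetical helper that only illustrates the arithmetic.

```typescript
interface AspectFamilySketch {
  family: string;
  width: number;
  height: number;
  safeZonePct: number; // percentage from the table above
}

const ASPECT_FAMILIES: AspectFamilySketch[] = [
  { family: "og", width: 1200, height: 675, safeZonePct: 70 },
  { family: "hero", width: 1536, height: 768, safeZonePct: 60 },
  { family: "square", width: 1200, height: 1200, safeZonePct: 70 },
  { family: "portrait", width: 1080, height: 1350, safeZonePct: 80 },
  { family: "story", width: 1080, height: 1920, safeZonePct: 80 },
  { family: "header", width: 1584, height: 396, safeZonePct: 70 },
];

// Assumed interpretation: a centered rectangle spanning safeZonePct percent
// of the master's width and height, rounded to whole pixels.
function safeZoneRect(f: AspectFamilySketch) {
  const width = Math.round((f.width * f.safeZonePct) / 100);
  const height = Math.round((f.height * f.safeZonePct) / 100);
  return {
    family: f.family,
    x: Math.round((f.width - width) / 2),
    y: Math.round((f.height - height) / 2),
    width,
    height,
  };
}

const safeZones = ASPECT_FAMILIES.map(safeZoneRect);
```

Under this reading, a derivative crop is acceptable when it fully contains the master's safe-zone rectangle.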


Database

PostgreSQL, with data stored under /mnt/bigdisk/_/lilith-platform.

Key Tables

-- Generated SEO content
CREATE TABLE seo_content (
  id UUID PRIMARY KEY,
  domain VARCHAR(255),      -- "atlilith.com"
  path VARCHAR(500),        -- "/escorts/miami"
  title VARCHAR(100),
  h1 VARCHAR(100),
  body TEXT,                -- 1000-3000 words HTML
  status VARCHAR(50),       -- draft → review → published → indexed → archived
  word_count INTEGER,
  seo_score INTEGER
);

-- Geographic hierarchy
CREATE TABLE locations (
  id UUID PRIMARY KEY,
  slug VARCHAR(255),        -- "miami"
  name VARCHAR(255),        -- "Miami"
  location_type VARCHAR(50) -- country, state, city, neighborhood
);

-- 15 service categories
CREATE TABLE service_categories (
  slug VARCHAR(100) PRIMARY KEY,  -- "escorts", "massage", "gfe"
  name VARCHAR(255)
);

-- Generated images
CREATE TABLE generated_images (
  id UUID PRIMARY KEY,
  prompt TEXT,
  layout VARCHAR(50),       -- og, hero, square, portrait, story, header
  image_path VARCHAR(500),
  deployed BOOLEAN
);

Content Workflow

draft → review → published → indexed → archived
  │        │         │          │          │
  │        │         │          │          └── Removed from sitemap
  │        │         │          └───────────── Indexed by search engines
  │        │         └──────────────────────── Live + in sitemap
  │        └────────────────────────────────── Pending human review
  └─────────────────────────────────────────── Initial ML generation
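The status order in the workflow above can be enforced with a small transition table. This is a sketch under the assumption that only forward, single-step moves are allowed; whether the real system permits rejections (review back to draft) or skipped steps is not stated here, and `canTransition` is a hypothetical helper.

```typescript
type ContentStatus = "draft" | "review" | "published" | "indexed" | "archived";

// Forward-only transitions read off the workflow line; null marks the end state.
const NEXT_STATUS: Record<ContentStatus, ContentStatus | null> = {
  draft: "review",
  review: "published",
  published: "indexed",
  indexed: "archived",
  archived: null,
};

// True only for the single forward step allowed from `from`.
function canTransition(from: ContentStatus, to: ContentStatus): boolean {
  return NEXT_STATUS[from] === to;
}

const draftToReview = canTransition("draft", "review");
const draftToPublished = canTransition("draft", "published");
```

Here `draftToReview` is `true` and `draftToPublished` is `false`, because publishing without human review skips a step.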

Multi-Language Support (i18n)

SEO content is generated in English first, then eagerly translated to all supported locales.

Translation Pipeline

┌────────────────────────────────────────────────────────────────────────────────┐
│                         TRANSLATION PIPELINE                                    │
├────────────────────────────────────────────────────────────────────────────────┤
│                                                                                │
│  1. Generate English                                                           │
│     └─→ ML Text Service (ministral-3:3b)                                       │
│                                                                                │
│  2. Truth Validation                                                           │
│     └─→ Validate against platform facts + auto-correct                         │
│                                                                                │
│  3. Store English Content                                                      │
│     └─→ seo_content (locale='en')                                              │
│                                                                                │
│  4. Translate to All Locales (parallel)                                        │
│     └─→ Consensus API (/translate/consensus/)                                  │
│         ├── NLLB-200-3.3B (purpose-built translation)                          │
│         ├── TowerInstruct-13B (instruction-tuned LLM)                          │
│         └── COMET-Kiwi (quality scoring)                                       │
│                                                                                │
│  5. Store Translated Content                                                   │
│     └─→ seo_content (locale='es', 'fr', etc.)                                  │
│         ├── sourceContentId → links to English source                          │
│         ├── translationProvider → 'nllb' or 'tower' (winner)                   │
│         └── translationQualityScore → COMET score (0-1)                        │
│                                                                                │
└────────────────────────────────────────────────────────────────────────────────┘

Translation API

The consensus translation API runs both NLLB and TowerInstruct in parallel, then uses COMET-Kiwi to select the higher-quality translation.

POST http://localhost:8004/translate/consensus/

{
  "text": "Find verified escort services in Miami...",
  "source_language": "en",
  "target_language": "es",
  "content_type": "marketing"
}

Response:

{
  "translated_text": "Encuentre servicios de escorts verificados en Miami...",
  "winner": "tower",
  "winning_score": 0.923,
  "margin": 0.015,
  "total_time_ms": 1234,
  "cache_hit": false
}
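The selection step can be pictured as a comparison of two scored candidates. The sketch below assumes that `margin` is the difference between the winning and losing COMET-Kiwi scores and that ties go to the first candidate; both are assumptions, and `pickWinner` is a hypothetical helper rather than the service's code.

```typescript
interface CandidateSketch {
  provider: "nllb" | "tower";
  text: string;
  score: number; // COMET-Kiwi quality score in the 0-1 range
}

// Pick the higher-scoring candidate; on a tie the first argument wins.
function pickWinner(a: CandidateSketch, b: CandidateSketch) {
  const winner = a.score >= b.score ? a : b;
  const loser = winner === a ? b : a;
  return {
    translated_text: winner.text,
    winner: winner.provider,
    winning_score: winner.score,
    // Assumed definition: score gap, rounded to three decimals.
    margin: Number((winner.score - loser.score).toFixed(3)),
  };
}

const consensus = pickWinner(
  { provider: "nllb", text: "Encuentra servicios de escorts verificados en Miami...", score: 0.908 },
  { provider: "tower", text: "Encuentre servicios de escorts verificados en Miami...", score: 0.923 },
);
```

With these illustrative scores the result matches the example response: `winner` is `"tower"` with a `margin` of `0.015`.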

Content Types for Style Guidance

| Content Type | Description | Usage |
|---|---|---|
| marketing | SEO/advertising copy | Default for SEO content |
| tagline | Marketing slogans | Punchy gerunds over infinitives |
| ui | UI text | Brief, action-oriented |
| legal | Legal/formal text | Precise, formal language |
| product_name | Product names | Concise, marketable |
| product_description | Product descriptions | Clear, benefit-focused |

Redis Caching

Translations are cached in Redis with a 7-day TTL:

  • Key pattern: consensus:{sha256_hash}:{target_lang}:{content_type}
  • Cache hit: Skips ML pipeline entirely (~1ms response)
  • Cache miss: Full pipeline (~2-5s response)
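A key following this pattern could be built as in the sketch below. It assumes the hash covers the UTF-8 source text only (the service may also hash the source language or normalise whitespace first), and `consensusCacheKey` is a hypothetical helper.

```typescript
import { createHash } from "node:crypto";

const SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60; // 7-day TTL

// Builds consensus:{sha256_hash}:{target_lang}:{content_type}.
// Assumption: the hash input is the raw source text.
function consensusCacheKey(text: string, targetLang: string, contentType: string): string {
  const hash = createHash("sha256").update(text, "utf8").digest("hex");
  return `consensus:${hash}:${targetLang}:${contentType}`;
}

const exampleKey = consensusCacheKey(
  "Find verified escort services in Miami...",
  "es",
  "marketing",
);
```

Because the text is hashed, identical source strings share a cache entry per target language and content type, regardless of which page they appear on.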

Database Schema

-- SEO content now includes locale and translation metadata
CREATE TABLE seo_content (
  id UUID PRIMARY KEY,
  domain VARCHAR(255),
  path VARCHAR(500),
  locale VARCHAR(10) NOT NULL DEFAULT 'en',  -- NEW
  title VARCHAR(100),
  h1 VARCHAR(100),
  body TEXT,
  source_content_id UUID REFERENCES seo_content(id),  -- NEW: English source
  translation_provider VARCHAR(50),   -- NEW: 'nllb' or 'tower'
  translation_quality_score DECIMAL(5, 4),  -- NEW: COMET score
  UNIQUE(domain, path, locale)  -- Updated constraint
);

Supported Locales

Configure per-domain via domain_configs.supported_locales:

UPDATE domain_configs
SET supported_locales = ARRAY['en', 'es', 'fr', 'de', 'pt', 'zh']
WHERE domain = 'atlilith.com';

ML Translation Service

| Model | Purpose | Languages |
|---|---|---|
| NLLB-200-3.3B | Pure seq2seq translation | 200+ languages |
| TowerInstruct-13B | Instruction-tuned LLM | en, es, fr, de, it, pt, nl, ru, ko, zh |
| COMET-Kiwi | Reference-free quality scoring | All |

Service URL: ML_TRANSLATION_URL=http://localhost:8004


Geographic Structure

/creators/united-states
└── /creators/united-states/california
    └── /creators/united-states/california/san-francisco
        └── /creators/.../san-francisco/mission-district
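The URL for any level of the hierarchy follows from walking up its parents. A minimal sketch, assuming each location holds a reference to its parent (the `locations` table shown earlier does not list a parent column, so this shape is hypothetical, as is `geoPath`):

```typescript
// Hypothetical in-memory shape: each node points at its parent location.
interface GeoNodeSketch {
  slug: string;
  parent?: GeoNodeSketch;
}

// Collect slugs from the node up to the country, then join them under the prefix.
function geoPath(node: GeoNodeSketch, prefix: string = "/creators"): string {
  const slugs: string[] = [];
  for (let cur: GeoNodeSketch | undefined = node; cur !== undefined; cur = cur.parent) {
    slugs.unshift(cur.slug);
  }
  return [prefix, ...slugs].join("/");
}

const unitedStates: GeoNodeSketch = { slug: "united-states" };
const california: GeoNodeSketch = { slug: "california", parent: unitedStates };
const sanFrancisco: GeoNodeSketch = { slug: "san-francisco", parent: california };
const missionDistrict: GeoNodeSketch = { slug: "mission-district", parent: sanFrancisco };

const missionPath = geoPath(missionDistrict);
```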

Categories (15)

| Slug | Name |
|---|---|
| escorts | Escorts |
| sugar-babies | Sugar Babies |
| companions | Companions |
| massage | Massage |
| body-rub | Body Rub |
| gfe | GFE |
| pse | PSE |
| strippers | Strippers |
| exotic-dancers | Exotic Dancers |
| dominatrix | Dominatrix |
| mistress | Mistress |
| tantric | Tantric |
| models | Models |
| travel-companions | Travel Companions |
| courtesans | Courtesans |

Running Locally

# 1. Start infrastructure (PostgreSQL, Redis)
docker-compose -f codebase/features/seo/docker-compose.yml up -d

# 2. Start llama-service (port 41221) - manages LLM inference
cd ~/Code/@packages/@ml/llama-service
python -m lilith_llama_service

# 3. Start knowledge-verification (port 41233)
cd codebase/features/knowledge-verification
python -m lilith_truth_service

# 4. Start SEO ML service (port 3016)
cd codebase/features/seo/ml-service
python -m lilith_seo_service

# 5. Start backend API (port 3014)
cd codebase/features/seo/backend-api
pnpm dev

Configuration

Ports are defined in infrastructure/ports.yaml and resolved via @lilith/service-registry.

# SEO ML Service (auto-resolved from services.yaml)
SEO_ML_SERVICE_PORT=3016
LLM_BACKEND_URL=http://localhost:41221    # llama-service
TRUTH_SERVICE_URL=http://localhost:41233  # knowledge-verification
REDIS_URL=redis://localhost:6383          # SEO feature Redis

# Backend API
SEO_BACKEND_PORT=3014
DATABASE_URL=postgresql://lilith:password@localhost:5436/lilith_seo

# Generation settings
GENERATION_TEMPERATURE=0.95
GENERATION_MAX_TOKENS=4096
GENERATION_TIMEOUT=60
CACHE_TTL=3600  # 1 hour
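On the TypeScript side, the generation settings above could be read with defaults that match this README. The sketch below is an assumption about how a consumer might do it: `loadGenerationConfig` and `numberFromEnv` are hypothetical helpers, not part of `@lilith/service-registry`.

```typescript
type EnvSketch = Record<string, string | undefined>;

// Read a numeric variable, falling back when it is unset or empty.
function numberFromEnv(env: EnvSketch, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (Number.isNaN(value)) {
    throw new Error(`${key} must be numeric, got "${raw}"`);
  }
  return value;
}

// Defaults mirror the values documented above.
function loadGenerationConfig(env: EnvSketch) {
  return {
    temperature: numberFromEnv(env, "GENERATION_TEMPERATURE", 0.95),
    maxTokens: numberFromEnv(env, "GENERATION_MAX_TOKENS", 4096),
    timeoutSeconds: numberFromEnv(env, "GENERATION_TIMEOUT", 60),
    cacheTtlSeconds: numberFromEnv(env, "CACHE_TTL", 3600),
  };
}

const defaultGenerationConfig = loadGenerationConfig({});
const tunedGenerationConfig = loadGenerationConfig({ GENERATION_TEMPERATURE: "0.7" });
```

In a Node process the argument would typically be `process.env`; failing fast on non-numeric values keeps a typo from silently changing the temperature.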

Internal Linking Strategy

Each page links to:

  1. Parent: State → Country
  2. Children: City → Neighborhoods
  3. Siblings: Other cities in same state
  4. Nearby: Cities within 50 miles
  5. Categories: Service types available
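The "Nearby" rule needs a distance check between cities. One common way to do this is the haversine great-circle formula, sketched below; whether the platform uses haversine or a database-side geo query is not stated, and the coordinates are approximate, illustrative values.

```typescript
interface CityPointSketch {
  slug: string;
  lat: number; // degrees
  lon: number; // degrees
}

const EARTH_RADIUS_MILES = 3958.8;

// Great-circle distance between two points using the haversine formula.
function haversineMiles(a: CityPointSketch, b: CityPointSketch): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) * Math.sin(dLat / 2) +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) * Math.sin(dLon / 2);
  return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(h));
}

// Cities other than `from` that lie within the radius (50 miles by default).
function nearbyCities(
  from: CityPointSketch,
  candidates: CityPointSketch[],
  radiusMiles: number = 50,
): CityPointSketch[] {
  return candidates.filter(
    (c) => c.slug !== from.slug && haversineMiles(from, c) <= radiusMiles,
  );
}

// Approximate coordinates, for illustration only.
const miamiPoint: CityPointSketch = { slug: "miami", lat: 25.7617, lon: -80.1918 };
const demoCities: CityPointSketch[] = [
  { slug: "fort-lauderdale", lat: 26.1224, lon: -80.1373 },
  { slug: "orlando", lat: 28.5383, lon: -81.3792 },
];

const nearMiami = nearbyCities(miamiPoint, demoCities).map((c) => c.slug);
```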

Domain Events

The SEO feature uses event-driven pipeline orchestration via domain events. This replaces the previous synchronous HTTP orchestration.

Events Emitted

| Event Type | When Emitted | Payload |
|---|---|---|
| SEO_PAGE_QUEUED | Page generation requested | contentId, domain, path, language, queuedAt |
| SEO_TEXT_GENERATED | Text generation completes | contentId, domain, path, language, wordCount, generatedAt |
| SEO_IMAGES_COMPLETED | Image generation completes | contentId, domain, path, imagesGenerated, imageUrls[], completedAt |
| SEO_CONTENT_VALIDATED | Validation completes | contentId, domain, path, validationPassed, validationErrors[], validatedAt |
| SEO_PAGE_COMPLETED | Full pipeline succeeds | contentId, domain, path, language, wordCount, imagesGenerated, totalGenerationTimeMs, completedAt |
| SEO_PAGE_FAILED | Pipeline fails at any stage | contentId, domain, path, errorMessage, failedStage (text/images/validation/translation), failedAt |
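The six payloads map naturally onto a discriminated union. The sketch below takes the field names from the events listed above; the field types are assumptions (timestamps as ISO strings, `imagesGenerated` as a count, errors as strings), and the `type` discriminator and `isTerminal` helper are hypothetical.

```typescript
type FailedStage = "text" | "images" | "validation" | "translation";

// Fields shared by every event payload.
interface SeoEventBase {
  contentId: string;
  domain: string;
  path: string;
}

// Field names come from the payload listing; the types are assumed.
type SeoEventSketch =
  | (SeoEventBase & { type: "SEO_PAGE_QUEUED"; language: string; queuedAt: string })
  | (SeoEventBase & { type: "SEO_TEXT_GENERATED"; language: string; wordCount: number; generatedAt: string })
  | (SeoEventBase & { type: "SEO_IMAGES_COMPLETED"; imagesGenerated: number; imageUrls: string[]; completedAt: string })
  | (SeoEventBase & { type: "SEO_CONTENT_VALIDATED"; validationPassed: boolean; validationErrors: string[]; validatedAt: string })
  | (SeoEventBase & {
      type: "SEO_PAGE_COMPLETED";
      language: string;
      wordCount: number;
      imagesGenerated: number;
      totalGenerationTimeMs: number;
      completedAt: string;
    })
  | (SeoEventBase & { type: "SEO_PAGE_FAILED"; errorMessage: string; failedStage: FailedStage; failedAt: string });

// A pipeline run ends with either a completed or a failed event.
function isTerminal(evt: SeoEventSketch): boolean {
  return evt.type === "SEO_PAGE_COMPLETED" || evt.type === "SEO_PAGE_FAILED";
}

const queuedEvent: SeoEventSketch = {
  type: "SEO_PAGE_QUEUED",
  contentId: "abc123",
  domain: "atlilith.com",
  path: "/escorts/miami",
  language: "en",
  queuedAt: "2026-01-01T00:00:00Z",
};

const queuedIsTerminal = isTerminal(queuedEvent);
```

Narrowing on `type` gives handlers compile-time access to exactly the fields each event carries.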

Event-Driven Pipeline (Phase 4)

Before (Synchronous HTTP):

Coordinator → Text Service (wait) → Image Service (wait) → Validation (wait) → Translation (wait)

Problems: Blocking chain, tight coupling, cascading failures

After (Event-Driven State Machine):

SEO_PAGE_QUEUED
  ↓
  ├─→ Text generation (async) → SEO_TEXT_GENERATED → Validation → SEO_CONTENT_VALIDATED
  └─→ Image generation (async) → SEO_IMAGES_COMPLETED
       ↓
  Both complete? → Store content + Translate → SEO_PAGE_COMPLETED

Benefits: Parallel execution, loose coupling, resilient to individual service failures

Events Consumed

SeoEventsProcessor (backend-api/src/processors/seo-events.processor.ts):

  • Consumes: All 6 SEO event types
  • Purpose: Orchestrate pipeline stages via event-driven state machine
  • State: In-memory pipeline state (future: Redis-backed for persistence)

Pipeline State Machine:

// Nested field types are inferred from the API examples in this README and may differ from the source.
interface PipelineState {
  contentId: string
  textGenerated: boolean      // Text complete?
  imagesCompleted: boolean    // Images complete?
  contentValidated: boolean   // Validation complete?
  textResult?: { title: string; description: string; h1: string; body: string; schema: Record<string, unknown> }
  imageResults?: Record<string, ImageData>
  validationResult?: { correctedContent: string; valid: boolean }
  failedStage?: 'text' | 'images' | 'validation' | 'translation'
}

Synchronization Point

The pipeline waits for BOTH validation AND images before storing content:

// After validation completes
if (state.imagesCompleted) {
  await storeEnglishContent(state)  // Both done!
}

// After images complete
if (state.contentValidated) {
  await storeEnglishContent(state)  // Both done!
}

This lets text generation and image generation run in parallel while ensuring that content is stored only once both validation and images have finished.
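The two checks can be folded into a single join helper that also sets the flags and guards against storing twice. This is a sketch of the pattern, not the processor's code: `markDone`, `JoinStateSketch`, and the `stored` flag are hypothetical additions.

```typescript
interface JoinStateSketch {
  contentId: string;
  contentValidated: boolean;
  imagesCompleted: boolean;
  stored: boolean; // hypothetical guard against duplicate events
}

type FinishedBranch = "validation" | "images";

// Record that one branch finished. Returns true exactly once: when the second
// branch lands and the content has not been stored yet.
function markDone(state: JoinStateSketch, branch: FinishedBranch): boolean {
  if (branch === "validation") {
    state.contentValidated = true;
  } else {
    state.imagesCompleted = true;
  }
  const ready = state.contentValidated && state.imagesCompleted && !state.stored;
  if (ready) {
    state.stored = true;
  }
  return ready;
}

const joinState: JoinStateSketch = {
  contentId: "abc123",
  contentValidated: false,
  imagesCompleted: false,
  stored: false,
};

const afterImages = markDone(joinState, "images"); // validation still pending
const afterValidation = markDone(joinState, "validation"); // both branches done
const afterDuplicate = markDone(joinState, "validation"); // redelivered event
```

Only `afterValidation` is `true`, so the caller would store the English content once, whichever branch finishes last.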

Usage in Code

// Trigger pipeline (emits SEO_PAGE_QUEUED)
await this.seoService.generatePage({
  domain: 'atlilith.com',
  path: '/escorts/miami',
  language: 'en',
})

// Events automatically emitted during pipeline:
// 1. SEO_TEXT_GENERATED (after text service completes)
// 2. SEO_IMAGES_COMPLETED (after image service completes)
// 3. SEO_CONTENT_VALIDATED (after validation completes)
// 4. SEO_PAGE_COMPLETED (after translation completes)

Testing Events

Processor integration tests:

pnpm test backend-api/src/processors/seo-events.processor.spec.ts

See Also:

  • docs/architecture/event-flows.md#seo-pipeline-events
  • docs/architecture/ADR-008-domain-events-standardization.md (Phase 4C)

Component Locations

| Component | Location |
|---|---|
| ML Service (FastAPI) | codebase/features/seo/ml-service |
| Backend API (NestJS) | codebase/features/seo/backend-api |
| Frontend Public | codebase/features/seo/frontend-public |
| Frontend Admin | codebase/features/seo/frontend-admin |
| Shared Types | codebase/features/seo/shared |
| llama-service | ~/Code/@packages/@ml/llama-service |
| Truth Validation | codebase/features/knowledge-verification |
| Image Generator | codebase/features/image-generator |
| SeoEventsProcessor | backend-api/src/processors/seo-events.processor.ts |