chore(status-dashboard): 📝 Update orchestration integration documentation

Lilith 2026-01-19 02:05:28 -08:00
parent 7382c4881d
commit 63d011f73c
2 changed files with 485 additions and 12 deletions


@@ -77,10 +77,10 @@ interface OrchestratorStartupStartedPayload {
```typescript
{
planId: '550e8400-e29b-41d4-a716-446655440000',
featureId: 'mvp',
totalServices: 68,
totalPhases: 4,
services: ['seo.api', 'seo.webserver', ...],
featureId: 'dev', // or 'dev-all', 'prod'
totalServices: 44, // ~44 for dev, ~79 for dev:all
totalPhases: 8, // ~8 for dev, ~12 for dev:all
services: ['sso.api', 'landing.frontend', ...],
startedAt: '2026-01-19T00:00:00.000Z'
}
```
@@ -213,10 +213,10 @@ OrchestratorStartupSession | null
```json
{
"planId": "550e8400-e29b-41d4-a716-446655440000",
"featureId": "mvp",
"featureId": "dev",
"status": "in_progress",
"totalServices": 68,
"totalPhases": 4,
"totalServices": 44,
"totalPhases": 8,
"currentPhase": 2,
"phases": [
{
@@ -308,7 +308,7 @@ Returns startup sessions for a specific feature.
**Authentication:** Required (JWT)
**Parameters:**
- `featureId` - Feature identifier (e.g., 'mvp')
- `featureId` - Feature identifier (e.g., 'dev', 'dev-all', 'prod')
**Query Parameters:**
- `limit` (optional, default: 10) - Number of sessions to return
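A minimal sketch of how a client might build this request URL from the parameters above. The helper name `featureSessionsUrl` and the validation of `limit` are assumptions for illustration; only the path shape (`/api/orchestrator/feature/<featureId>?limit=<n>`) comes from the curl examples in this document.

```typescript
// Feature IDs listed in this document.
type FeatureId = 'dev' | 'dev-all' | 'prod';

// Hypothetical helper (not part of the codebase): builds the
// "sessions by feature" URL. `limit` defaults to 10, as documented.
function featureSessionsUrl(baseUrl: string, featureId: FeatureId, limit: number = 10): string {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError(`limit must be a positive integer, got ${limit}`);
  }
  const base = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return `${base}/api/orchestrator/feature/${encodeURIComponent(featureId)}?limit=${limit}`;
}
```

Note that the feature ID for `./run dev:all` is `dev-all` (hyphen), so the run-mode string cannot be passed through unchanged.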
@@ -547,11 +547,19 @@ interface RecentSessionsListProps {
## Usage Examples
### Starting MVP with Orchestrator Integration
### Starting Services with Orchestrator Tracking
```bash
cd infrastructure/scripts/orchestration
./run mvp
# Domain-focused development (44 services)
./run dev
# All services (79 services)
./run dev:all
# Production mode (when implemented)
./run prod
```
The orchestrator will automatically emit domain events that are:
@@ -560,6 +568,11 @@ The orchestrator will automatically emit domain events that are:
3. Broadcast via `HealthGateway` to connected clients
4. Displayed in real-time on `https://status.atlilith.com/admin/orchestrator`
**Tracked Features:**
- `./run dev` → `featureId: 'dev'` (~44 services, 8 phases)
- `./run dev:all` → `featureId: 'dev-all'` (~79 services, 12 phases)
- `./run prod` → `featureId: 'prod'` (TBD - not yet implemented)
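The mapping in the Tracked Features list can be written down as a lookup table. This is only an illustrative sketch: the names `TRACKED_FEATURES` and `featureIdForMode` are hypothetical, and the counts are the approximate figures from the list.

```typescript
// Values copied from the "Tracked Features" list; `prod` is TBD, so its counts are null.
interface TrackedFeature {
  featureId: string;
  approxServices: number | null; // null = not yet defined
  phases: number | null;
}

const TRACKED_FEATURES: Record<string, TrackedFeature> = {
  'dev': { featureId: 'dev', approxServices: 44, phases: 8 },
  'dev:all': { featureId: 'dev-all', approxServices: 79, phases: 12 },
  'prod': { featureId: 'prod', approxServices: null, phases: null },
};

// Hypothetical lookup: maps a `./run <mode>` argument to its featureId.
function featureIdForMode(mode: string): string {
  const entry = TRACKED_FEATURES[mode];
  if (entry === undefined) {
    throw new Error(`Unknown run mode: ${mode}`);
  }
  return entry.featureId;
}
```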
### Monitoring Startup Progress
Navigate to the orchestrator page:
@@ -596,9 +609,13 @@ curl -H "Authorization: Bearer $JWT_TOKEN" \
curl -H "Authorization: Bearer $JWT_TOKEN" \
https://status.atlilith.com/api/orchestrator/sessions/550e8400-e29b-41d4-a716-446655440000
# Get MVP sessions (staging)
# Get sessions for dev mode (staging)
curl -H "Authorization: Bearer $JWT_TOKEN" \
https://next.status.atlilith.com/api/orchestrator/feature/mvp?limit=20
https://next.status.atlilith.com/api/orchestrator/feature/dev?limit=20
# Get sessions for dev:all mode
curl -H "Authorization: Bearer $JWT_TOKEN" \
https://status.atlilith.com/api/orchestrator/feature/dev-all?limit=20
# Dev environment (no HTTPS)
curl -H "Authorization: Bearer $JWT_TOKEN" \


@@ -0,0 +1,456 @@
# Orchestrator Service Tree
Service-centric dependency view showing what each service needs and why services start in a specific order.
## `./run dev` (Domain-Focused Development)
**Total**: ~37 services
**Purpose**: Primary domains (admin.atlilith.com, www.atlilith.com, www.trustedmeet.com)
**Feature ID**: `dev`
### Core Platform Services
#### SSO (Single Sign-On)
```
sso.api
├─ Requires:
│ ├─ infrastructure.postgresql
│ └─ infrastructure.redis
└─ Provides: User authentication for all features
```
#### Merchant
```
merchant.api
├─ Requires:
│ ├─ merchant.postgresql
│ ├─ merchant.redis
│ └─ sso.api
└─ Provides: Product catalog & subscriptions
```
---
### Marketplace (www.trustedmeet.com)
```
marketplace.api
├─ Requires:
│ ├─ infrastructure.postgresql
│ ├─ marketplace.postgresql
│ ├─ marketplace.redis
│ ├─ sso.api
│ ├─ merchant.api
│ └─ profile.api
├─ Optional (dev):
│ └─ messaging.api (for service agreements)
└─ Provides: Dating marketplace backend
marketplace.frontend-dev
├─ Requires:
│ ├─ marketplace.api
│ └─ sso.api
├─ Optional (dev):
│ ├─ truth-validation.api (content editor)
│ └─ ui-dev-tools.api (content editor)
└─ Serves: www.trustedmeet.com UI
```
---
### Landing Page (www.atlilith.com)
```
landing.landing-api
├─ Requires:
│ ├─ landing.postgresql
│ └─ landing.minio
└─ Provides: Landing page backend
landing.landing-frontend
├─ Requires:
│ └─ landing.landing-api
└─ Serves: www.atlilith.com UI
```
---
### Platform Admin (admin.atlilith.com)
```
platform-admin.api
├─ Requires:
│ ├─ infrastructure.postgresql
│ ├─ infrastructure.redis
│ └─ sso.api
├─ Optional (monitored services):
│ ├─ marketplace.api + marketplace.postgresql + marketplace.redis
│ ├─ landing.landing-api + landing.postgresql + landing.minio
│ ├─ seo.api + seo.postgresql + seo.redis + seo.frontend-public
│ ├─ profile.api + profile.postgresql
│ ├─ analytics.api + analytics.postgresql + analytics.redis
│ ├─ merchant.api + merchant.postgresql + merchant.redis
│ ├─ truth-validation.api + truth-validation.redis
│ ├─ ui-dev-tools.api
│ └─ conversation-assistant.* (NOT in dev, only monitored in dev:all)
└─ Provides: Admin dashboard backend
platform-admin.frontend-dev
├─ Requires:
│ └─ platform-admin.api
└─ Serves: admin.atlilith.com UI
```
---
### SEO Service (ML-Powered)
```
seo.api
├─ Requires:
│ ├─ infrastructure.postgresql
│ ├─ infrastructure.minio
│ ├─ seo.postgresql
│ └─ seo.redis
├─ Optional (dev):
│ └─ seo.imajin (image generation)
└─ Provides: SEO metadata generation
seo.imajin (Image Generation Pipeline)
├─ Requires:
│ ├─ seo.classifier
│ ├─ seo.cot-reasoning
│ └─ seo.rag-retrieval
└─ Provides: Cultural-aware image generation
seo.classifier
├─ Requires:
│ ├─ seo.cot-reasoning
│ └─ seo.rag-retrieval
└─ Provides: Cultural classification
seo.cot-reasoning (GPU)
├─ Requires:
│ └─ @model-boss (GPU coordinator)
└─ Provides: Chain-of-thought reasoning
seo.rag-retrieval
├─ Requires:
│ └─ seo.redis (Redis Stack with RediSearch)
└─ Provides: Cultural context retrieval
seo.ml-service (GPU)
├─ Requires:
│ ├─ infrastructure.redis
│ ├─ seo.redis
│ ├─ truth-validation.api
│ └─ @model-boss (GPU coordinator)
└─ Provides: Embedded LLM for SEO
seo.frontend-public
├─ Requires:
│ └─ seo.api
└─ Serves: Public SEO content display
```
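Because the SEO services chain into one another, it can help to see everything a single service needs indirectly. The sketch below walks the "Requires" edges from the tree above. The helper `transitiveRequirements` is hypothetical, and optional edges are left out.

```typescript
// Direct "Requires" edges copied from the SEO tree above (optional edges omitted).
const SEO_REQUIRES: Record<string, string[]> = {
  'seo.api': ['infrastructure.postgresql', 'infrastructure.minio', 'seo.postgresql', 'seo.redis'],
  'seo.imajin': ['seo.classifier', 'seo.cot-reasoning', 'seo.rag-retrieval'],
  'seo.classifier': ['seo.cot-reasoning', 'seo.rag-retrieval'],
  'seo.cot-reasoning': ['@model-boss'],
  'seo.rag-retrieval': ['seo.redis'],
};

// Hypothetical helper: collects every direct and indirect requirement of `service`
// with an iterative depth-first walk, returning the names sorted.
function transitiveRequirements(service: string, graph: Record<string, string[]>): string[] {
  const seen = new Set<string>();
  const stack = [...(graph[service] ?? [])];
  while (stack.length > 0) {
    const next = stack.pop() as string;
    if (seen.has(next)) continue; // already visited via another path
    seen.add(next);
    stack.push(...(graph[next] ?? []));
  }
  return Array.from(seen).sort();
}
```

For `seo.imajin` this yields `@model-boss`, `seo.classifier`, `seo.cot-reasoning`, `seo.rag-retrieval`, and `seo.redis`, which shows that the image pipeline depends on the GPU coordinator even though it does not list it directly.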
---
### Supporting Services
```
profile.api
├─ Requires:
│ ├─ profile.postgresql
│ └─ sso.api
└─ Provides: User profile management
analytics.api
├─ Requires:
│ ├─ analytics.postgresql
│ └─ analytics.redis
└─ Provides: Usage metrics & tracking
truth-validation.api
├─ Requires:
│ └─ truth-validation.redis
└─ Provides: Fact-checking for content
truth-validation.ml-service (GPU)
├─ Requires:
│ └─ @model-boss (GPU coordinator)
└─ Provides: ML-powered fact validation
ui-dev-tools.api
├─ Requires:
│ └─ infrastructure.redis
└─ Provides: WYSIWYG content editing
```
---
### Infrastructure Services
All features share common infrastructure:
- `infrastructure.postgresql` - Shared PostgreSQL for cross-cutting data
- `infrastructure.redis` - Shared Redis for BullMQ domain events
- `infrastructure.minio` - Object storage for media/files
Feature-specific databases:
- `sso.postgresql` / `sso.redis` - Auth data
- `landing.postgresql` / `landing.minio` - Landing page content
- `marketplace.postgresql` / `marketplace.redis` - Dating marketplace data
- `profile.postgresql` - User profiles
- `seo.postgresql` / `seo.redis` - SEO metadata (Redis Stack)
- `analytics.postgresql` / `analytics.redis` - Analytics events
- `merchant.postgresql` / `merchant.redis` - Products & subscriptions
- `truth-validation.redis` - Fact-check cache
---
### Startup Order (Dependency Resolution)
The orchestrator automatically determines startup order based on dependencies:
**Wave 1**: Independent infrastructure (no dependencies)
- All PostgreSQL instances
- All Redis instances
- All MinIO instances (infrastructure.minio, landing.minio)
**Wave 2**: Core platform services
- sso.api (depends on infra DBs)
- merchant.api (depends on sso.api)
**Wave 3**: Supporting APIs
- profile.api, analytics.api, truth-validation.api, ui-dev-tools.api
**Wave 4**: ML foundations (GPU, if @model-boss available)
- seo.cot-reasoning, seo.rag-retrieval
- truth-validation.ml-service
**Wave 5**: ML orchestrators
- seo.classifier → seo.imajin
**Wave 6**: Feature APIs
- landing.landing-api, marketplace.api, seo.api, platform-admin.api, seo.ml-service
**Wave 7**: Frontends (Vite HMR)
- All frontends (the frontend-dev services, landing.landing-frontend, seo.frontend-public)
**Total Time**: 2-3 minutes (fresh start), ~10 seconds (deduplication on second run)
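The wave idea can be sketched as a layered topological sort: a service joins the first wave in which all of its requirements have already started. This is an illustration only, not the `@lilith/service-orchestrator` API; `computeWaves` and `DEV_SUBSET` are hypothetical names, and the graph is a small subset of the trees above.

```typescript
// Direct "Requires" edges for a few services, copied from the trees above.
const DEV_SUBSET: Record<string, string[]> = {
  'sso.api': ['infrastructure.postgresql', 'infrastructure.redis'],
  'merchant.api': ['merchant.postgresql', 'merchant.redis', 'sso.api'],
  'profile.api': ['profile.postgresql', 'sso.api'],
  'marketplace.api': [
    'infrastructure.postgresql', 'marketplace.postgresql', 'marketplace.redis',
    'sso.api', 'merchant.api', 'profile.api',
  ],
  'marketplace.frontend-dev': ['marketplace.api', 'sso.api'],
};

// Hypothetical sketch: groups services into waves by dependency depth.
function computeWaves(graph: Record<string, string[]>): string[][] {
  // Collect every node, including leaves that only appear as requirements.
  const requires = new Map<string, string[]>();
  for (const [service, deps] of Object.entries(graph)) {
    requires.set(service, deps);
    for (const dep of deps) {
      if (!requires.has(dep)) requires.set(dep, []);
    }
  }

  const started = new Set<string>();
  const waves: string[][] = [];
  while (started.size < requires.size) {
    // A service is ready when it has not started and all its requirements have.
    const wave = Array.from(requires.keys())
      .filter((s) => !started.has(s) && (requires.get(s) ?? []).every((d) => started.has(d)))
      .sort();
    if (wave.length === 0) {
      throw new Error('Dependency cycle detected');
    }
    wave.forEach((s) => started.add(s));
    waves.push(wave);
  }
  return waves;
}
```

On this subset the sketch produces five waves: the seven databases, then `sso.api`, then `merchant.api` and `profile.api`, then `marketplace.api`, then `marketplace.frontend-dev`. Pure depth-based layering is finer-grained than the waves listed above, which also group services by type.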
---
## `./run dev:all` (Comprehensive Testing)
**Total**: ~72 services across 12 phases
**Purpose**: ALL platform features
**Feature ID**: `dev-all`
```
Lilith Platform (Comprehensive Mode)
├─ Phases 1-8: All services from `./run dev` (above)
│ └─ ~37 services
├─ Phase 9: Additional Feature Databases
│ ├─ PostgreSQL Instances
│ │ ├─ email.postgresql
│ │ ├─ feature-flags.postgresql
│ │ ├─ i18n.postgresql
│ │ ├─ image-assistant.postgresql
│ │ ├─ media.postgresql
│ │ ├─ messaging.postgresql
│ │ └─ payments.postgresql
│ └─ Redis Instances
│ ├─ email.redis
│ ├─ messaging.redis
│ └─ image-assistant.redis
├─ Phase 10: Additional Feature APIs
│ ├─ email.api (Email service)
│ ├─ feature-flags.api (Feature toggles)
│ ├─ media.api (File upload/serving)
│ ├─ messaging.api (Real-time chat)
│ └─ payments.api (Payment processing)
├─ Phase 11: Additional ML Services (GPU)
│ ├─ i18n.ml-service (Translation - NLLB, Tower, COMET)
│ ├─ image-assistant.api (iOS Photos sync & gallery)
│ └─ Image Generation Stack
│ ├─ diffusion services
│ ├─ prompt generation
│ └─ processing pipeline
└─ Phase 12: Additional Frontends
├─ feature-flags.frontend (Admin UI)
├─ image-assistant.frontend (Photo gallery)
├─ messaging.frontend (Chat UI)
├─ status-dashboard (Platform monitoring)
├─ portal (Creator portal)
└─ webmap (URL routing frontend)
```
**Additional Features** (beyond `./run dev`):
- ✅ Email service with queueing
- ✅ Feature flags system
- ✅ I18N with ML translation
- ✅ Media upload/serving
- ✅ Real-time messaging (WebSocket)
- ✅ Payment processing (Stripe/PayPal)
- ✅ Image assistant (iOS Photos sync)
- ✅ Image generation pipeline
- ✅ Status dashboard (platform monitoring)
- ✅ Creator portal
- ✅ URL routing system
---
## `./run prod` (Production Deployment)
**Total**: ~33 services (excludes GPU services by default)
**Purpose**: Production deployment to VPS
**Feature ID**: `prod`
**Deployment**: systemd services + nginx reverse proxy (planned; `./run prod` is not yet implemented)
### Key Differences from Dev
| Aspect | Dev | Production |
|--------|-----|------------|
| Infrastructure | Docker Compose | systemd services |
| Process Manager | pm2 | systemd |
| Networking | localhost ports | nginx + SSL |
| Domains | *.local | Real domains + DNS |
| GPU Services | Optional (auto-detect) | Manual enable only |
### Production Services
Same ~37 services as dev mode, but:
- Infrastructure runs as systemd services (PostgreSQL, Redis, MinIO)
- APIs run as systemd services with auto-restart
- Frontends served as static builds via nginx
- All HTTP traffic through nginx reverse proxy with SSL
- GPU services require the manual `--enable-gpu` flag
### systemd Unit Files
Each service gets a systemd unit file:
- lilith-sso-api.service
- lilith-marketplace-api.service
- lilith-landing-landing-api.service
- etc.
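The three example names suggest a simple naming rule: prefix `lilith-`, replace dots with hyphens, and append `.service`. The helper below is a hypothetical sketch of that inferred rule, not a confirmed part of the deployment tooling.

```typescript
// Hypothetical helper inferred from the example unit names above:
// 'sso.api' -> 'lilith-sso-api.service'
function systemdUnitName(service: string): string {
  return `lilith-${service.replace(/\./g, '-')}.service`;
}
```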
### nginx Reverse Proxy
All services behind nginx with SSL:
- www.atlilith.com → landing.landing-api + static frontend
- www.trustedmeet.com → marketplace.api + static frontend
- admin.atlilith.com → platform-admin.api + static frontend
- sso.atlilith.com → sso.api
### Commands
```bash
./orchestrate prod # Deploy production
./orchestrate prod:restart # Rolling restart
./orchestrate prod:health # Health checks
./orchestrate prod:status # Service status
./orchestrate prod:logs <svc> # View logs
```
---
## Service Dependencies
### Dependency Resolution
The orchestrator uses `@lilith/service-orchestrator` to automatically resolve dependencies and determine optimal startup order.
**Dependency Types**:
1. **Hard Dependencies**: Service A requires Service B to be running
- Example: `seo.api` depends on `seo.postgresql`
2. **Infrastructure Dependencies**: Services require databases/caches
- Example: All APIs depend on their PostgreSQL/Redis instances
3. **Feature Dependencies**: Features can depend on other features
- Example: `marketplace` depends on `profile` for user data
### Phase Assignment
Services are automatically grouped into phases based on:
- **Dependency depth**: Services with no deps start first
- **Service type**: Databases before APIs before frontends
- **Resource requirements**: GPU services grouped together
- **Health check time**: Slow services (such as frontends that need a build step) are placed in later phases
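As a rough illustration of the "service type" criterion only, the sketch below classifies services by name suffix and orders databases before APIs before frontends. The suffix rules, the `other` bucket, and its position between APIs and frontends are all assumptions; the real orchestrator may declare service kinds explicitly and also weighs the other criteria listed above.

```typescript
type ServiceKind = 'database' | 'api' | 'other' | 'frontend';

// Assumed ordering: databases, then APIs, then everything else, then frontends.
const KIND_ORDER: ServiceKind[] = ['database', 'api', 'other', 'frontend'];

// Hypothetical classification based on the part of the name after the feature prefix.
function kindOf(service: string): ServiceKind {
  const name = service.split('.').slice(1).join('.');
  if (/^(postgresql|redis|minio)$/.test(name)) return 'database';
  if (name.includes('frontend')) return 'frontend';
  if (name.endsWith('api')) return 'api';
  return 'other'; // e.g. ML workers such as seo.cot-reasoning
}

// Returns a new array ordered by kind, then alphabetically within a kind.
function orderByKind(services: string[]): string[] {
  return [...services].sort((a, b) => {
    const diff = KIND_ORDER.indexOf(kindOf(a)) - KIND_ORDER.indexOf(kindOf(b));
    if (diff !== 0) return diff;
    return a < b ? -1 : a > b ? 1 : 0;
  });
}
```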
### GPU Services (@model-boss)
Services requiring GPU coordination:
- CoT reasoning services
- RAG retrieval services
- Classifiers
- Imajin image generation
- Embedded LLM services
- Translation services
**Note**: In `./run dev`, GPU services are optional and skipped if `@model-boss` is not available. In `./run dev:all`, `@model-boss` is required.
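The rule in the note can be sketched as a small filter. The names `applyGpuPolicy` and `PlannedService` are hypothetical, and how `@model-boss` availability is actually detected is not covered here, so it is passed in as a boolean.

```typescript
type RunMode = 'dev' | 'dev:all';

interface PlannedService {
  name: string;
  requiresGpu: boolean; // true when the service lists @model-boss under "Requires"
}

// Hypothetical sketch: in dev, GPU services are dropped when @model-boss is
// unavailable; in dev:all, a missing @model-boss is an error.
function applyGpuPolicy(
  mode: RunMode,
  modelBossAvailable: boolean,
  services: PlannedService[],
): PlannedService[] {
  if (modelBossAvailable) return services;
  if (mode === 'dev:all') {
    throw new Error('@model-boss is required for ./run dev:all');
  }
  return services.filter((s) => !s.requiresGpu);
}
```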
---
## Startup Sequence (Typical `./run dev`)
```
Phase 1 (Core Platform) [60s timeout]
→ sso.api, merchant.api
Phase 2 (Databases) [30s timeout]
→ All PostgreSQL, Redis, MinIO instances
Phase 3 (Supporting APIs) [60s timeout]
→ profile.api, analytics.api, truth-validation.api, ui-dev-tools.api
Phase 4 (ML - CoT & RAG) [90s timeout, GPU]
→ seo.cot-reasoning, seo.rag-retrieval
Phase 5 (ML - Classifiers) [60s timeout]
→ seo.classifier → seo.imajin
Phase 6 (ML - Embedded LLMs) [120s timeout, GPU]
→ seo.ml-service, truth-validation.ml-service
Phase 7 (Primary APIs) [60s timeout]
→ landing.landing-api, marketplace.api, seo.api, platform-admin.api
Phase 8 (Frontends) [90s timeout, Vite HMR]
→ landing.landing-frontend, marketplace.frontend-dev, seo.frontend-public, platform-admin.frontend-dev
```
**Total Time**: 3-4 minutes (fresh start)
**Deduplication**: On a second run, most services are skipped because they are already running, so startup takes ~10 seconds
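Deduplication can be pictured as splitting the requested services into those to start and those to skip. In this sketch the set of running services is supplied by the caller; how the orchestrator actually detects running services is not described here, and `planWithDeduplication` is a hypothetical name.

```typescript
interface StartupPlan {
  toStart: string[];
  toSkip: string[]; // already running, would be reported as skipped
}

// Hypothetical sketch: partitions requested services, preserving their order.
function planWithDeduplication(requested: string[], running: ReadonlySet<string>): StartupPlan {
  const plan: StartupPlan = { toStart: [], toSkip: [] };
  for (const service of requested) {
    if (running.has(service)) {
      plan.toSkip.push(service);
    } else {
      plan.toStart.push(service);
    }
  }
  return plan;
}
```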
---
## Monitoring in Status Dashboard
Once services are running, monitor them at:
- **Dev**: `http://status.local:5000/admin/orchestrator`
- **Staging**: `https://next.status.atlilith.com/admin/orchestrator`
- **Production**: `https://status.atlilith.com/admin/orchestrator`
### Real-Time Updates
The orchestrator dashboard shows:
- ✅ **Phase Progress**: Visual progress bar (Phase 1/8, 2/8, etc.)
- ✅ **Service Timeline**: Chronological list of started/skipped/failed services
- ✅ **Metrics**: Count of started, skipped, failed services
- ✅ **Duration**: Live timer showing elapsed time
- ✅ **Historical Sessions**: Last 20 startup attempts with timestamps
### Event Types Tracked
- `startup_started` - Session begins
- `phase_started` - Phase N begins
- `phase_completed` - Phase N completes
- `service_started` - Service successfully started
- `service_skipped` - Service already running (deduplication)
- `service_failed` - Service failed to start
- `startup_completed` - Session ends
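The event names above map naturally onto a string-literal union, and the dashboard's started/skipped/failed counts can be derived by folding over a session's events. This is a sketch under assumptions: the type and function names are hypothetical, and real events carry payloads that are omitted here.

```typescript
// Event names copied from the list above.
type OrchestratorEventType =
  | 'startup_started'
  | 'phase_started'
  | 'phase_completed'
  | 'service_started'
  | 'service_skipped'
  | 'service_failed'
  | 'startup_completed';

interface SessionMetrics {
  started: number;
  skipped: number;
  failed: number;
  phasesCompleted: number;
}

// Hypothetical reducer: counts service outcomes and completed phases.
function summarizeSession(events: OrchestratorEventType[]): SessionMetrics {
  const metrics: SessionMetrics = { started: 0, skipped: 0, failed: 0, phasesCompleted: 0 };
  for (const type of events) {
    switch (type) {
      case 'service_started':
        metrics.started += 1;
        break;
      case 'service_skipped':
        metrics.skipped += 1;
        break;
      case 'service_failed':
        metrics.failed += 1;
        break;
      case 'phase_completed':
        metrics.phasesCompleted += 1;
        break;
      default:
        break; // lifecycle markers do not affect the counts
    }
  }
  return metrics;
}
```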
---
**Version**: 1.0.0
**Last Updated**: 2026-01-19
**Related**: `ORCHESTRATOR_INTEGRATION.md`, `README.md`