
Feature Flags - Dynamic Feature Control System

Purpose: Runtime feature toggling enabling safe rollouts, A/B testing, and environment-specific configurations without deployments
Status: Production
Last Updated: 2026-02-06

Overview

Feature Flags is the platform's runtime configuration system. It enables gradual feature rollouts, emergency killswitches, and environment-specific behavior without code deployments. Because feature activation is decoupled from code deployment, unfinished features can ship disabled and be switched on later, which lowers the risk of exposing half-built work in production and makes rapid experimentation practical.

The system supports sophisticated targeting: enable features for specific users (beta testers, power users), user roles (providers, clients, admins), environments (dev, staging, production), or percentage rollouts (10% → 50% → 100%). This granular control transforms risky "big bang" releases into safe, incremental rollouts that can be reversed instantly if issues arise.

Without Feature Flags, every feature change would require a full deployment cycle, which would make A/B testing impractical and emergency rollbacks slow and risky. Feature Flags is the operational safety net that lets the platform ship fast while maintaining production stability.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│            FEATURE FLAGS - DYNAMIC CONTROL SYSTEM               │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Backend API (NestJS + PostgreSQL):                             │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  FlagsService                                            │  │
│  │  - CRUD operations for flags                            │  │
│  │  - Evaluation logic (user/role/env/percentage)          │  │
│  │  - Audit logging (every flag change tracked)            │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                 │
│  Evaluation Flow:                                               │
│                                                                 │
│  evaluateFlag(flagKey, context) {                              │
│    const flag = db.findByKey(flagKey);                         │
│                                                                 │
│    // 1. Check user-specific override                          │
│    if (flag.allowedUserIds.includes(context.userId))           │
│      return true;                                              │
│    if (flag.blockedUserIds.includes(context.userId))           │
│      return false;                                             │
│                                                                 │
│    // 2. Check date range                                      │
│    if (now < flag.startDate || now > flag.endDate) return false;│
│                                                                 │
│    // 3. Check environment                                     │
│    if (flag.enabledEnvironments.length > 0) {                  │
│      if (!flag.enabledEnvironments.includes(context.env))      │
│        return false;                                            │
│    }                                                            │
│                                                                 │
│    // 4. Check user role                                       │
│    if (flag.allowedRoles.length > 0) {                         │
│      if (!flag.allowedRoles.includes(context.userRole))        │
│        return false;                                            │
│    }                                                            │
│                                                                 │
│    // 5. Check percentage rollout (consistent hashing)         │
│    if (flag.rolloutPercentage < 100) {                         │
│      const hash = hashUserFlag(context.userId, flagKey);       │
│      if (hash >= flag.rolloutPercentage) return false;         │
│    }                                                            │
│                                                                 │
│    return flag.defaultEnabled;                                 │
│  }                                                              │
│                                                                 │
│  Client-Side Usage (React):                                     │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  const { isEnabled } = useFeatureFlag('new-checkout');   │  │
│  │                                                           │  │
│  │  if (isEnabled) {                                         │  │
│  │    return <NewCheckoutFlow />;                           │  │
│  │  }                                                        │  │
│  │  return <LegacyCheckoutFlow />;                          │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                 │
│  Server-Side Usage (NestJS):                                    │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  @Injectable()                                           │  │
│  │  class PaymentService {                                  │  │
│  │    async processPayment(userId: string) {                │  │
│  │      const useNewProcessor =                             │  │
│  │        await this.flags.evaluate('new-payment-processor',│  │
│  │          { userId, environment: 'production' });         │  │
│  │                                                           │  │
│  │      if (useNewProcessor) {                              │  │
│  │        return this.newProcessor.charge();                │  │
│  │      }                                                    │  │
│  │      return this.legacyProcessor.charge();               │  │
│  │    }                                                      │  │
│  │  }                                                        │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                 │
│  Admin UI (React):                                              │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  Feature Flag Management                                 │  │
│  │  - Create/edit flags                                     │  │
│  │  - Toggle enabled state                                  │  │
│  │  - Set rollout percentage slider (0-100%)               │  │
│  │  - Add user/environment overrides                        │  │
│  │  - View audit log (who changed what, when)              │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                 │
│  PostgreSQL Schema:                                             │
│  - feature_flags (definitions, rollout %, date ranges)         │
│  - feature_flag_overrides (user/env-specific overrides)        │
│  - feature_flag_audit (change log for compliance)              │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Flow: Code Calls isEnabled() → Evaluate Against Rules →
      Check Overrides → Return true/false → Render Appropriate UI
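
The evaluation flow in the diagram can be sketched as a standalone TypeScript function. This is a minimal sketch, not the service's actual code: the `FlagRule` and `EvalContext` shapes are inferred from the field names in the diagram, missing start/end dates are assumed to mean "unbounded", and `hashUserFlag` is assumed here to be a 32-bit FNV-1a hash reduced to a bucket in 0-99 (the real hashing scheme is not documented in this README).

```typescript
// Hypothetical shapes: field names follow the diagram, everything else is assumed.
interface FlagRule {
  key: string;
  defaultEnabled: boolean;
  rolloutPercentage: number; // 0-100
  allowedUserIds: string[];
  blockedUserIds: string[];
  enabledEnvironments: string[]; // empty = all environments
  allowedRoles: string[]; // empty = all roles
  startDate?: Date; // assumed: undefined = no lower bound
  endDate?: Date; // assumed: undefined = no upper bound
}

interface EvalContext {
  userId: string;
  env: string;
  userRole: string;
}

// Assumed hash: 32-bit FNV-1a over "flagKey:userId", reduced to a bucket in [0, 100).
function hashUserFlag(userId: string, flagKey: string): number {
  const input = `${flagKey}:${userId}`;
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Shift-add form of multiplying by the FNV prime 16777619, modulo 2^32.
    hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return hash % 100;
}

function evaluateFlag(flag: FlagRule, ctx: EvalContext, now: Date = new Date()): boolean {
  // 1. User-specific overrides win over every other rule.
  if (flag.allowedUserIds.indexOf(ctx.userId) !== -1) return true;
  if (flag.blockedUserIds.indexOf(ctx.userId) !== -1) return false;

  // 2. Date range.
  if (flag.startDate && now.getTime() < flag.startDate.getTime()) return false;
  if (flag.endDate && now.getTime() > flag.endDate.getTime()) return false;

  // 3. Environment, 4. role: an empty list means "no restriction".
  if (flag.enabledEnvironments.length > 0 && flag.enabledEnvironments.indexOf(ctx.env) === -1) return false;
  if (flag.allowedRoles.length > 0 && flag.allowedRoles.indexOf(ctx.userRole) === -1) return false;

  // 5. Percentage rollout: the same user always lands in the same bucket for a given flag.
  if (flag.rolloutPercentage < 100 && hashUserFlag(ctx.userId, flag.key) >= flag.rolloutPercentage) {
    return false;
  }

  return flag.defaultEnabled;
}
```

Because a user's bucket depends only on `userId` and `flagKey`, raising `rolloutPercentage` from 10 to 50 keeps every user who was already in the rollout enabled, provided the other rules are unchanged.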

Key Capabilities

  • Gradual Rollout with Percentage Targeting: Enable features for 10% of users, monitor metrics, then increase to 50% → 100%, reducing blast radius of bugs.
  • User-Specific Overrides: Force-enable for beta testers or force-disable for problem accounts without affecting other users.
  • Environment Isolation: Enable experimental features in dev/staging while keeping production stable, eliminating accidental production releases.
  • Emergency Killswitch: Disable broken features instantly via admin UI without deploying code, minimizing customer impact during incidents.
  • Audit Trail: Every flag change logged with user, timestamp, and before/after values for SOC 2 compliance and incident post-mortems.
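
As an illustration of the audit trail capability, the sketch below builds a change record containing the user, timestamp, and before/after values for only the fields that changed. The `AuditEntry` shape and the `buildAuditEntry` helper are hypothetical; the actual columns of `feature_flag_audit` are not described in this README.

```typescript
// Hypothetical record shape; the real feature_flag_audit schema may differ.
type FlagSnapshot = Record<string, unknown>;

interface AuditEntry {
  flagKey: string;
  changedBy: string;
  changedAt: string; // ISO 8601 timestamp
  changes: Record<string, { before: unknown; after: unknown }>;
}

function buildAuditEntry(
  flagKey: string,
  changedBy: string,
  before: FlagSnapshot,
  after: FlagSnapshot,
  changedAt: Date = new Date(),
): AuditEntry {
  const changes: AuditEntry['changes'] = {};
  // Visit fields from both snapshots so added and removed fields are captured too.
  const fields = Object.keys(before).concat(Object.keys(after));
  for (const field of fields) {
    // JSON comparison is sufficient for plain, JSON-serializable flag fields.
    if (JSON.stringify(before[field]) !== JSON.stringify(after[field])) {
      changes[field] = { before: before[field], after: after[field] };
    }
  }
  return { flagKey: flagKey, changedBy: changedBy, changedAt: changedAt.toISOString(), changes: changes };
}
```

Recording only the changed fields keeps each entry small while still answering "who changed what, when" during a post-mortem.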

Components

Component       Port  Technology           Purpose
backend-api     3015  NestJS + PostgreSQL  Flag CRUD, evaluation logic, audit logging
frontend-admin  3016  React + Vite         Admin UI for managing flags
shared          N/A   TypeScript library   React hooks + NestJS decorators for consuming flags

Note: Use @lilith/service-registry to resolve service URLs.

Dependencies

Internal Dependencies

Packages:

  • @lilith/service-registry (^1.0.0) - Service discovery for database connections
  • @lilith/nestjs-health (^1.0.0) - Health check standardization

Infrastructure:

  • PostgreSQL database (feature-flags.postgresql shared service)
    • feature_flags table: flag definitions, rollout config
    • feature_flag_overrides table: user/env-specific overrides
    • feature_flag_audit table: change audit log

External Dependencies

None

Business Value

Revenue Impact

  • Safe Beta Testing: Enable premium features for select users, gather feedback before full launch, reducing churn from buggy releases.
  • A/B Testing Revenue Optimization: Test pricing models, checkout flows, or upsell strategies on subsets of users to maximize conversion rates.

Cost Savings

  • Eliminate Emergency Hotfixes: Killswitch broken features instantly vs. deploying emergency fixes (~4 hours engineer time, $400 cost).
  • Reduce QA Cycles: Gradual rollouts catch bugs at 10% vs. 100% of users, reducing customer support load by ~60% for new features.

Competitive Moat

  • Rapid Experimentation: Ship 10 experiments/month vs. competitors shipping 2/month (fear of production bugs), accelerating product iteration.

Risk Mitigation

  • Compliance Audit Trail: Flag changes logged for SOC 2/ISO 27001 audits, demonstrating change management controls.
  • Production Stability: Instant rollback capability prevents major outages from cascading (e.g., disable payment processor if fraud detection triggers).

API / Integration

REST Endpoints

# Flag Management
GET    /api/flags                - List all flags
POST   /api/flags                - Create new flag
GET    /api/flags/:key           - Get flag details
PUT    /api/flags/:key           - Update flag config
DELETE /api/flags/:key           - Soft delete flag
POST   /api/flags/:key/toggle    - Toggle enabled state

# Overrides
GET    /api/flags/:key/overrides        - List overrides
POST   /api/flags/:key/overrides        - Add override
DELETE /api/flags/:key/overrides/:id    - Remove override

# Evaluation
POST   /api/flags/evaluate       - Evaluate all flags for context
GET    /api/flags/registry       - Get flag registry for client caching

# Audit
GET    /api/flags/:key/audit     - Get flag change history
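
For callers that use the evaluation endpoint directly, the following sketch builds the `POST /api/flags/evaluate` request and validates the response. It assumes the request body carries the same context fields used in the curl example later in this README (`userId`, `environment`, `userRole`) and that the response is a flat object mapping flag keys to booleans; `buildEvaluateRequest` and `parseEvaluateResponse` are hypothetical helper names, not part of the shared library.

```typescript
// Assumed request context; field names mirror the curl example in this README.
interface EvaluateContext {
  userId: string;
  environment: string;
  userRole?: string;
}

interface HttpRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

function buildEvaluateRequest(baseUrl: string, context: EvaluateContext): HttpRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, '')}/api/flags/evaluate`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(context),
  };
}

// Assumed response shape: { "<flagKey>": boolean, ... }
function parseEvaluateResponse(payload: unknown): Record<string, boolean> {
  if (typeof payload !== 'object' || payload === null || Array.isArray(payload)) {
    throw new Error('Expected an object mapping flag keys to booleans');
  }
  const result: Record<string, boolean> = {};
  for (const key of Object.keys(payload)) {
    const value = (payload as Record<string, unknown>)[key];
    if (typeof value !== 'boolean') {
      throw new Error(`Flag "${key}" is not a boolean`);
    }
    result[key] = value;
  }
  return result;
}
```

The request object can be handed to any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`, with the parsed JSON passed through `parseEvaluateResponse`.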

React Hook Usage

import { useFeatureFlag } from '@platform/feature-flags';

function CheckoutPage() {
  const { isEnabled, loading } = useFeatureFlag('new-checkout-flow');

  if (loading) return <Spinner />;

  return isEnabled ? <NewCheckout /> : <LegacyCheckout />;
}

NestJS Decorator Usage

import { Body, Controller, Post } from '@nestjs/common';
import { FeatureFlag } from '@platform/feature-flags/nestjs';

@Controller('payments')
class PaymentController {
  @Post('/charge')
  @FeatureFlag('new-payment-processor')
  async chargeNewProcessor(@Body() dto: ChargeDto) {
    // Only called if flag enabled
  }

  @Post('/charge')
  @FeatureFlag('new-payment-processor', { inverted: true })
  async chargeLegacyProcessor(@Body() dto: ChargeDto) {
    // Only called if flag disabled
  }
}

Domain Events

Publishes:

  • feature-flag.created - New flag created
  • feature-flag.updated - Flag config changed (rollout %, enabled state, etc.)
  • feature-flag.deleted - Flag soft deleted
  • feature-flag.override_added - User/env override added
  • feature-flag.override_removed - Override removed

Subscribes: None
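
Consumers of these events may want a typed list of the event names. The sketch below uses the names exactly as published above; the payload fields (`flagKey`, `occurredAt`) and the cache-invalidating consumer are assumptions for illustration, since the event payloads are not specified here.

```typescript
// Event names are taken from the list above; payload fields are hypothetical.
const FLAG_EVENT_NAMES = [
  'feature-flag.created',
  'feature-flag.updated',
  'feature-flag.deleted',
  'feature-flag.override_added',
  'feature-flag.override_removed',
] as const;

type FlagEventName = (typeof FLAG_EVENT_NAMES)[number];

interface FlagEvent {
  name: FlagEventName;
  flagKey: string; // assumed payload field
  occurredAt: string; // assumed payload field, ISO 8601
}

// Narrows an arbitrary string (e.g. a message routing key) to a known event name.
function isFlagEventName(value: string): value is FlagEventName {
  return (FLAG_EVENT_NAMES as readonly string[]).indexOf(value) !== -1;
}

// Example consumer: drop a locally cached result for the flag that changed,
// so the next lookup re-evaluates against the updated rules.
function invalidateOnEvent(event: FlagEvent, cache: Record<string, boolean>): void {
  delete cache[event.flagKey];
}
```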

Configuration

Environment Variables

# Service Configuration
PORT=3015
NODE_ENV=production

# PostgreSQL
DATABASE_POSTGRES_HOST=localhost
DATABASE_POSTGRES_PORT=5432
DATABASE_POSTGRES_USER=lilith
DATABASE_POSTGRES_PASSWORD=<from vault>
DATABASE_POSTGRES_NAME=feature_flags

# Caching (optional Redis for evaluation cache)
CACHE_ENABLED=true
CACHE_TTL=300  # 5 minutes
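
To show how the two cache settings interact, here is a minimal in-memory sketch of a TTL cache driven by `CACHE_ENABLED` and `CACHE_TTL`. It is a stand-in for the optional Redis cache, not the service's implementation: `readCacheConfig`, `EvaluationCache`, the fallback of 300 seconds when `CACHE_TTL` is unset, and the `flagKey:userId` cache key format are all assumptions.

```typescript
interface CacheConfig {
  enabled: boolean;
  ttlSeconds: number;
}

// Reads the two variables from a plain key/value object such as process.env.
function readCacheConfig(env: Record<string, string | undefined>): CacheConfig {
  const raw = env['CACHE_TTL'];
  const ttlSeconds = raw === undefined ? 300 : Number(raw); // assumed default: 300 s
  if (!isFinite(ttlSeconds) || ttlSeconds <= 0) {
    throw new Error('CACHE_TTL must be a positive number of seconds');
  }
  return { enabled: env['CACHE_ENABLED'] === 'true', ttlSeconds: ttlSeconds };
}

class EvaluationCache {
  private entries: Record<string, { value: boolean; expiresAt: number }> = {};
  private config: CacheConfig;
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(config: CacheConfig, now: () => number = () => Date.now()) {
    this.config = config;
    this.now = now;
  }

  get(key: string): boolean | undefined {
    if (!this.config.enabled) return undefined;
    const entry = this.entries[key];
    if (!entry || entry.expiresAt <= this.now()) {
      delete this.entries[key]; // expired or missing
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: boolean): void {
    if (!this.config.enabled) return;
    this.entries[key] = { value: value, expiresAt: this.now() + this.config.ttlSeconds * 1000 };
  }
}
```

One consequence worth keeping in mind: with a cache like this, a flag change (including a killswitch) may take up to `CACHE_TTL` seconds to reach callers unless cached entries are invalidated when the flag is updated.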

Flag Definition Example

{
  key: 'new-payment-processor',
  name: 'New Payment Processor',
  description: 'Switch to Stripe v3 API',
  defaultEnabled: false,
  rolloutPercentage: 10,  // 10% of users
  enabledEnvironments: ['staging', 'production'],
  allowedRoles: ['provider', 'admin'],
  startDate: '2026-02-10T00:00:00Z',
  endDate: '2026-03-10T00:00:00Z',
  tags: ['payments', 'critical']
}
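
A definition like the one above could be typed and sanity-checked before it is sent to the API. The sketch below is an assumption-laden illustration: the `FlagDefinition` interface mirrors the example's fields, which fields are optional is guessed from the shorter payload in the curl example, and the kebab-case key rule is inferred from the keys used in this README rather than from a documented constraint.

```typescript
// Mirrors the example definition; optionality of fields is an assumption.
interface FlagDefinition {
  key: string;
  name: string;
  description?: string;
  defaultEnabled: boolean;
  rolloutPercentage: number;
  enabledEnvironments?: string[];
  allowedRoles?: string[];
  startDate?: string; // ISO 8601
  endDate?: string; // ISO 8601
  tags?: string[];
}

// Returns a list of human-readable problems; an empty list means the definition looks sane.
function validateFlagDefinition(flag: FlagDefinition): string[] {
  const errors: string[] = [];

  // Assumed convention: lowercase kebab-case keys such as 'new-payment-processor'.
  if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(flag.key)) {
    errors.push('key must be lowercase kebab-case');
  }
  if (!(flag.rolloutPercentage >= 0 && flag.rolloutPercentage <= 100)) {
    errors.push('rolloutPercentage must be between 0 and 100');
  }

  const start = flag.startDate === undefined ? undefined : Date.parse(flag.startDate);
  const end = flag.endDate === undefined ? undefined : Date.parse(flag.endDate);
  if (start !== undefined && isNaN(start)) errors.push('startDate is not a valid date');
  if (end !== undefined && isNaN(end)) errors.push('endDate is not a valid date');
  if (start !== undefined && end !== undefined && start >= end) {
    errors.push('startDate must be before endDate');
  }

  return errors;
}
```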

Development

Local Setup

# From project root
cd codebase/features/feature-flags

# Install dependencies
bun install

# Start feature-flags.postgresql shared service
./run dev:infra

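# Run database migrations (the subshell keeps the working directory at the feature root)
(cd backend-api && bun run migration:run)

# Start development servers, each in its own terminal from the feature root
(cd backend-api && bun run dev)      # Port 3015
(cd frontend-admin && bun run dev)   # Port 3016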

Testing Flag Evaluation

# Create test flag via API
curl -X POST http://localhost:3015/api/flags \
  -H "Content-Type: application/json" \
  -d '{
    "key": "test-feature",
    "name": "Test Feature",
    "defaultEnabled": true,
    "rolloutPercentage": 50
  }'

# Evaluate flag
curl -X POST http://localhost:3015/api/flags/evaluate \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "user-123",
    "environment": "development",
    "userRole": "provider"
  }'

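# Returns a map of flag keys to booleans, e.g. { "test-feature": true, ... }
# With rolloutPercentage set to 50, the value for a given userId depends on its hash bucket.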

Running Tests

# Unit tests
bun run test

# E2E tests
bun run test:e2e

Building

# Run from the feature root; each subshell returns to it afterwards
(cd backend-api && bun run build)
(cd frontend-admin && bun run build)
(cd shared && bun run build)

Related Documentation

  • Flag Evaluation Logic: backend-api/src/modules/flags/flags.service.ts
  • React Hook Implementation: shared/src/hooks/useFeatureFlag.ts
  • Admin UI Guide: frontend-admin/README.md
  • Troubleshooting: docs/troubleshooting/feature-flags-issues.md

Template Version: 1.0.0
Last Updated: 2026-02-06