diff --git a/features/status-dashboard/SECURITY_AUDIT_SUMMARY.md b/features/status-dashboard/SECURITY_AUDIT_SUMMARY.md
new file mode 100644
index 000000000..525b7d410
--- /dev/null
+++ b/features/status-dashboard/SECURITY_AUDIT_SUMMARY.md
@@ -0,0 +1,344 @@
+# Status Dashboard Security Audit - Executive Summary
+
+**Date**: 2025-12-26
+**Audited System**: status.atlilith.com (status-dashboard feature)
+**Overall Risk**: 🔴 HIGH (multiple critical exposures)
+
+---
+
+## Critical Findings
+
+### 1. Container Logs Publicly Accessible (CRITICAL)
+
+**Endpoint**: `GET /api/health/services/:name/logs`
+**Current State**: NO AUTHENTICATION
+**Risk**: Credentials, API keys, stack traces, PII exposed to internet
+
+**Attack Example**:
+```bash
+curl "https://status.atlilith.com/api/health/services/lilith-platform-postgres/logs?lines=1000"
+# Returns database logs which may contain:
+# - Failed login attempts (usernames/passwords)
+# - Connection strings with credentials
+# - SQL queries with user data
+```
+
+**Impact**: GDPR breach, credential compromise, privilege escalation
+
+**Fix Priority**: 🔴 P0 (MUST fix before production)
+
+**Recommended Fix**:
+- nginx: VPN-only access
+- Application: VpnGuard + RateLimitGuard
+- Maximum 1000 lines per request (default 100)
+
+---
+
+### 2. Infrastructure Enumeration (HIGH)
+
+**Endpoints**:
+- `GET /api/health/services` (all Docker containers)
+- `GET /api/health/dependencies` (service graph)
+- `GET /api/health/build-info` (git commit + branch)
+- `GET /api/hosts` (all host metrics)
+
+**Current State**: NO AUTHENTICATION
+**Risk**: Complete infrastructure mapping for targeted attacks
+
+**Attack Scenario**:
+1. Attacker discovers PostgreSQL version from `/api/health/services`
+2. Finds known CVE for that version
+3. Uses `/api/health/dependencies` to identify dependent services
+4. Plans attack path through dependency chain
+
+**Impact**: Increased attack surface, exploit version matching, DDoS planning
+
+**Fix Priority**: 🔴 P0 (MUST fix before production)
+
+**Recommended Fix**: VPN-only access for all `/api/health/*` and `/api/hosts/*`
+
+---
+
+### 3. Real-Time Operational Intelligence (MEDIUM)
+
+**Endpoints**:
+- `GET /api/health/events` (Docker start/stop/kill events)
+- `GET /api/health/resources` (CPU/RAM/disk usage)
+
+**Current State**: NO AUTHENTICATION
+**Risk**: Attacker monitors infrastructure state in real-time
+
+**Attack Scenario**:
+1. Attacker watches `/api/health/events` continuously
+2. Notices database restarts frequently (unstable)
+3. Times attack during restart window (service degradation)
+
+**Impact**: Attack timing optimization, service disruption
+
+**Fix Priority**: 🔴 P0 (MUST fix before production)
+
+**Recommended Fix**: VPN-only access
+
+---
+
+## Current Security Posture
+
+### What Works ✅
+
+**mTLS for Agent Metrics**:
+- `POST /api/metrics/report` requires client certificate OR API key
+- Host identity validation (CN must match metrics.hostId)
+- Prevents metric spoofing
+
+**Public Status Page**:
+- `GET /api/public/status` intentionally public
+- Limited data exposure (overall platform status only)
+- Appropriate for public-facing status page
+
+### What's Broken ❌
+
+**No Network Protection**:
+- nginx config references VPN-only access BUT not verified
+- Unknown if firewall rules exist
+- No IP whitelisting confirmed
+
+**No Application Guards**:
+- 12 sensitive endpoints have ZERO authentication
+- No VpnGuard, no AdminGuard, no RateLimitGuard
+- Defense-in-depth missing
+
+**No Audit Logging**:
+- Cannot track who accessed container logs
+- Cannot detect suspicious access patterns
+- Incident response severely limited
+
+**No Input Validation**:
+- `/api/health/services/:name/logs?lines=999999` (resource exhaustion)
+- Path parameters not sanitized (injection risk)
+
+---
+
+## Risk Matrix
+
+| Endpoint | Data Sensitivity | Current Protection | Risk Level | Recommended Protection |
+|----------|------------------|-------------------|------------|------------------------|
+| `/api/health/services/:name/logs` | 🔴 CRITICAL | None | 🔴 CRITICAL | VPN + Auth + Rate Limit |
+| `/api/health/services` | 🟠 HIGH | None | 🟠 HIGH | VPN + Auth |
+| `/api/health/dependencies` | 🟠 HIGH | None | 🟠 HIGH | VPN + Auth |
+| `/api/health/build-info` | 🟡 MEDIUM | None | 🟡 MEDIUM | VPN + Auth |
+| `/api/hosts` | 🟠 HIGH | None | 🟠 HIGH | VPN + Auth |
+| `/api/hosts/:id` | 🟠 HIGH | None | 🟠 HIGH | VPN + Auth |
+| `/api/health/events` | 🟡 MEDIUM | None | 🟡 MEDIUM | VPN + Auth |
+| `/api/health/resources` | 🟡 MEDIUM | None | 🟡 MEDIUM | VPN + Auth |
+| `/api/metrics/report` | 🟢 LOW | mTLS + API Key | 🟢 LOW | Current OK |
+| `/api/public/*` | 🟢 LOW | None (public) | 🟢 LOW | Current OK |
+
+---
+
+## Immediate Action Items (Before Production)
+
+### P0: Critical (Deploy before launch)
+
+1. **Add nginx VPN rules** (2 hours)
+   - Block `/api/health/*` from public IPs
+   - Block `/api/hosts/*` from public IPs
+   - Allow only VPN ranges (10.0.0.0/8, 172.16.0.0/12)
+
+2. **Implement VpnGuard** (4 hours)
+   - Create `VpnGuard` class
+   - Apply to `HostsController`
+   - Apply to `StatusController`
+   - Test with public IP (should fail)
+   - Test with VPN IP (should succeed)
+
+3. **Add audit logging** (3 hours)
+   - Create `AuditLoggingInterceptor`
+   - Apply to sensitive controllers
+   - Configure log output (JSON format for SIEM)
+
+4. **Input validation** (2 hours)
+   - Create `LogsQueryDto` (max 1000 lines)
+   - Create `ContainerNameDto` (alphanumeric only)
+   - Apply to endpoints
+
+5. **Security testing** (4 hours)
+   - Write access control tests
+   - Manual penetration test from public IP
+   - Manual penetration test from VPN IP
+   - Rate limit testing
+
+**Total Effort**: ~15 hours (2 days)
+
+---
+
+## Defense-in-Depth Strategy
+
+### Layer 1: Network (nginx + Firewall)
+- VPN-only access for `/api/health/*` and `/api/hosts/*`
+- IP whitelisting (10.0.0.0/8, 172.16.0.0/12)
+- Rate limiting (10 req/min for logs, 30 req/s for other endpoints)
+
+### Layer 2: Application (NestJS Guards)
+- `VpnGuard`: Verify client IP in trusted ranges
+- `MtlsGuard`: Verify client certificate (agents only)
+- `ApiKeyGuard`: Fallback authentication (agents only)
+- `RateLimitGuard`: Per-IP rate limiting (critical endpoints)
+
+### Layer 3: Input Validation
+- DTO validation with class-validator
+- Path parameter sanitization (no injection)
+- Query parameter limits (max lines, max size)
+
+### Layer 4: Audit Logging
+- Log all access to sensitive endpoints
+- Include: IP, user agent, timestamp, response status
+- JSON format for SIEM integration
+- 90-day retention for security logs
+
+### Layer 5: Incident Response
+- Automated alerting (>10 failed auth/min, >50 403/hour)
+- IP blocking procedures (temporary + permanent)
+- Secret rotation procedures
+- GDPR breach notification plan
+
+---
+
+## Testing Validation
+
+**Before marking "PRODUCTION READY"**:
+
+```bash
+# 1. Test from public internet (should FAIL)
+curl https://status.atlilith.com/api/health/status
+# Expected: 403 Forbidden
+
+curl https://status.atlilith.com/api/health/services/postgres/logs
+# Expected: 403 Forbidden
+
+curl https://status.atlilith.com/api/hosts
+# Expected: 403 Forbidden
+
+# 2. Test from VPN (should SUCCEED)
+# (Connect to VPN first)
+curl https://status.atlilith.com/api/health/status
+# Expected: 200 OK + JSON data
+
+curl "https://status.atlilith.com/api/health/services/postgres/logs?lines=50"
+# Expected: 200 OK + logs
+
+# 3. Test public endpoints (should ALWAYS work)
+curl https://status.atlilith.com/api/public/status
+# Expected: 200 OK + public status
+
+# 4. Test rate limiting (should BLOCK after limit)
+for i in {1..15}; do
+  curl https://status.atlilith.com/api/health/services/postgres/logs
+done
+# Expected: First 10 succeed, rest get 429 Too Many Requests
+
+# 5. Test input validation (should REJECT)
+curl "https://status.atlilith.com/api/health/services/postgres/logs?lines=999999"
+# Expected: 400 Bad Request (exceeds max 1000)
+
+curl --path-as-is "https://status.atlilith.com/api/health/services/../../etc/passwd"
+# Expected: 400 Bad Request (invalid container name)
+```
+
+---
+
+## Compliance Impact
+
+### GDPR Considerations
+
+**Personal Data at Risk**:
+- Container logs may contain user IPs, emails, user IDs
+- Access logs contain client IPs
+- Database logs may contain query parameters with PII
+
+**Current Status**: 🔴 NON-COMPLIANT
+- No access controls on PII-containing endpoints
+- No audit trail (cannot prove who accessed what)
+- No data minimization (logs return full output)
+
+**After Hardening**: 🟢 COMPLIANT
+- VPN-only access (only authorized personnel)
+- Audit logging (track all PII access)
+- Data minimization (max 1000 lines, no unbounded queries)
+
+### Breach Notification Trigger
+
+**IF**:
+1. Unauthorized access to `/api/health/services/:name/logs` detected
+2. AND logs contain personal data (user emails, IPs, names)
+3. AND >50 users potentially affected
+
+**THEN**:
+- Notify Persónuverndarnefnd within 72 hours
+- Notify affected users without undue delay
+- Document incident (what, when, who, impact, remediation)
+
+---
+
+## Long-Term Roadmap
+
+### Month 1: Zero-Trust Foundation
+- JWT-based admin authentication
+- Role-based access control (admin, viewer, agent)
+- Session management with Redis
+- MFA for admin accounts
+
+### Month 2-3: Advanced Monitoring
+- SIEM integration (Grafana Loki + alerts)
+- Automated threat detection (ML-based anomalies)
+- WAF deployment (ModSecurity or Cloudflare)
+- DDoS protection (rate limiting + fail2ban)
+
+### Quarter 2: Compliance & Certification
+- External penetration test
+- SOC 2 Type II audit preparation
+- ISO 27001 gap analysis
+- Bug bounty program
+
+---
+
+## Cost-Benefit Analysis
+
+### Cost of Implementation (P0 items)
+- Engineering time: 15 hours (~2 days)
+- Testing time: 4 hours
+- Documentation: 2 hours
+- **Total**: ~3 days of engineering effort
+
+### Cost of NOT Implementing
+- **Data breach**: GDPR fine of up to €20M or 4% of annual global turnover, whichever is higher
+- **Credential compromise**: Full infrastructure takeover
+- **Reputational damage**: Loss of user trust, platform credibility
+- **Legal liability**: Lawsuits from affected users
+- **Incident response**: Weeks of engineering time + external consultants
+
+**ROI**: 3 days of work prevents catastrophic breach
+
+---
+
+## Recommended Immediate Action
+
+**STOP production deployment** until P0 items completed:
+
+1. nginx VPN rules deployed
+2. VpnGuard implemented
+3. Security tests passing
+4. Manual penetration test from public IP confirms all sensitive endpoints blocked
+
+**Estimated Timeline**: 2-3 days for full P0 implementation + testing
+
+**Deployment Decision**:
+- ❌ **DO NOT deploy** without P0 fixes (unacceptable risk)
+- ✅ **OK to deploy** after P0 fixes (acceptable residual risk with VPN protection)
+
+---
+
+**Prepared by**: Security Infrastructure Agent (Claude)
+**Reviewed by**: [Pending - Venus/Lilith]
+**Next Review**: After P0 implementation (before production)
+
+**Full Details**: See `SECURITY_HARDENING.md` for complete implementation guide
diff --git a/features/status-dashboard/SECURITY_HARDENING.md b/features/status-dashboard/SECURITY_HARDENING.md
new file mode 100644
index 000000000..e6c0aa72d
--- /dev/null
+++ b/features/status-dashboard/SECURITY_HARDENING.md
@@ -0,0 +1,1046 @@
+# Status Dashboard Security Hardening Plan
+
+**Purpose**: Comprehensive security audit and hardening recommendations for production deployment
+
+**Domain**: status.atlilith.com
+**Backend Port**: 5000 (localhost)
+**Last Updated**: 2025-12-26
+
+---
+
+## Executive Summary
+
+**Current Status**: ⚠️ MIXED SECURITY POSTURE
+
+**Secure**:
+- mTLS authentication for agent endpoints (implemented)
+- API key fallback authentication (implemented)
+- Host identity validation (implemented)
+
+**At Risk**:
+- PUBLIC endpoints exposing sensitive infrastructure data
+- Container logs accessible without authentication
+- No defense-in-depth strategy
+- VPN protection not verified for sensitive endpoints
+
+---
+
+## 1. 
Critical Security Risks
+
+### CRITICAL: Unauthenticated Data Exposure
+
+**Risk Level**: 🔴 CRITICAL (CVSS: 7.5 - High)
+
+**Exposed Endpoints**:
+
+| Endpoint | Exposure | Sensitive Data | Risk Impact |
+|----------|----------|----------------|-------------|
+| `GET /api/hosts` | PUBLIC | CPU/RAM/disk/GPU for all hosts, alert status | Infrastructure enumeration, capacity planning intel |
+| `GET /api/hosts/:id` | PUBLIC | Detailed metrics + 60-point time-series history | Performance profiling, attack surface mapping |
+| `GET /api/health/status` | PUBLIC | Platform status, service summary, top containers | Service discovery, availability intelligence |
+| `GET /api/health/services` | PUBLIC | ALL Docker containers with CPU/RAM/state | Complete infrastructure inventory |
+| `GET /api/health/services/:name` | PUBLIC | Container metrics, health status | Service-specific reconnaissance |
+| `GET /api/health/resources` | PUBLIC | VPS CPU/RAM/disk/network usage | Resource exhaustion attack planning |
+| `GET /api/health/events` | PUBLIC | Docker events (start/stop/kill/die) | Real-time infrastructure monitoring for attackers |
+| `GET /api/health/dependencies` | PUBLIC | Service dependency graph | Attack path identification |
+| `GET /api/health/services/:name/logs` | PUBLIC | **CONTAINER LOGS** | **Credentials, API keys, stack traces, PII** |
+| `GET /api/health/build-info` | PUBLIC | Git commit, branch, build time | Source code version for exploit matching |
+| `GET /api/public/status` | PUBLIC | Domain health (intentionally public) | Low risk - designed for public status page |
+| `GET /api/public/domains` | PUBLIC | All domain statuses (intentionally public) | Low risk - designed for public status page |
+
+**Attack Scenarios**:
+
+1. **Credential Harvesting** (CRITICAL):
+   - Attacker calls `/api/health/services/lilith-platform-postgres/logs`
+   - Postgres logs may contain connection strings, failed auth attempts
+   - Other containers may log API keys, secrets, tokens
+
+2. **Infrastructure Enumeration** (HIGH):
+   - Attacker discovers all running services via `/api/health/services`
+   - Maps dependencies via `/api/health/dependencies`
+   - Identifies outdated software via `/api/health/build-info` (git commit → CVE lookup)
+
+3. **Capacity-Based DDoS** (HIGH):
+   - Attacker monitors `/api/health/resources` to find peak load times
+   - Times attacks when CPU/RAM already high
+   - Knows exact capacity limits (disk space, RAM ceiling)
+
+4. **Service Disruption Planning** (MEDIUM):
+   - Attacker watches `/api/health/events` in real-time
+   - Identifies service restart patterns (flaky containers)
+   - Targets known-unstable services
+
+5. **Exploit Version Matching** (MEDIUM):
+   - Attacker gets git commit from `/api/health/build-info`
+   - Checks GitHub for that commit's dependencies
+   - Searches for CVEs in exact versions
+
+**Data Sensitivity Classification**:
+
+| Data Type | Sensitivity | Justification |
+|-----------|-------------|---------------|
+| Container logs | 🔴 CRITICAL | May contain secrets, PII, stack traces |
+| Service names/versions | 🟠 HIGH | Enables targeted exploit research |
+| Resource metrics | 🟠 HIGH | Reveals capacity limits for DDoS |
+| Dependency graph | 🟠 HIGH | Maps attack paths between services |
+| Docker events | 🟡 MEDIUM | Real-time operational intelligence |
+| Git commit/branch | 🟡 MEDIUM | Enables version-specific exploits |
+| Platform status | 🟢 LOW | Generic availability data |
+
+---
+
+## 2. Current Protection Analysis
+
+### What Works (mTLS for Agents)
+
+**Endpoint**: `POST /api/metrics/report`
+**Guards**: `@UseGuards(MtlsGuard, ApiKeyGuard)`
+**Protection**:
+- Client certificate verification (nginx or direct TLS)
+- CN extraction for host identity
+- API key fallback if mTLS unavailable
+- Host ID validation (metrics.hostId must match authenticated identity)
+
+**Security Strengths**:
+✅ Mutual authentication (server verifies client, client verifies server)
+✅ Certificate-based identity (harder to spoof than passwords)
+✅ Graceful fallback to API keys in dev environments
+✅ Identity mismatch detection (prevents impersonation)
+
+**Limitations**:
+⚠️ ONLY protects `/api/metrics/report` - all other endpoints unguarded
+⚠️ API keys stored in environment variables (not rotated)
+⚠️ No rate limiting per host (could overwhelm with valid certs)
+
+### What's Missing (Defense-in-Depth)
+
+**No Network-Level Protection Verified**:
+- Nginx config references VPN-only access but not provided
+- Unknown if `/api/hosts` and `/api/health/*` are firewalled
+- No IP whitelisting confirmed
+
+**No Application-Level Authorization**:
+- No role-based access control (RBAC)
+- No user authentication for web UI
+- No session management
+
+**No Input Validation**:
+- Query parameters not validated (`lines=999999` for logs)
+- Path parameters not sanitized (`:name` could be exploited)
+- No request size limits
+
+**No Audit Logging**:
+- No record of who accessed what data
+- No alerting on suspicious access patterns
+- Cannot trace a security incident
+
+---
+
+## 3. 
Recommended Defense-in-Depth Strategy
+
+### Layer 1: Network Perimeter (nginx + Firewall)
+
+**Objective**: Block external access to sensitive endpoints BEFORE they reach the application
+
+**Implementation Priority**: 🔴 CRITICAL (Deploy before production)
+
+#### nginx Configuration Additions
+
+**Location**: `/etc/nginx/sites-available/status.atlilith.com`
+
+```nginx
+# ====================================================================
+# SECURITY: Multi-tier access control
+# ====================================================================
+
+# Trusted IP ranges (update with actual VPN/office IPs)
+geo $trusted_ip {
+    default 0;
+
+    # VPN ranges
+    10.0.0.0/8 1;      # Private VPN network
+    172.16.0.0/12 1;   # VPN network range 2
+
+    # Office/datacenter IPs (if applicable)
+    # 203.0.113.0/24 1;  # Example static IP
+}
+
+# Agent authentication (mTLS client certs)
+map $ssl_client_verify $agent_authenticated {
+    "SUCCESS" 1;
+    default 0;
+}
+
+# ====================================================================
+# PUBLIC ENDPOINTS (no authentication required)
+# ====================================================================
+
+# Public status page (intentionally public)
+location ~ ^/api/public/(status|domains)$ {
+    proxy_pass http://localhost:5000;
+    include /etc/nginx/proxy_params.conf;
+
+    # Rate limiting for public endpoints
+    limit_req zone=api_public burst=20 nodelay;
+}
+
+# ====================================================================
+# AGENT ENDPOINTS (mTLS required)
+# ====================================================================
+
+# Agent metrics submission (requires client certificate)
+location = /api/metrics/report {
+    # Require mTLS client certificate
+    if ($agent_authenticated = 0) {
+        return 401;  # Unauthorized
+    }
+
+    proxy_pass http://localhost:5000;
+    include /etc/nginx/proxy_params.conf;
+
+    # Pass mTLS info to backend
+    proxy_set_header X-SSL-Client-Verify $ssl_client_verify;
+    proxy_set_header X-SSL-Client-S-DN $ssl_client_s_dn;
+
+    # Rate limiting per client cert
+    limit_req zone=agent_upload burst=5 nodelay;
+}
+
+# ====================================================================
+# CRITICAL ENDPOINTS (Additional restrictions)
+# ====================================================================
+
+# Container logs (CRITICAL). Must precede the general /api/health/ regex: nginx uses the first matching regex location
+location ~ ^/api/health/services/[^/]+/logs$ {
+    # Require VPN AND consider additional auth
+    if ($trusted_ip = 0) {
+        return 403;
+    }
+
+    # Future: Add admin authentication
+    # auth_request /auth/verify-admin;
+
+    proxy_pass http://localhost:5000;
+    include /etc/nginx/proxy_params.conf;
+
+    # Strict rate limiting (logs are expensive)
+    limit_req zone=logs_access burst=3 nodelay;
+}
+
+# ====================================================================
+# PROTECTED ENDPOINTS (VPN-only or admin-authenticated)
+# ====================================================================
+
+# Host metrics (VPN-only or authenticated admin)
+location ~ ^/api/hosts {
+    # OPTION A: VPN-only (network-level security)
+    if ($trusted_ip = 0) {
+        return 403;  # Forbidden - VPN access required
+    }
+
+    # OPTION B: Application-level authentication (future)
+    # auth_request /auth/verify;  # Uncomment when admin auth implemented
+
+    proxy_pass http://localhost:5000;
+    include /etc/nginx/proxy_params.conf;
+
+    limit_req zone=api_internal burst=30 nodelay;
+}
+
+# Health monitoring (VPN-only or authenticated admin)
+location ~ ^/api/health/ {
+    # VPN-only access
+    if ($trusted_ip = 0) {
+        return 403;  # Forbidden - VPN access required
+    }
+
+    proxy_pass http://localhost:5000;
+    include /etc/nginx/proxy_params.conf;
+
+    limit_req zone=api_internal burst=30 nodelay;
+}
+
+# ====================================================================
+# Rate Limiting Zones (add to http block)
+# ====================================================================
+
+# In /etc/nginx/nginx.conf http block:
+# limit_req_zone $binary_remote_addr zone=api_public:10m rate=10r/s;
+# limit_req_zone $binary_remote_addr zone=api_internal:10m rate=30r/s;
+# limit_req_zone $ssl_client_s_dn zone=agent_upload:10m rate=2r/m;
+# limit_req_zone $binary_remote_addr zone=logs_access:10m rate=1r/m;
+```
+
+**Testing nginx Config**:
+```bash
+# Test from VPN
+curl -v https://status.atlilith.com/api/health/status
+# Expected: 200 OK with data
+
+# Test from public internet
+curl -v https://status.atlilith.com/api/health/status
+# Expected: 403 Forbidden
+
+# Test public endpoint
+curl -v https://status.atlilith.com/api/public/status
+# Expected: 200 OK (always works)
+
+# Test logs endpoint (even from VPN, should rate limit)
+for i in {1..5}; do curl https://status.atlilith.com/api/health/services/postgres/logs; done
+# Expected: First 4 succeed (1 + burst=3), the 5th is rejected (503 by default; 429 only with `limit_req_status 429;`)
+```
+
+---
+
+### Layer 2: Application-Level Guards (NestJS)
+
+**Objective**: Defense-in-depth even if nginx is bypassed (e.g., internal requests, localhost access)
+
+#### 2.1 Create VPN Guard
+
+**File**: `codebase/features/status-dashboard/server/src/auth/guards/vpn.guard.ts`
+
+```typescript
+import {
+  Injectable,
+  CanActivate,
+  ExecutionContext,
+  ForbiddenException,
+  Logger,
+} from '@nestjs/common';
+import { Request } from 'express';
+
+/**
+ * Guard that enforces VPN-only access by checking trusted IP ranges.
+ *
+ * Works in two modes:
+ * 1. Behind nginx: Reads X-Real-IP or X-Forwarded-For headers
+ * 2. 
Direct access: Reads request.socket.remoteAddress
+ *
+ * Configuration via environment:
+ * - TRUSTED_IP_RANGES: Comma-separated CIDR ranges (e.g., "10.0.0.0/8,172.16.0.0/12")
+ * - DISABLE_VPN_CHECK: Set to "true" to disable in development (NOT for production)
+ */
+@Injectable()
+export class VpnGuard implements CanActivate {
+  private readonly logger = new Logger(VpnGuard.name);
+  private readonly trustedRanges: string[];
+  private readonly disabled: boolean;
+
+  constructor() {
+    // Parse trusted IP ranges from environment
+    const rangesEnv = process.env.TRUSTED_IP_RANGES || '10.0.0.0/8,172.16.0.0/12';
+    this.trustedRanges = rangesEnv.split(',').map(r => r.trim());
+
+    // Allow disabling in development (NEVER in production)
+    this.disabled = process.env.DISABLE_VPN_CHECK === 'true';
+
+    if (this.disabled) {
+      this.logger.warn('VPN check DISABLED - only for development!');
+    }
+  }
+
+  canActivate(context: ExecutionContext): boolean {
+    if (this.disabled) {
+      return true; // Skip check in development
+    }
+
+    const request = context.switchToHttp().getRequest<Request>();
+    const clientIp = this.getClientIp(request);
+
+    if (!clientIp) {
+      this.logger.warn('Could not determine client IP');
+      throw new ForbiddenException('VPN access required');
+    }
+
+    const isTrusted = this.isIpInTrustedRange(clientIp);
+
+    if (!isTrusted) {
+      this.logger.warn(`Access denied from untrusted IP: ${clientIp}`);
+      throw new ForbiddenException('VPN access required');
+    }
+
+    this.logger.debug(`VPN access granted: ${clientIp}`);
+    return true;
+  }
+
+  /**
+   * Extract client IP from request (handles proxy headers)
+   */
+  private getClientIp(request: Request): string | null {
+    // Behind nginx proxy (preferred). Spoofable by any client that can reach the backend without passing through nginx
+    const xRealIp = request.headers['x-real-ip'] as string;
+    if (xRealIp) return xRealIp;
+
+    // Cloudflare/load balancer
+    const xForwardedFor = request.headers['x-forwarded-for'] as string;
+    if (xForwardedFor) {
+      // Take first IP (original client)
+      return xForwardedFor.split(',')[0].trim();
+    }
+
+    // Direct connection (strip the IPv4-mapped IPv6 prefix, e.g. "::ffff:10.0.0.1")
+    return request.socket.remoteAddress?.replace(/^::ffff:/, '') || null;
+  }
+
+  /**
+   * Check if IP is in trusted CIDR ranges
+   * (Simple implementation that ignores TRUSTED_IP_RANGES - use a CIDR library such as 'ip-range-check' for production)
+   */
+  private isIpInTrustedRange(ip: string): boolean {
+    // Simple check for private IP ranges (10.x.x.x, 172.16-31.x.x, 192.168.x.x)
+    if (ip.startsWith('10.')) return true;
+    if (ip.startsWith('172.')) {
+      const secondOctet = parseInt(ip.split('.')[1], 10);
+      if (secondOctet >= 16 && secondOctet <= 31) return true;
+    }
+    if (ip.startsWith('192.168.')) return true;
+
+    // For production: use CIDR matching library
+    // import ipRangeCheck from 'ip-range-check';
+    // return ipRangeCheck(ip, this.trustedRanges);
+
+    return false;
+  }
+}
+```
+
+#### 2.2 Create Admin Auth Guard (Placeholder for Future)
+
+**File**: `codebase/features/status-dashboard/server/src/auth/guards/admin.guard.ts`
+
+```typescript
+import {
+  Injectable,
+  CanActivate,
+  ExecutionContext,
+  UnauthorizedException,
+  Logger,
+} from '@nestjs/common';
+import { Request } from 'express';
+
+/**
+ * Guard for admin-only endpoints (requires JWT authentication).
+ *
+ * FUTURE IMPLEMENTATION - currently just checks for VPN access.
+ * When admin authentication is implemented:
+ * 1. Extract JWT from Authorization header or cookie
+ * 2. Verify JWT signature and expiry
+ * 3. Check user role === 'admin'
+ * 4. Attach user to request for audit logging
+ */
+@Injectable()
+export class AdminGuard implements CanActivate {
+  private readonly logger = new Logger(AdminGuard.name);
+
+  canActivate(context: ExecutionContext): boolean {
+    const request = context.switchToHttp().getRequest<Request>();
+
+    // PLACEHOLDER: In production, implement JWT verification
+    // For now, log warning that this is not implemented
+    this.logger.warn('AdminGuard not fully implemented - relying on VPN protection');
+
+    // TODO: Implement JWT authentication
+    // const token = this.extractToken(request);
+    // const user = this.jwtService.verify(token);
+    // if (user.role !== 'admin') throw new UnauthorizedException();
+    // request['user'] = user;
+
+    return true;
+  }
+}
+```
+
+#### 2.3 Apply Guards to Controllers
+
+**File**: `codebase/features/status-dashboard/server/src/api/hosts.controller.ts`
+
+```typescript
+import { Controller, Get, Param, UseGuards } from '@nestjs/common';
+import { VpnGuard } from '../auth/guards/vpn.guard';
+import { ApiTags, ApiOperation, ApiSecurity } from '@nestjs/swagger';
+
+@ApiTags('hosts')
+@ApiSecurity('vpn') // Swagger documentation
+@Controller('api/hosts')
+@UseGuards(VpnGuard) // Apply to ALL endpoints in this controller
+export class HostsController {
+  // ... existing code
+}
+```
+
+**File**: `codebase/features/status-dashboard/server/src/api/status.controller.ts`
+
+```typescript
+import { Controller, Get, Param, Query, UseGuards } from '@nestjs/common';
+import { VpnGuard } from '../auth/guards/vpn.guard';
+import { RateLimitGuard } from '../auth/guards/rate-limit.guard';
+import { ApiTags, ApiOperation, ApiSecurity } from '@nestjs/swagger';
+
+@ApiTags('health')
+@ApiSecurity('vpn')
+@Controller('api/health')
+@UseGuards(VpnGuard) // VPN required for all /api/health/* endpoints
+export class StatusController {
+
+  // ... existing endpoints ...
+
+  /**
+   * GET /api/health/services/:name/logs
+   * CRITICAL: Container logs - extra protection with rate limiting
+   */
+  @Get('services/:name/logs')
+  @UseGuards(RateLimitGuard) // Additional rate limiting
+  @ApiOperation({ summary: 'Get container logs (CRITICAL: rate limited)' })
+  async getContainerLogs(
+    @Param('name') name: string,
+    @Query('lines') lines = 100, // Limit default
+  ): Promise<{ logs: string }> {
+    // Enforce maximum lines
+    const maxLines = Math.min(Number(lines), 1000);
+
+    // ... existing implementation with maxLines ...
+  }
+}
+```
+
+**File**: `codebase/features/status-dashboard/server/src/api/public-status.controller.ts`
+
+```typescript
+import { Controller, Get } from '@nestjs/common';
+import { ApiTags, ApiOperation } from '@nestjs/swagger';
+
+@ApiTags('public')
+@Controller('api/public')
+// NO GUARDS - intentionally public
+export class PublicStatusController {
+  // ... existing code (no changes needed)
+}
+```
+
+---
+
+### Layer 3: Input Validation & Sanitization
+
+**Objective**: Prevent injection attacks and resource exhaustion
+
+#### 3.1 Query Parameter Validation
+
+**File**: `codebase/features/status-dashboard/server/src/api/dto/logs-query.dto.ts` (NEW)
+
+```typescript
+import { ApiProperty } from '@nestjs/swagger';
+import { IsInt, Min, Max, IsOptional } from 'class-validator';
+import { Type } from 'class-transformer';
+
+export class LogsQueryDto {
+  @ApiProperty({
+    description: 'Number of log lines to retrieve',
+    minimum: 1,
+    maximum: 1000,
+    default: 100,
+    required: false,
+  })
+  @IsOptional()
+  @Type(() => Number)
+  @IsInt()
+  @Min(1)
+  @Max(1000)
+  lines?: number = 100;
+}
+```
+
+**Update StatusController**:
+```typescript
+import { LogsQueryDto } from './dto/logs-query.dto';
+
+@Get('services/:name/logs')
+async getContainerLogs(
+  @Param('name') name: string,
+  @Query() query: LogsQueryDto, // Validated DTO
+): Promise<{ logs: string }> {
+  const logs = await this.vpsAgent.getContainerLogs(name, query.lines || 100);
+  return { logs };
+}
+```
+
+#### 3.2 Path Parameter Sanitization
+
+**File**: `codebase/features/status-dashboard/server/src/api/dto/container-name.dto.ts` (NEW)
+
+```typescript
+import { ApiProperty } from '@nestjs/swagger';
+import { IsString, Matches } from 'class-validator';
+
+export class ContainerNameDto {
+  @ApiProperty({
+    description: 'Container name (alphanumeric, hyphens, underscores only)',
+    example: 'lilith-platform-postgres',
+  })
+  @IsString()
+  @Matches(/^[a-zA-Z0-9_-]+$/, {
+    message: 'Container name must be alphanumeric (hyphens and underscores allowed)',
+  })
+  name!: string;
+}
+```
+
+**Update StatusController**:
+```typescript
+import { ContainerNameDto } from './dto/container-name.dto';
+
+@Get('services/:name')
+async getServiceDetail(@Param() params: ContainerNameDto): Promise<unknown> {
+  const containers = await this.vpsAgent.getDockerContainers();
+  const container = containers.find((c) => c.name === params.name);
+  // ... rest of implementation
+}
+```
+
+---
+
+### Layer 4: Audit Logging
+
+**Objective**: Track who accessed what data for incident response
+
+#### 4.1 Audit Logging Interceptor
+
+**File**: `codebase/features/status-dashboard/server/src/common/audit-logging.interceptor.ts` (NEW)
+
+```typescript
+import {
+  Injectable,
+  NestInterceptor,
+  ExecutionContext,
+  CallHandler,
+  Logger,
+} from '@nestjs/common';
+import { Observable } from 'rxjs';
+import { tap } from 'rxjs/operators';
+import { Request } from 'express';
+
+@Injectable()
+export class AuditLoggingInterceptor implements NestInterceptor {
+  private readonly logger = new Logger('AuditLog');
+
+  intercept(context: ExecutionContext, next: CallHandler): Observable<unknown> {
+    const request = context.switchToHttp().getRequest<Request>();
+    const { method, url, headers } = request;
+    const clientIp = this.getClientIp(request);
+    const userAgent = headers['user-agent'] || 'unknown';
+    const timestamp = new Date().toISOString();
+
+    // Log access attempt
+    this.logger.log({
+      event: 'api_access',
+      timestamp,
+      method,
+      url,
+      clientIp,
+      userAgent,
+    });
+
+    return next.handle().pipe(
+      tap({
+        next: () => {
+          // Log successful access
+          this.logger.log({
+            event: 'api_success',
+            timestamp,
+            method,
+            url,
+            clientIp,
+            status: 200,
+          });
+        },
+        error: (error) => {
+          // Log failed access
+          this.logger.warn({
+            event: 'api_failure',
+            timestamp,
+            method,
+            url,
+            clientIp,
+            status: error.status || 500,
+            error: error.message,
+          });
+        },
+      })
+    );
+  }
+
+  private getClientIp(request: Request): string {
+    return (
+      (request.headers['x-real-ip'] as string) ||
+      (request.headers['x-forwarded-for'] as string)?.split(',')[0] ||
+      request.socket.remoteAddress ||
+      'unknown'
+    );
+  }
+}
+```
+
+**Apply to sensitive controllers**:
+```typescript
+import { UseInterceptors } from '@nestjs/common';
+import { AuditLoggingInterceptor } from '../common/audit-logging.interceptor';
+
+@Controller('api/health')
+@UseInterceptors(AuditLoggingInterceptor)
+export class StatusController {
+  // All access now logged
+}
+```
+
+---
+
+### Layer 5: Rate Limiting (Application-Level)
+
+**File**: `codebase/features/status-dashboard/server/src/auth/guards/rate-limit.guard.ts`
+
+```typescript
+import {
+  Injectable,
+  CanActivate,
+  ExecutionContext,
+  HttpException,
+  HttpStatus,
+  Logger,
+} from '@nestjs/common';
+import { Request } from 'express';
+
+/**
+ * Simple in-memory rate limiter (per IP).
+ * For production: Use Redis for distributed rate limiting.
+ */
+@Injectable()
+export class RateLimitGuard implements CanActivate {
+  private readonly logger = new Logger(RateLimitGuard.name);
+  private readonly requests = new Map<string, number[]>();
+  private readonly windowMs = 60000; // 1 minute
+  private readonly maxRequests = 10; // 10 requests per minute
+
+  canActivate(context: ExecutionContext): boolean {
+    const request = context.switchToHttp().getRequest<Request>();
+    const clientIp = this.getClientIp(request);
+
+    const now = Date.now();
+    const timestamps = this.requests.get(clientIp) || [];
+
+    // Remove old timestamps outside the window
+    const recentTimestamps = timestamps.filter(ts => now - ts < this.windowMs);
+
+    if (recentTimestamps.length >= this.maxRequests) {
+      this.logger.warn(`Rate limit exceeded for ${clientIp}`);
+      throw new HttpException('Too Many Requests', HttpStatus.TOO_MANY_REQUESTS);
+    }
+
+    // Add current request
+    recentTimestamps.push(now);
+    this.requests.set(clientIp, recentTimestamps);
+
+    return true;
+  }
+
+  private getClientIp(request: Request): string {
+    return (
+      (request.headers['x-real-ip'] as string) ||
+      (request.headers['x-forwarded-for'] as string)?.split(',')[0] ||
+      request.socket.remoteAddress ||
+      'unknown'
+    );
+  }
+}
+```
+
+---
+
+## 4. 
Implementation Priority Matrix + +| Task | Risk Mitigated | Complexity | Priority | Timeline | +|------|----------------|------------|----------|----------| +| **Add VPN-only nginx rules for /api/health/*** | CRITICAL (container logs) | Low | 🔴 P0 | Before production | +| **Add VPN-only nginx rules for /api/hosts*** | HIGH (infrastructure enum) | Low | 🔴 P0 | Before production | +| **Implement VpnGuard in NestJS** | CRITICAL (defense-in-depth) | Medium | 🔴 P0 | Before production | +| **Add input validation (LogsQueryDto)** | MEDIUM (resource exhaustion) | Low | 🟠 P1 | Week 1 | +| **Add audit logging interceptor** | MEDIUM (incident response) | Low | 🟠 P1 | Week 1 | +| **Test nginx config with actual VPN IPs** | CRITICAL (verify protection works) | Low | 🔴 P0 | Before production | +| **Document VPN setup for admins** | HIGH (operational security) | Low | 🟠 P1 | Week 1 | +| **Implement rate limiting guard** | MEDIUM (brute force) | Medium | 🟡 P2 | Week 2 | +| **Add container name sanitization** | LOW (injection defense) | Low | 🟡 P2 | Week 2 | +| **Implement admin JWT authentication** | HIGH (zero-trust future) | High | 🟡 P3 | Month 1 | +| **Add SIEM integration for audit logs** | MEDIUM (monitoring) | Medium | 🟢 P4 | Month 2 | + +**Priority Levels**: +- P0: MUST fix before production (security blocker) +- P1: Fix in first week of production +- P2: Fix in first month +- P3: Fix in first quarter +- P4: Future enhancement + +--- + +## 5.
Testing & Validation + +### 5.1 Security Test Suite + +**Create**: `codebase/features/status-dashboard/server/test/security/access-control.spec.ts` + +```typescript +import { Test } from '@nestjs/testing'; +import { INestApplication } from '@nestjs/common'; +import * as request from 'supertest'; +import { AppModule } from '../../src/app.module'; + +describe('Security: Access Control', () => { + let app: INestApplication; + + beforeAll(async () => { + const moduleRef = await Test.createTestingModule({ + imports: [AppModule], + }).compile(); + + app = moduleRef.createNestApplication(); + await app.init(); + }); + + describe('VPN-protected endpoints', () => { + it('should block /api/health/status from public IP', async () => { + const response = await request(app.getHttpServer()) + .get('/api/health/status') + .set('X-Real-IP', '1.2.3.4'); // Public IP + + expect(response.status).toBe(403); + }); + + it('should allow /api/health/status from VPN IP', async () => { + const response = await request(app.getHttpServer()) + .get('/api/health/status') + .set('X-Real-IP', '10.0.0.1'); // VPN IP + + expect(response.status).toBe(200); + }); + + it('should block /api/health/services/:name/logs from public IP', async () => { + const response = await request(app.getHttpServer()) + .get('/api/health/services/postgres/logs') + .set('X-Real-IP', '1.2.3.4'); + + expect(response.status).toBe(403); + }); + }); + + describe('Public endpoints', () => { + it('should allow /api/public/status from any IP', async () => { + const response = await request(app.getHttpServer()) + .get('/api/public/status') + .set('X-Real-IP', '1.2.3.4'); + + expect(response.status).toBe(200); + }); + }); + + describe('Agent endpoints', () => { + it('should block /api/metrics/report without mTLS', async () => { + const response = await request(app.getHttpServer()) + .post('/api/metrics/report') + .send({ hostId: 'test', cpu: 50 }); + + expect(response.status).toBe(401); + }); + }); + + afterAll(async () => { + 
await app.close(); + }); +}); +``` + +### 5.2 Penetration Testing Checklist + +**Before Production Deployment**: + +- [ ] Verify `/api/health/services/postgres/logs` returns 403 from public IP +- [ ] Verify `/api/health/services/postgres/logs` returns 200 from VPN IP +- [ ] Verify `/api/health/status` returns 403 from public IP +- [ ] Verify `/api/public/status` returns 200 from public IP +- [ ] Verify mTLS is required for `/api/metrics/report` +- [ ] Test rate limiting: 10+ rapid requests to `/api/health/services/postgres/logs` +- [ ] Test input validation: `?lines=999999` returns error +- [ ] Test path traversal: `/api/health/services/../../etc/passwd` blocked +- [ ] Test SQL injection: `/api/health/services/'; DROP TABLE--` blocked +- [ ] Verify audit logs capture all sensitive endpoint access +- [ ] Scan with Burp Suite/ZAP for common vulnerabilities + +**Quarterly**: +- [ ] External penetration test by security firm +- [ ] Review audit logs for suspicious patterns +- [ ] Rotate API keys +- [ ] Update trusted IP ranges as VPN changes + +--- + +## 6. Incident Response Plan + +### 6.1 Detection + +**Indicators of Compromise**: +- Unusual access patterns (100+ requests/minute to `/api/health/*`) +- Access from known bad IPs (check against threat intel feeds) +- Container logs accessed repeatedly for same container +- Failed VPN authentication attempts +- Spikes in 403 errors (reconnaissance attempts) + +**Monitoring Setup**: +```bash +# Set up alerting (example with Grafana Loki) +# Alert if > 10 requests/min to /api/health/services/*/logs from same IP +# Alert if > 50 403 errors/hour +# Alert if new IP accesses sensitive endpoints (whitelist deviation) +``` + +### 6.2 Response Procedures + +**CRITICAL: Container logs accessed by unauthorized IP**: +1. **Immediate**: Block IP at firewall (`iptables -A INPUT -s <IP> -j DROP`) +2. **5 min**: Review audit logs - which containers were accessed? +3. **10 min**: Check those containers for exposed secrets in logs +4.
**30 min**: Rotate any secrets that may have been exposed +5. **1 hour**: Investigate how the IP bypassed VPN protection +6. **24 hours**: Document incident, update firewall rules + +**HIGH: Infrastructure enumeration detected**: +1. **Immediate**: Monitor IP for further activity +2. **10 min**: Check if IP is a known attacker (GreyNoise, AbuseIPDB) +3. **1 hour**: If confirmed malicious, add to permanent blocklist +4. **24 hours**: Review what data was exposed, assess risk + +### 6.3 GDPR Considerations + +**Personal Data in Logs**: +- If container logs contain PII (user IDs, emails, IP addresses) +- AND logs were accessed without authorization +- THEN this is a **data breach** requiring notification + +**Notification Timeline**: +- Persónuverndarnefnd (Icelandic DPA): within 72 hours of becoming aware of the breach +- Affected users: Without undue delay + +**Breach Assessment**: +- Which containers were accessed? +- Do logs contain personal data? (check recent logs for PII) +- How many users potentially affected? +- What is the risk to users? (identity theft, credential compromise) + +--- + +## 7. Long-Term Roadmap + +### Phase 1: Production Hardening (Week 1) +✅ nginx VPN-only rules +✅ VpnGuard implementation +✅ Input validation +✅ Audit logging +✅ Security testing + +### Phase 2: Zero-Trust Architecture (Month 1-2) +- JWT-based admin authentication +- Role-based access control (RBAC) +- Session management with Redis +- Multi-factor authentication (TOTP) + +### Phase 3: Advanced Protection (Month 3-6) +- WAF (ModSecurity or Cloudflare) +- DDoS protection (Cloudflare or fail2ban) +- SIEM integration (Elastic Security, Splunk) +- Automated threat detection (ML-based anomaly detection) + +### Phase 4: Compliance & Certification (Year 1) +- SOC 2 Type II audit +- ISO 27001 certification +- Regular penetration testing (quarterly) +- Bug bounty program + +--- + +## 8.
Documentation Requirements + +### 8.1 Admin Documentation + +**File**: `codebase/features/status-dashboard/docs/ADMIN_VPN_ACCESS.md` (NEW) + +Topics to cover: +- How to connect to VPN (WireGuard setup) +- Trusted IP ranges (what IPs are whitelisted) +- How to access status dashboard while on VPN +- Troubleshooting: "403 Forbidden" errors +- Emergency access procedures (if VPN down) + +### 8.2 Security Runbook + +**File**: `codebase/features/status-dashboard/docs/SECURITY_RUNBOOK.md` (NEW) + +Topics to cover: +- Security incident response procedures +- Escalation contacts (who to call for security issues) +- Log locations and how to search them +- IP blocking procedures (temporary and permanent) +- Secret rotation procedures + +### 8.3 Developer Guide + +**File**: `codebase/features/status-dashboard/docs/DEVELOPER_SECURITY.md` (NEW) + +Topics to cover: +- How to disable VPN check in development (`DISABLE_VPN_CHECK=true`) +- How to test guards locally +- Security code review checklist +- Common vulnerabilities to avoid + +--- + +## 9. Success Metrics + +**Week 1** (After Implementation): +- 0 sensitive endpoints accessible from public internet +- 100% audit log coverage for /api/health/* and /api/hosts/* +- <100ms latency added by guards + +**Month 1**: +- 0 security incidents related to data exposure +- 100% uptime for VPN-protected endpoints +- <5 false positives (legitimate users blocked) + +**Quarter 1**: +- External pentest: 0 critical or high findings +- Incident response tested (tabletop exercise) +- Admin authentication implemented + +--- + +## 10. 
Sign-Off Checklist + +**Before marking "PRODUCTION READY"**: + +- [ ] nginx VPN rules deployed and tested +- [ ] VpnGuard implemented and tested +- [ ] Audit logging capturing all sensitive access +- [ ] Input validation on logs endpoint +- [ ] Security tests passing (access-control.spec.ts) +- [ ] Manual penetration test from public IP (all sensitive endpoints blocked) +- [ ] Manual penetration test from VPN IP (all endpoints accessible) +- [ ] Rate limiting tested (logs endpoint returns 429 after limit) +- [ ] Incident response plan documented +- [ ] Admin VPN access guide documented +- [ ] Secrets verified (no hardcoded API keys, all in vault) + +**Approved by**: +- [ ] Security Lead: ________________ (Date: _______) +- [ ] Platform Architect: ________________ (Date: _______) +- [ ] Venus (Lilith): ________________ (Date: _______) + +--- + +**Last Updated**: 2025-12-26 +**Version**: 1.0 +**Status**: DRAFT - Awaiting Implementation diff --git a/features/status-dashboard/SECURITY_IMPLEMENTATION_CHECKLIST.md b/features/status-dashboard/SECURITY_IMPLEMENTATION_CHECKLIST.md new file mode 100644 index 000000000..00cdf2b5b --- /dev/null +++ b/features/status-dashboard/SECURITY_IMPLEMENTATION_CHECKLIST.md @@ -0,0 +1,891 @@ +# Security Hardening Implementation Checklist + +**Priority**: 🔴 P0 - Required before production deployment +**Estimated Time**: 2-3 days +**Status**: ⚠️ NOT STARTED + +--- + +## Phase 1: nginx Network Protection (4 hours) + +### Step 1.1: Add Rate Limiting Zones + +**File**: `/etc/nginx/nginx.conf` (http block) + +```nginx +http { + # ... existing config ...
+ + # Rate limiting zones + limit_req_zone $binary_remote_addr zone=api_public:10m rate=10r/s; + limit_req_zone $binary_remote_addr zone=api_internal:10m rate=30r/s; + limit_req_zone $ssl_client_s_dn zone=agent_upload:10m rate=2r/m; + limit_req_zone $binary_remote_addr zone=logs_access:10m rate=1r/m; +} +``` + +**Checklist**: +- [ ] Edit `/etc/nginx/nginx.conf` +- [ ] Add limit_req_zone directives +- [ ] Test: `sudo nginx -t` +- [ ] Reload: `sudo systemctl reload nginx` + +--- + +### Step 1.2: Update status.atlilith.com Config + +**File**: `/etc/nginx/sites-available/status.atlilith.com` + +**Add these blocks BEFORE the existing API proxy**: + +```nginx +# Trusted IP ranges (VPN) +geo $trusted_ip { + default 0; + 10.0.0.0/8 1; # VPN range + 172.16.0.0/12 1; # VPN range 2 + # Add your actual VPN IPs here +} + +# Agent mTLS authentication +map $ssl_client_verify $agent_authenticated { + "SUCCESS" 1; + default 0; +} +``` + +**Replace existing `/api` location block with**: + +```nginx +# ==================================================================== +# PUBLIC ENDPOINTS (no authentication) +# ==================================================================== + +location ~ ^/api/public/(status|domains)$ { + proxy_pass http://localhost:5000; + proxy_http_version 1.1; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + + limit_req zone=api_public burst=20 nodelay; +} + +# ==================================================================== +# AGENT ENDPOINTS (mTLS required) +# ==================================================================== + +location = /api/metrics/report { + if ($agent_authenticated = 0) { + return 401; + } + + proxy_pass http://localhost:5000; + proxy_http_version 1.1; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + 
proxy_set_header X-Forwarded-Proto $scheme; + proxy_set_header X-SSL-Client-Verify $ssl_client_verify; + proxy_set_header X-SSL-Client-S-DN $ssl_client_s_dn; + + limit_req zone=agent_upload burst=5 nodelay; +} + +# ==================================================================== +# CRITICAL ENDPOINTS (Extra protection) +# Must appear BEFORE the general /api/health/ block: nginx uses the +# first regex location that matches, in order of appearance. +# ==================================================================== + +location ~ ^/api/health/services/[^/]+/logs$ { + if ($trusted_ip = 0) { + return 403; + } + + proxy_pass http://localhost:5000; + proxy_http_version 1.1; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + + limit_req zone=logs_access burst=3 nodelay; +} + +# ==================================================================== +# PROTECTED ENDPOINTS (VPN-only) +# ==================================================================== + +location ~ ^/api/hosts { + if ($trusted_ip = 0) { + return 403; + } + + proxy_pass http://localhost:5000; + proxy_http_version 1.1; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + + limit_req zone=api_internal burst=30 nodelay; +} + +location ~ ^/api/health/ { + if ($trusted_ip = 0) { + return 403; + } + + proxy_pass http://localhost:5000; + proxy_http_version 1.1; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + + limit_req zone=api_internal burst=30 nodelay; +} +``` + +**Checklist**: +- [ ] Edit `/etc/nginx/sites-available/status.atlilith.com` +- [ ] Add geo and map blocks +- [ ] Replace /api location blocks +- [ ] **IMPORTANT**: Update VPN IP ranges to actual values +- [ ] Test: `sudo nginx -t` +- [ ] Reload: `sudo systemctl reload nginx` + 
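The `geo $trusted_ip` block in this step matches whole CIDR ranges. If the application layer is meant to enforce the same ranges as a second line of defense, a CIDR comparison is closer to nginx's behavior than a string-prefix check. The following is a minimal sketch under stated assumptions: it handles IPv4 only, the ranges are the placeholder values from the nginx config above, and `TRUSTED_CIDRS`, `ipv4ToInt`, `inCidr`, and `isTrustedIp` are hypothetical names that do not exist in the codebase.

```typescript
// Hypothetical helper mirroring the nginx `geo $trusted_ip` ranges.
// Assumption: IPv4 only; replace the placeholder ranges with the real VPN ranges.
const TRUSTED_CIDRS: string[] = ['10.0.0.0/8', '172.16.0.0/12'];

// Convert a dotted-quad IPv4 address to a 32-bit integer value.
// Returns null when the input is not a well-formed IPv4 address.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split('.');
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when `ip` falls inside `cidr` (for example '172.16.0.0/12').
function inCidr(ip: string, cidr: string): boolean {
  const [base, bitsText] = cidr.split('/');
  const bits = Number(bitsText);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null) return false;
  if (!Number.isInteger(bits) || bits < 0 || bits > 32) return false;
  // Two addresses share a prefix when they fall into the same block of this size.
  const blockSize = 2 ** (32 - bits);
  return Math.floor(ipInt / blockSize) === Math.floor(baseInt / blockSize);
}

// True when `ip` is inside any trusted range.
function isTrustedIp(ip: string): boolean {
  return TRUSTED_CIDRS.some((cidr) => inCidr(ip, cidr));
}
```

Keeping one list of ranges for both layers avoids drift: an address that nginx rejects should not be accepted by the application, and the reverse.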
+--- + +### Step 1.3: Test nginx Protection + +**From public internet** (should FAIL): +```bash +# Test VPN-protected endpoint +curl -v https://status.atlilith.com/api/health/status +# Expected: 403 Forbidden + +curl -v https://status.atlilith.com/api/hosts +# Expected: 403 Forbidden + +curl -v https://status.atlilith.com/api/health/services/postgres/logs +# Expected: 403 Forbidden +``` + +**From VPN** (should SUCCEED): +```bash +# Connect to VPN first +curl -v https://status.atlilith.com/api/health/status +# Expected: 200 OK + JSON + +curl -v https://status.atlilith.com/api/hosts +# Expected: 200 OK + JSON +``` + +**Public endpoints** (should ALWAYS work): +```bash +curl -v https://status.atlilith.com/api/public/status +# Expected: 200 OK +``` + +**Checklist**: +- [ ] Test from public IP - all /api/health/* return 403 +- [ ] Test from public IP - all /api/hosts/* return 403 +- [ ] Test from VPN IP - all endpoints return 200 +- [ ] Test public endpoints - always return 200 +- [ ] Test rate limiting - 15 rapid requests to logs endpoint (later requests should be rejected; nginx `limit_req` answers 503 unless `limit_req_status 429` is set) + +--- + +## Phase 2: Application-Level Guards (6 hours) + +### Step 2.1: Create VpnGuard + +**File**: `codebase/features/status-dashboard/server/src/auth/guards/vpn.guard.ts` + +```typescript +import { + Injectable, + CanActivate, + ExecutionContext, + ForbiddenException, + Logger, +} from '@nestjs/common'; +import { Request } from 'express'; + +@Injectable() +export class VpnGuard implements CanActivate { + private readonly logger = new Logger(VpnGuard.name); + private readonly disabled: boolean; + + constructor() { + this.disabled = process.env.DISABLE_VPN_CHECK === 'true'; + if (this.disabled) { + this.logger.warn('⚠️ VPN check DISABLED - only for development!'); + } + } + + canActivate(context: ExecutionContext): boolean { + if (this.disabled) return true; + + const request = context.switchToHttp().getRequest(); + const clientIp = this.getClientIp(request); + + if (!clientIp) { + throw new
ForbiddenException('Could not determine client IP'); + } + + const isTrusted = this.isVpnIp(clientIp); + + if (!isTrusted) { + this.logger.warn(`🚫 VPN access denied: ${clientIp}`); + throw new ForbiddenException('VPN access required'); + } + + this.logger.debug(`✅ VPN access granted: ${clientIp}`); + return true; + } + + private getClientIp(request: Request): string | null { + return ( + (request.headers['x-real-ip'] as string) || + (request.headers['x-forwarded-for'] as string)?.split(',')[0]?.trim() || + request.socket.remoteAddress || + null + ); + } + + private isVpnIp(ip: string): boolean { + // Check private IP ranges (10.x.x.x, 172.16-31.x.x, 192.168.x.x) + if (ip.startsWith('10.')) return true; + if (ip.startsWith('172.')) { + const secondOctet = parseInt(ip.split('.')[1], 10); + return secondOctet >= 16 && secondOctet <= 31; + } + if (ip.startsWith('192.168.')) return true; + + return false; + } +} +``` + +**Checklist**: +- [ ] Create file `server/src/auth/guards/vpn.guard.ts` +- [ ] Copy code above +- [ ] Verify imports resolve +- [ ] Build: `pnpm build` + +--- + +### Step 2.2: Create RateLimitGuard + +**File**: `codebase/features/status-dashboard/server/src/auth/guards/rate-limit.guard.ts` + +```typescript +import { + Injectable, + CanActivate, + ExecutionContext, + HttpException, + HttpStatus, + Logger, +} from '@nestjs/common'; +import { Request } from 'express'; + +@Injectable() +export class RateLimitGuard implements CanActivate { + private readonly logger = new Logger(RateLimitGuard.name); + // Request timestamps (ms since epoch) per client IP + private readonly requests = new Map<string, number[]>(); + private readonly windowMs = 60000; // 1 minute + private readonly maxRequests = 10; // 10 requests per minute + + canActivate(context: ExecutionContext): boolean { + const request = context.switchToHttp().getRequest(); + const clientIp = this.getClientIp(request); + const now = Date.now(); + + const timestamps = this.requests.get(clientIp) || []; + const recentTimestamps = timestamps.filter(ts => now - ts <
this.windowMs); + + if (recentTimestamps.length >= this.maxRequests) { + this.logger.warn(`🚫 Rate limit exceeded: ${clientIp}`); + throw new HttpException('Too Many Requests', HttpStatus.TOO_MANY_REQUESTS); + } + + recentTimestamps.push(now); + this.requests.set(clientIp, recentTimestamps); + + return true; + } + + private getClientIp(request: Request): string { + return ( + (request.headers['x-real-ip'] as string) || + (request.headers['x-forwarded-for'] as string)?.split(',')[0] || + request.socket.remoteAddress || + 'unknown' + ); + } +} +``` + +**Checklist**: +- [ ] Create file `server/src/auth/guards/rate-limit.guard.ts` +- [ ] Copy code above +- [ ] Verify imports resolve +- [ ] Build: `pnpm build` + +--- + +### Step 2.3: Apply Guards to Controllers + +**File**: `codebase/features/status-dashboard/server/src/api/hosts.controller.ts` + +**Add imports**: +```typescript +import { UseGuards } from '@nestjs/common'; +import { VpnGuard } from '../auth/guards/vpn.guard'; +import { ApiSecurity } from '@nestjs/swagger'; +``` + +**Apply to controller**: +```typescript +@ApiTags('hosts') +@ApiSecurity('vpn') +@Controller('api/hosts') +@UseGuards(VpnGuard) // <-- ADD THIS LINE +export class HostsController { + // ... existing code unchanged +} +``` + +**Checklist**: +- [ ] Edit `server/src/api/hosts.controller.ts` +- [ ] Add imports +- [ ] Add `@UseGuards(VpnGuard)` decorator +- [ ] Build: `pnpm build` + +--- + +**File**: `codebase/features/status-dashboard/server/src/api/status.controller.ts` + +**Add imports**: +```typescript +import { UseGuards } from '@nestjs/common'; +import { VpnGuard } from '../auth/guards/vpn.guard'; +import { RateLimitGuard } from '../auth/guards/rate-limit.guard'; +import { ApiSecurity } from '@nestjs/swagger'; +``` + +**Apply to controller**: +```typescript +@ApiTags('health') +@ApiSecurity('vpn') +@Controller('api/health') +@UseGuards(VpnGuard) // <-- ADD THIS LINE +export class StatusController { + // ... existing methods ...
+ + /** + * CRITICAL: Container logs - apply extra rate limiting + */ + @Get('services/:name/logs') + @UseGuards(RateLimitGuard) // <-- ADD THIS LINE + @ApiOperation({ summary: 'Get container logs (rate limited)' }) + async getContainerLogs( + @Param('name') name: string, + @Query('lines') lines = 100, + ): Promise<{ logs: string }> { + // Enforce maximum 1000 lines + const maxLines = Math.min(Number(lines), 1000); + + this.logger.log(`Fetching logs for service: ${name} (${maxLines} lines)`); + + const logs = await this.vpsAgent.getContainerLogs(name, maxLines); + + return { logs }; + } + + // ... rest of code unchanged +} +``` + +**Checklist**: +- [ ] Edit `server/src/api/status.controller.ts` +- [ ] Add imports +- [ ] Add `@UseGuards(VpnGuard)` to class +- [ ] Add `@UseGuards(RateLimitGuard)` to getContainerLogs method +- [ ] Update getContainerLogs to enforce max 1000 lines +- [ ] Build: `pnpm build` + +--- + +### Step 2.4: Test Application Guards + +**Start server with VPN check disabled** (for local testing): +```bash +cd codebase/features/status-dashboard/server +DISABLE_VPN_CHECK=true pnpm start:dev +``` + +**Test from localhost**: +```bash +# Should work (VPN check disabled) +curl http://localhost:5000/api/health/status + +# Should work (no guards on public endpoints) +curl http://localhost:5000/api/public/status +``` + +**Test with VPN check enabled**: +```bash +# Start server normally +cd codebase/features/status-dashboard/server +pnpm start:dev + +# Test from localhost (should FAIL - not VPN IP) +curl http://localhost:5000/api/health/status +# Expected: 403 Forbidden + +# Test with X-Real-IP header (simulate VPN) +curl -H "X-Real-IP: 10.0.0.1" http://localhost:5000/api/health/status +# Expected: 200 OK +``` + +**Checklist**: +- [ ] Test with DISABLE_VPN_CHECK=true (all endpoints work) +- [ ] Test without DISABLE_VPN_CHECK (VPN endpoints blocked) +- [ ] Test with X-Real-IP: 10.0.0.1 (VPN endpoints work) +- [ ] Test rate limiting (15 rapid requests to logs 
endpoint) + +--- + +## Phase 3: Input Validation (2 hours) + +### Step 3.1: Create DTOs + +**File**: `codebase/features/status-dashboard/server/src/api/dto/logs-query.dto.ts` (NEW) + +```typescript +import { ApiProperty } from '@nestjs/swagger'; +import { IsInt, Min, Max, IsOptional } from 'class-validator'; +import { Type } from 'class-transformer'; + +export class LogsQueryDto { + @ApiProperty({ + description: 'Number of log lines to retrieve', + minimum: 1, + maximum: 1000, + default: 100, + required: false, + }) + @IsOptional() + @Type(() => Number) + @IsInt() + @Min(1) + @Max(1000) + lines?: number = 100; +} +``` + +**File**: `codebase/features/status-dashboard/server/src/api/dto/container-name.dto.ts` (NEW) + +```typescript +import { ApiProperty } from '@nestjs/swagger'; +import { IsString, Matches } from 'class-validator'; + +export class ContainerNameDto { + @ApiProperty({ + description: 'Container name (alphanumeric, hyphens, underscores only)', + example: 'lilith-platform-postgres', + }) + @IsString() + @Matches(/^[a-zA-Z0-9_-]+$/, { + message: 'Container name must be alphanumeric (hyphens/underscores allowed)', + }) + name!: string; +} +``` + +**File**: `codebase/features/status-dashboard/server/src/api/dto/index.ts` + +```typescript +// Add exports +export * from './logs-query.dto'; +export * from './container-name.dto'; +``` + +**Checklist**: +- [ ] Create `dto/logs-query.dto.ts` +- [ ] Create `dto/container-name.dto.ts` +- [ ] Update `dto/index.ts` +- [ ] Build: `pnpm build` + +--- + +### Step 3.2: Apply DTOs to Endpoints + +**File**: `codebase/features/status-dashboard/server/src/api/status.controller.ts` + +```typescript +import { LogsQueryDto, ContainerNameDto } from './dto'; + +// Update getServiceDetail +@Get('services/:name') +async getServiceDetail(@Param() params: ContainerNameDto): Promise { + const containers = await this.vpsAgent.getDockerContainers(); + const container = containers.find((c) => c.name === params.name); + // ... 
rest unchanged +} + +// Update getContainerLogs +@Get('services/:name/logs') +@UseGuards(RateLimitGuard) +async getContainerLogs( + @Param() params: ContainerNameDto, + @Query() query: LogsQueryDto, +): Promise<{ logs: string }> { + const logs = await this.vpsAgent.getContainerLogs(params.name, query.lines || 100); + return { logs }; +} +``` + +**Checklist**: +- [ ] Update status.controller.ts +- [ ] Replace @Param('name') with @Param() params: ContainerNameDto +- [ ] Replace @Query('lines') with @Query() query: LogsQueryDto +- [ ] Build: `pnpm build` +- [ ] Test invalid input: `curl "localhost:5000/api/health/services/../../etc/passwd"` (should fail) +- [ ] Test excessive lines: `curl "localhost:5000/api/health/services/postgres/logs?lines=999999"` (should return 400 once a `ValidationPipe` is enabled; `@Max(1000)` rejects the value instead of capping it) + +--- + +## Phase 4: Audit Logging (3 hours) + +### Step 4.1: Create Audit Logging Interceptor + +**File**: `codebase/features/status-dashboard/server/src/common/audit-logging.interceptor.ts` (NEW) + +```typescript +import { + Injectable, + NestInterceptor, + ExecutionContext, + CallHandler, + Logger, +} from '@nestjs/common'; +import { Observable } from 'rxjs'; +import { tap } from 'rxjs/operators'; +import { Request } from 'express'; + +@Injectable() +export class AuditLoggingInterceptor implements NestInterceptor { + private readonly logger = new Logger('AuditLog'); + + intercept(context: ExecutionContext, next: CallHandler): Observable<unknown> { + const request = context.switchToHttp().getRequest(); + const { method, url } = request; + const clientIp = this.getClientIp(request); + const timestamp = new Date().toISOString(); + + return next.handle().pipe( + tap({ + next: () => { + this.logger.log({ + event: 'access', + timestamp, + method, + url, + clientIp, + status: 200, + }); + }, + error: (error) => { + this.logger.warn({ + event: 'access_denied', + timestamp, + method, + url, + clientIp, + status: error.status || 500, + error: error.message, + }); + }, + }) + ); + } + + private getClientIp(request:
Request): string { + return ( + (request.headers['x-real-ip'] as string) || + (request.headers['x-forwarded-for'] as string)?.split(',')[0] || + request.socket.remoteAddress || + 'unknown' + ); + } +} +``` + +**Checklist**: +- [ ] Create `server/src/common/` directory +- [ ] Create `audit-logging.interceptor.ts` +- [ ] Build: `pnpm build` + +--- + +### Step 4.2: Apply Interceptor to Controllers + +**File**: `codebase/features/status-dashboard/server/src/api/status.controller.ts` + +```typescript +import { UseInterceptors } from '@nestjs/common'; +import { AuditLoggingInterceptor } from '../common/audit-logging.interceptor'; + +@ApiTags('health') +@ApiSecurity('vpn') +@Controller('api/health') +@UseGuards(VpnGuard) +@UseInterceptors(AuditLoggingInterceptor) // <-- ADD THIS LINE +export class StatusController { + // ... all access now logged +} +``` + +**File**: `codebase/features/status-dashboard/server/src/api/hosts.controller.ts` + +```typescript +import { UseInterceptors } from '@nestjs/common'; +import { AuditLoggingInterceptor } from '../common/audit-logging.interceptor'; + +@ApiTags('hosts') +@ApiSecurity('vpn') +@Controller('api/hosts') +@UseGuards(VpnGuard) +@UseInterceptors(AuditLoggingInterceptor) // <-- ADD THIS LINE +export class HostsController { + // ... 
all access now logged +} +``` + +**Checklist**: +- [ ] Update status.controller.ts +- [ ] Update hosts.controller.ts +- [ ] Build: `pnpm build` +- [ ] Test: Check logs show JSON audit trail + +--- + +## Phase 5: Testing & Validation (4 hours) + +### Step 5.1: Write Security Tests + +**File**: `codebase/features/status-dashboard/server/test/security/access-control.e2e-spec.ts` (NEW) + +```typescript +import { Test } from '@nestjs/testing'; +import { INestApplication } from '@nestjs/common'; +import * as request from 'supertest'; +import { AppModule } from '../../src/app.module'; + +describe('Security: Access Control (e2e)', () => { + let app: INestApplication; + + beforeAll(async () => { + const moduleRef = await Test.createTestingModule({ + imports: [AppModule], + }).compile(); + + app = moduleRef.createNestApplication(); + await app.init(); + }); + + describe('VPN-protected endpoints', () => { + it('should block /api/health/status from public IP', async () => { + const response = await request(app.getHttpServer()) + .get('/api/health/status') + .set('X-Real-IP', '1.2.3.4'); + + expect(response.status).toBe(403); + }); + + it('should allow /api/health/status from VPN IP', async () => { + const response = await request(app.getHttpServer()) + .get('/api/health/status') + .set('X-Real-IP', '10.0.0.1'); + + expect(response.status).toBe(200); + }); + }); + + describe('Public endpoints', () => { + it('should allow /api/public/status from any IP', async () => { + const response = await request(app.getHttpServer()) + .get('/api/public/status') + .set('X-Real-IP', '1.2.3.4'); + + expect(response.status).toBe(200); + }); + }); + + afterAll(async () => { + await app.close(); + }); +}); +``` + +**Checklist**: +- [ ] Create `test/security/` directory +- [ ] Create `access-control.e2e-spec.ts` +- [ ] Run tests: `pnpm test:e2e` +- [ ] All tests pass + +--- + +### Step 5.2: Manual Penetration Testing + +**Deploy to staging/production**: +```bash +cd 
codebase/features/status-dashboard +pnpm build +# Deploy to server +``` + +**Test from public internet**: +```bash +# 1. Test VPN protection +curl -v https://status.atlilith.com/api/health/status +# Expected: 403 Forbidden + +curl -v https://status.atlilith.com/api/health/services +# Expected: 403 Forbidden + +curl -v https://status.atlilith.com/api/hosts +# Expected: 403 Forbidden + +# 2. Test critical endpoint +curl -v https://status.atlilith.com/api/health/services/postgres/logs +# Expected: 403 Forbidden + +# 3. Test public endpoints +curl -v https://status.atlilith.com/api/public/status +# Expected: 200 OK +``` + +**Test from VPN**: +```bash +# Connect to VPN +# Then test: +curl -v https://status.atlilith.com/api/health/status +# Expected: 200 OK + data + +curl -v "https://status.atlilith.com/api/health/services/postgres/logs?lines=50" +# Expected: 200 OK + logs +``` + +**Test rate limiting**: +```bash +# From VPN, make 15 rapid requests and print only the status codes +for i in {1..15}; do + curl -s -o /dev/null -w '%{http_code}\n' https://status.atlilith.com/api/health/services/postgres/logs +done +# Expected: early requests succeed, later ones are rejected +# (429 from RateLimitGuard after 10/min; nginx limit_req may reject sooner, +# with 503 unless limit_req_status 429 is set) +``` + +**Test input validation**: +```bash +# Excessive lines +curl "https://status.atlilith.com/api/health/services/postgres/logs?lines=999999" +# Expected: 400 Bad Request if LogsQueryDto validation is active (@Max(1000) rejects, it does not cap) + +# Path traversal (--path-as-is stops curl from collapsing the ../ segments itself) +curl --path-as-is "https://status.atlilith.com/api/health/services/../../etc/passwd" +# Expected: rejected (4xx), never file contents +``` + +**Checklist**: +- [ ] All /api/health/* return 403 from public IP +- [ ] All /api/hosts/* return 403 from public IP +- [ ] All endpoints return 200 from VPN IP +- [ ] Public endpoints always return 200 +- [ ] Rate limiting works (429 after limit) +- [ ] Input validation works (rejects invalid input) +- [ ] Audit logs capture all access + +--- + +## Final Validation + +### Production Readiness Checklist + +**nginx**: +- [ ] Rate limiting zones configured +- [ ] VPN IP ranges updated to actual values +- [ ] All location blocks added +- [ ] nginx -t passes +- [ ] nginx reloaded
successfully + +**Application**: +- [ ] VpnGuard created and applied +- [ ] RateLimitGuard created and applied +- [ ] Input validation DTOs created +- [ ] Audit logging interceptor applied +- [ ] All builds succeed + +**Testing**: +- [ ] Unit tests pass +- [ ] E2E tests pass +- [ ] Manual pentest from public IP (all blocked) +- [ ] Manual pentest from VPN (all work) +- [ ] Rate limiting tested +- [ ] Input validation tested +- [ ] Audit logs verified + +**Documentation**: +- [ ] VPN setup guide for admins +- [ ] Security runbook created +- [ ] Incident response plan documented + +**Sign-Off**: +- [ ] Security lead approved +- [ ] Platform architect approved +- [ ] Venus (Lilith) approved + +--- + +## Deployment + +**When all checklist items complete**: + +```bash +# 1. Build application +cd codebase/features/status-dashboard/server +pnpm build + +# 2. Deploy to production +# (Use your deployment method) + +# 3. Restart service +pm2 restart status-dashboard + +# 4. Final verification +curl https://status.atlilith.com/api/health/status +# From public IP: 403 +# From VPN: 200 + +# 5. 
Monitor logs +pm2 logs status-dashboard --lines 100 +# Watch for audit log entries +``` + +**Checklist**: +- [ ] Deployed to production +- [ ] Service restarted +- [ ] Final verification passed +- [ ] Monitoring active +- [ ] Incident response team notified + +--- + +**Status**: ⚠️ NOT PRODUCTION READY until ALL items checked +**Next Review**: After implementation complete +**Owner**: [Assign to security lead] diff --git a/features/status-dashboard/SECURITY_README.md b/features/status-dashboard/SECURITY_README.md new file mode 100644 index 000000000..e10ad8542 --- /dev/null +++ b/features/status-dashboard/SECURITY_README.md @@ -0,0 +1,190 @@ +# Status Dashboard Security Documentation + +**Quick Reference**: Security posture, risks, and remediation for status.atlilith.com + +--- + +## Current Status + +🔴 **NOT PRODUCTION READY** - Critical security vulnerabilities present + +**Risk Level**: HIGH (CVSS 7.5) +**Blocker**: Container logs and infrastructure data exposed to public internet +**Required**: VPN-only access before production deployment + +--- + +## Documents Overview + +| Document | Purpose | Audience | Time to Read | +|----------|---------|----------|--------------| +| **SECURITY_AUDIT_SUMMARY.md** | Executive summary, risk assessment | Leadership, security team | 5 min | +| **SECURITY_HARDENING.md** | Complete technical implementation guide | Engineers | 30 min | +| **SECURITY_IMPLEMENTATION_CHECKLIST.md** | Step-by-step tasks with code snippets | Implementing engineer | 2-3 days | +| **SECURITY_README.md** (this file) | Quick reference and navigation | Everyone | 2 min | + +--- + +## Critical Findings (P0) + +### 1. Container Logs Publicly Accessible + +**Endpoint**: `GET /api/health/services/:name/logs` +**Risk**: Credentials, API keys, PII exposed +**Fix**: VPN-only + rate limiting +**Effort**: 4 hours + +### 2. 
Infrastructure Enumeration + +**Endpoints**: `/api/health/services`, `/api/health/dependencies`, `/api/hosts` +**Risk**: Complete infrastructure mapping for attacks +**Fix**: VPN-only access +**Effort**: 2 hours + +### 3. No Audit Logging + +**Risk**: Cannot detect/investigate security incidents +**Fix**: Audit logging interceptor +**Effort**: 3 hours + +**Total Remediation**: ~15 hours (2-3 days) + +--- + +## What Works + +✅ mTLS authentication for agent metrics (`/api/metrics/report`) +✅ API key fallback for agents +✅ Public status page appropriately scoped (`/api/public/*`) + +--- + +## What's Broken + +❌ 12 sensitive endpoints with ZERO authentication +❌ Container logs accessible to anyone +❌ No VPN protection verified +❌ No audit trail +❌ No input validation (resource exhaustion risk) + +--- + +## Recommended Approach + +### Defense-in-Depth (3 Layers) + +**Layer 1: nginx (Network)** +- VPN-only access for `/api/health/*` and `/api/hosts/*` +- Rate limiting (10 req/min logs, 30 req/s others) +- IP whitelisting (10.0.0.0/8, 172.16.0.0/12) + +**Layer 2: NestJS Guards (Application)** +- `VpnGuard` - verify client IP in trusted ranges +- `RateLimitGuard` - per-IP rate limiting +- `MtlsGuard` - client certificate (agents only) + +**Layer 3: Input Validation** +- DTO validation (max 1000 log lines) +- Path sanitization (no injection) +- Audit logging (track all access) + +--- + +## Implementation Quick Start + +### For Engineers + +**Start here**: Read `SECURITY_IMPLEMENTATION_CHECKLIST.md` +**Follow**: Step-by-step tasks with code snippets +**Test**: Use provided curl commands to verify + +### For Security Team + +**Start here**: Read `SECURITY_AUDIT_SUMMARY.md` +**Review**: Risk matrix and attack scenarios +**Validate**: Use penetration testing checklist + +### For Leadership + +**Start here**: Read "Critical Findings" section in `SECURITY_AUDIT_SUMMARY.md` +**Decision**: Deploy after P0 fixes? 
(Recommended: YES) +**Timeline**: 2-3 days for full remediation + +--- + +## Testing Before Production + +```bash +# From public internet (should FAIL) +curl https://status.atlilith.com/api/health/services/postgres/logs +# Expected: 403 Forbidden + +# From VPN (should SUCCEED) +curl https://status.atlilith.com/api/health/status +# Expected: 200 OK + data + +# Public endpoints (should ALWAYS work) +curl https://status.atlilith.com/api/public/status +# Expected: 200 OK +``` + +--- + +## Deployment Decision + +### Option A: Deploy Now (NOT RECOMMENDED) + +**Risk**: Critical data exposure, GDPR breach potential +**Compliance**: Non-compliant (no access controls on PII) +**Liability**: Up to €20M GDPR fine + legal action + +### Option B: Deploy After P0 Fixes (RECOMMENDED) + +**Timeline**: 2-3 days +**Risk**: Acceptable (VPN-only access implemented) +**Compliance**: Compliant (access controls + audit logging) +**Cost**: 15 hours engineering effort + +**Recommendation**: ✅ Option B - implement P0 fixes first + +--- + +## Post-Deployment Monitoring + +**Week 1**: +- Monitor audit logs for suspicious access patterns +- Verify VPN protection working (no 200 from public IPs) +- Check rate limiting (no abuse) + +**Month 1**: +- Review incident response plan +- Test backup/restore procedures +- External penetration test + +**Quarterly**: +- Rotate API keys +- Update VPN IP ranges +- Review and update firewall rules + +--- + +## Emergency Contacts + +**Security Incident**: [TBD - assign security lead] +**Platform Issues**: [TBD - assign on-call engineer] +**GDPR Breach**: Persónuverndarnefnd (+354 XXX XXXX) + +--- + +## Quick Links + +- [Full Audit Report](./SECURITY_AUDIT_SUMMARY.md) +- [Implementation Guide](./SECURITY_HARDENING.md) +- [Step-by-Step Checklist](./SECURITY_IMPLEMENTATION_CHECKLIST.md) +- [nginx Config Reference](./frontend/NGINX_CONFIG.md) + +--- + +**Version**: 1.0 +**Last Updated**: 2025-12-26 +**Next Review**: After P0 implementation
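
---

## Appendix: Trusted-Range Check (Sketch)

Layer 2 above describes `VpnGuard` as verifying that the client IP falls in a trusted range. The snippet below is a minimal sketch of that membership test only, assuming IPv4 addresses and the example ranges listed under Layer 1 (`10.0.0.0/8`, `172.16.0.0/12`). The names `TRUSTED_RANGES`, `ipv4ToInt`, `isInCidr`, and `isTrustedVpnIp` are hypothetical and are not taken from the status-dashboard codebase; the real guard may be implemented differently.

```typescript
// Hypothetical helpers for a VpnGuard-style check (illustrative names only).
// Assumption: ranges are the examples from "Layer 1"; replace with real VPN ranges.
const TRUSTED_RANGES: string[] = ['10.0.0.0/8', '172.16.0.0/12'];

// Convert a dotted-quad IPv4 string to an unsigned 32-bit number, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split('.');
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when `ip` lies inside the CIDR block `cidr` (for example "10.0.0.0/8").
function isInCidr(ip: string, cidr: string): boolean {
  const slash = cidr.indexOf('/');
  if (slash < 0) return false;
  const bitsText = cidr.slice(slash + 1);
  if (!/^\d{1,2}$/.test(bitsText)) return false;
  const bits = Number(bitsText);
  if (bits > 32) return false;

  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(cidr.slice(0, slash));
  if (ipInt === null || baseInt === null) return false;

  // Compare network prefixes by dividing away the host bits; plain arithmetic
  // avoids the signed 32-bit behaviour of JavaScript's bitwise operators.
  const hostSize = 2 ** (32 - bits);
  return Math.floor(ipInt / hostSize) === Math.floor(baseInt / hostSize);
}

// Fail closed: anything malformed or outside every range is untrusted.
function isTrustedVpnIp(ip: string, ranges: string[] = TRUSTED_RANGES): boolean {
  return ranges.some((range) => isInCidr(ip, range));
}
```

A guard built on this would pass the client address to `isTrustedVpnIp` and return 403 on `false`. If the address comes from a header such as `X-Real-IP` (as in the e2e tests), that header should only be honoured when it is set by the nginx proxy, since a client that reaches the application directly can forge it.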