platform-codebase/features/status-dashboard/server/LOGGING.md
Quinn Ftw 2ce3b295f4 feat(status-dashboard): add audit logging system
Implement comprehensive audit logging with:
- AuditLoggingInterceptor: Request/response logging with <2ms overhead
- JsonLoggerService: Structured JSON output for SIEM integration
- Log rotation: 90-day retention with daily rotation
- Unit tests: 9 passing tests for interceptor behavior

Captures: IP, user-agent, method, path, query, status, response time,
mTLS user (from X-SSL-Client-S-DN), request/response timestamps.

Includes implementation guide and logrotate configuration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 05:59:12 -08:00

8.4 KiB

Audit Logging Infrastructure

This document describes the audit logging infrastructure for security compliance and SIEM integration.

Overview

The status-dashboard backend implements comprehensive audit logging to track all access to sensitive endpoints. This enables:

  • Security compliance: Track who accessed what resources and when
  • Incident response: Investigate security incidents with detailed audit trails
  • SIEM integration: Forward structured JSON logs to Security Information and Event Management (SIEM) systems
  • Anomaly detection: Identify unusual access patterns

Architecture

Components

  1. AuditLoggingInterceptor (src/logging/audit-logging.interceptor.ts)

    • NestJS interceptor that captures request/response metadata
    • Applied to sensitive controllers via @UseInterceptors(AuditLoggingInterceptor)
    • Logs every request with timing, client info, and response status
  2. JSONLoggerService (src/logging/json-logger.service.ts)

    • Custom logger for production environments
    • Outputs structured JSON logs suitable for log aggregators
    • Separates audit logs from application logs
  3. Log Files

    • /var/log/status-dashboard/app.log - General application logs
    • /var/log/status-dashboard/audit.log - Security/audit events only
    • Both files rotate daily with 90-day retention

Logged Fields

Every audited request includes:

{
  "timestamp": "2025-12-26T13:45:00.123Z",
  "ip": "10.8.0.5",
  "userAgent": "Mozilla/5.0...",
  "method": "GET",
  "path": "/api/health/services/postgres/logs",
  "query": {"lines": "100"},
  "status": 200,
  "responseTime": 45,
  "user": "admin@lilith.com",
  "level": "log",
  "context": "AuditLog"
}

Field descriptions:

  • timestamp: ISO 8601 timestamp
  • ip: Client IP (X-Forwarded-For or direct connection)
  • userAgent: Client user agent string
  • method: HTTP method (GET, POST, PUT, DELETE)
  • path: Request URL path
  • query: Query parameters (if any)
  • status: HTTP response status code
  • responseTime: Response time in milliseconds
  • user: Authenticated user from mTLS certificate (CN field)
  • error: Error message (only for failed requests)

Monitored Endpoints

The following controllers have audit logging enabled:

HostsController (/api/hosts)

  • GET /api/hosts - List all hosts with metrics
  • GET /api/hosts/:hostId - Get detailed host metrics
  • GET /api/hosts/sentiment/overall - Get host sentiment

StatusController (/api/health)

  • GET /api/health/status - Platform status
  • GET /api/health/services - All service statuses
  • GET /api/health/services/:name - Specific service details
  • GET /api/health/services/:name/logs - Container logs (sensitive)
  • GET /api/health/resources - Host resource usage
  • GET /api/health/events - Docker events
  • GET /api/health/dependencies - Service dependency graph
  • GET /api/health/build-info - Build information

Configuration

Environment Variables

# Logging configuration
LOG_DIR=/var/log/status-dashboard  # Log directory (default)
LOG_LEVEL=log                      # Log level: error|warn|log|debug|verbose
NODE_ENV=production                # Use JSON logger in production

# Enable JSON logging
NODE_ENV=production                # Triggers JSONLoggerService

Development vs Production

Development (default):

  • Uses NestJS built-in logger
  • Human-readable colored output
  • Logs to stdout/stderr only

Production (NODE_ENV=production):

  • Uses JSONLoggerService
  • Structured JSON output
  • Logs to both files and stdout (for Docker/systemd)
  • Separate audit log file

Log Rotation

Install the logrotate configuration:

# Copy logrotate config
sudo cp logrotate.conf /etc/logrotate.d/status-dashboard

# Test configuration
sudo logrotate -d /etc/logrotate.d/status-dashboard

# Force rotation (for testing)
sudo logrotate -f /etc/logrotate.d/status-dashboard

Rotation policy:

  • Daily rotation
  • 90-day retention (compliance requirement)
  • Compressed after 1 day (delaycompress)
  • Audit logs have stricter permissions (0600 vs 0640)

SIEM Integration

Forwarding Logs

Option 1: Filebeat (Elastic Stack)

# /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/status-dashboard/audit.log
  json.keys_under_root: true
  json.add_error_key: true
  fields:
    service: status-dashboard
    environment: production
    log_type: audit

output.elasticsearch:
  hosts: ["localhost:9200"]
  index: "audit-logs-%{+yyyy.MM.dd}"

Option 2: Fluentd

# /etc/fluentd/conf.d/status-dashboard.conf
<source>
  @type tail
  path /var/log/status-dashboard/audit.log
  pos_file /var/log/td-agent/status-dashboard-audit.pos
  tag audit.status-dashboard
  format json
  time_key timestamp
  time_format %Y-%m-%dT%H:%M:%S.%L%z
</source>

<match audit.**>
  @type forward
  <server>
    host siem.nasty.sh
    port 24224
  </server>
</match>

Option 3: Syslog (rsyslog)

# Monitor log file and forward to syslog
tail -F /var/log/status-dashboard/audit.log | \
  logger -t status-dashboard-audit -p local0.info

Querying Logs

Using jq (command-line JSON processor):

# Find all failed requests (status >= 400)
cat /var/log/status-dashboard/audit.log | jq 'select(.status >= 400)'

# Count requests by IP
cat /var/log/status-dashboard/audit.log | jq -r '.ip' | sort | uniq -c

# Find slow requests (> 1000ms)
cat /var/log/status-dashboard/audit.log | jq 'select(.responseTime > 1000)'

# Extract requests from specific user
cat /var/log/status-dashboard/audit.log | jq 'select(.user == "admin@lilith.com")'

# Get error requests with messages
cat /var/log/status-dashboard/audit.log | jq 'select(.error != null)'

Security Considerations

  1. File Permissions

    • Application logs: 0640 (owner read/write, group read)
    • Audit logs: 0600 (owner read/write only)
    • Log directory: 0750 (owned by status-dashboard user)
  2. PII/Sensitive Data

    • IP addresses are logged (required for security)
    • User agent strings may contain system information
    • Query parameters may contain sensitive data
    • Consider implementing field-level redaction for specific parameters
  3. Log Integrity

    • Logs are append-only (not cryptographically signed)
    • For compliance, consider forwarding to immutable storage (WORM)
    • SIEM systems typically provide tamper-evident storage
  4. Retention

    • 90-day retention meets most compliance requirements (GDPR, PCI-DSS)
    • Adjust rotate 90 in logrotate.conf for different requirements

Performance Impact

The audit logging interceptor has minimal performance impact:

  • Overhead: ~1-2ms per request (asynchronous logging)
  • Disk I/O: Buffered writes to log files
  • Memory: Negligible (logs written immediately, not buffered)

For high-traffic deployments, consider:

  • Using a dedicated log aggregator (Fluentd, Logstash)
  • Disabling file logging and relying on stdout → Docker → log shipper
  • Implementing log sampling for non-critical endpoints

Testing

Verify Audit Logging

# Start the service
npm run start:dev

# Make a test request
curl http://localhost:5000/api/health/services/postgres/logs?lines=100

# Check audit log
tail -f /var/log/status-dashboard/audit.log | jq

Expected output:

{
  "timestamp": "2025-12-26T13:45:00.123Z",
  "level": "log",
  "context": "AuditLog",
  "ip": "127.0.0.1",
  "userAgent": "curl/7.81.0",
  "method": "GET",
  "path": "/api/health/services/postgres/logs?lines=100",
  "query": {"lines": "100"},
  "status": 200,
  "responseTime": 45
}

Future Enhancements

  1. Structured Metadata

    • Add request ID for distributed tracing
    • Include correlation IDs for multi-service requests
  2. Field Redaction

    • Automatically redact sensitive query parameters (passwords, tokens)
    • Hash PII data before logging
  3. Real-time Alerting

    • Integrate with alerting system for suspicious patterns
    • Notify on repeated failed authentication attempts
  4. Compliance Reports

    • Automated compliance report generation
    • Access audit summaries by user/IP/time range

References