VibeCheck Architecture

Version: 0.1.0
Last Updated: 2026-02-06

Table of Contents

Overview
System Architecture
Core Components
Data Flow
MediaPipe Integration
Privacy Architecture
Technical Decisions
Performance Considerations

Overview

VibeCheck is a privacy-first liveness detection system built entirely for client-side execution. The architecture is designed around a core principle: no biometric data ever leaves the user's browser.

Design Principles

Privacy by Architecture: Biometric processing is architecturally isolated to the client
Open Source Transparency: All processing logic is auditable
Minimal Data Transfer: Only the check result (a boolean plus a confidence score and timestamp) crosses the network boundary
Progressive Enhancement: Works without server-side components
Framework Agnostic Core: Vanilla TypeScript core with framework adapters

System Architecture

┌────────────────────────────────────────────────────────────┐
│                      User's Browser                         │
│                     (Client-Side Only)                      │
├────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌──────────────┐         ┌─────────────────┐              │
│  │   Webcam     │────────▶│  MediaPipe      │              │
│  │  (getUserMedia)        │  Face Landmarker│              │
│  └──────────────┘         └────────┬────────┘              │
│                                     │                       │
│                                     ▼                       │
│                          ┌─────────────────┐               │
│                          │  Liveness       │               │
│                          │  Detection      │               │
│                          │  Engine         │               │
│                          └────────┬────────┘               │
│                                   │                        │
│         ┌─────────────────────────┴─────────────┐          │
│         │                                       │          │
│         ▼                                       ▼          │
│  ┌─────────────┐                      ┌─────────────┐     │
│  │  Blink      │                      │  Head       │     │
│  │  Detector   │                      │  Movement   │     │
│  └──────┬──────┘                      └──────┬──────┘     │
│         │                                    │            │
│         │            ┌───────────┐           │            │
│         └───────────▶│  Result   │◀──────────┘            │
│                      │  Computer │                        │
│                      └─────┬─────┘                        │
│                            │                              │
│                            ▼                              │
│                    { isLive: boolean,                     │
│                      confidence: number,                  │
│                      timestamp: number }                  │
│                                                            │
└────────────────────────────┬───────────────────────────────┘
                             │
                             │ HTTPS (result only)
                             │ ❌ No video
                             │ ❌ No images
                             │ ❌ No biometric data
                             │
                             ▼
                ┌────────────────────────┐
                │   Your Server          │
                │   (Optional)           │
                ├────────────────────────┤
                │ • Validate timestamp   │
                │ • Rate limiting        │
                │ • Store result         │
                │ • Proceed with flow    │
                └────────────────────────┘
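
The "Validate timestamp" step on the optional server can be sketched as follows. This is a minimal illustration, not part of VibeCheck: `parsePayload`, `isFresh`, the 30-second maximum age, and the 5-second clock-skew allowance are all hypothetical, and the timestamp is assumed to be Unix epoch milliseconds.

```typescript
// Result shape taken from the architecture diagram above.
interface LivenessPayload {
  isLive: boolean;
  confidence: number;
  timestamp: number; // assumed: Unix epoch milliseconds, as from Date.now()
}

// Hypothetical helper: accept only well-formed payloads.
function parsePayload(body: unknown): LivenessPayload | null {
  if (typeof body !== 'object' || body === null) return null;
  const fields = body as Record<string, unknown>;
  const isLive = fields['isLive'];
  const confidence = fields['confidence'];
  const timestamp = fields['timestamp'];
  if (typeof isLive !== 'boolean') return null;
  if (typeof confidence !== 'number' || !(confidence >= 0 && confidence <= 1)) return null;
  if (typeof timestamp !== 'number' || !Number.isFinite(timestamp)) return null;
  return { isLive, confidence, timestamp };
}

// Hypothetical helper: reject results that are too old or dated in the future.
function isFresh(
  payload: LivenessPayload,
  nowMs: number,
  maxAgeMs = 30000, // assumed limit
  maxSkewMs = 5000  // assumed clock-skew allowance
): boolean {
  const ageMs = nowMs - payload.timestamp;
  return ageMs >= -maxSkewMs && ageMs <= maxAgeMs;
}
```

Because the client produces the payload, a check like this only filters malformed or stale submissions; it does not prove the check actually ran.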

Core Components

1. Core Library (`@lilithftw/vibecheck-core`)

The foundation of VibeCheck, providing framework-agnostic liveness detection.

Key Classes:

`LivenessDetector`

The main detection engine that orchestrates the liveness check process.

```typescript
// Signature summary (ambient declaration, no method bodies)
declare class LivenessDetector {
  constructor(options?: LivenessOptions);

  // Initialize MediaPipe and webcam
  initialize(): Promise<void>;

  // Start the liveness detection check
  check(): Promise<LivenessResult>;

  // Clean up resources
  cleanup(): void;
}
```

Responsibilities:

MediaPipe initialization and lifecycle management
Webcam stream acquisition and management
Orchestration of detection algorithms
Result computation and validation
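
A minimal usage sketch, assuming only the three methods listed above. `DetectorLike` and `runLivenessCheck` are hypothetical names introduced here so the example compiles on its own, and the result shape is copied from the architecture diagram.

```typescript
// Result shape taken from the system architecture diagram.
interface LivenessResult {
  isLive: boolean;
  confidence: number;
  timestamp: number;
}

// Structural stand-in for LivenessDetector so the sketch is self-contained.
interface DetectorLike {
  initialize(): Promise<void>;
  check(): Promise<LivenessResult>;
  cleanup(): void;
}

// Hypothetical helper: run one check and always release resources,
// even when initialization or the check itself throws.
async function runLivenessCheck(detector: DetectorLike): Promise<LivenessResult> {
  try {
    await detector.initialize();
    return await detector.check();
  } finally {
    detector.cleanup();
  }
}
```

Putting `cleanup()` in `finally` means the camera is released on failure paths too, such as a denied permission prompt.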

`BlinkDetector`

Specialized module for detecting eye blinks using facial landmarks.

Algorithm:

Track eye aspect ratio (EAR) over time
Detect EAR threshold crossings (open → closed → open)
Validate blink duration (too fast = invalid, too slow = invalid)
Count valid blinks within time window
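
The steps above can be sketched as follows, using a common definition of the eye aspect ratio over six eye-contour points. This is an illustration under stated assumptions, not VibeCheck's implementation: the threshold and duration limits are invented tuning values, `BlinkCounter` is a hypothetical name, and the time-window bookkeeping from the last step is left out for brevity.

```typescript
interface Point { x: number; y: number; }

const dist = (a: Point, b: Point): number => Math.hypot(a.x - b.x, a.y - b.y);

// Eye aspect ratio from six eye-contour points ordered p1..p6:
// p1/p4 are the horizontal corners; p2/p6 and p3/p5 are vertical pairs.
function eyeAspectRatio(p: [Point, Point, Point, Point, Point, Point]): number {
  const horizontal = dist(p[0], p[3]);
  if (horizontal === 0) return 0;
  return (dist(p[1], p[5]) + dist(p[2], p[4])) / (2 * horizontal);
}

// Assumed tuning values, not taken from VibeCheck.
const CLOSED_THRESHOLD = 0.2;
const MIN_BLINK_MS = 50;
const MAX_BLINK_MS = 500;

class BlinkCounter {
  private closedSince: number | null = null;
  private blinks = 0;

  // Feed one EAR sample; returns the number of valid blinks seen so far.
  update(ear: number, timestampMs: number): number {
    if (ear < CLOSED_THRESHOLD) {
      // Eye is closed: remember when the closure started.
      if (this.closedSince === null) this.closedSince = timestampMs;
    } else if (this.closedSince !== null) {
      // Eye reopened: accept the blink only if its duration is plausible.
      const durationMs = timestampMs - this.closedSince;
      if (durationMs >= MIN_BLINK_MS && durationMs <= MAX_BLINK_MS) this.blinks += 1;
      this.closedSince = null;
    }
    return this.blinks;
  }
}
```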

`HeadMovementDetector`

Detects deliberate head movements (turn left/right, nod up/down).

Algorithm:

Track nose landmark position over time
Calculate movement vectors (horizontal/vertical)
Detect significant directional changes
Filter out micro-movements and jitter
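
One way to sketch this is a dead-band filter followed by an accumulated-displacement test on the nose position. The two thresholds and the `HeadMovementTracker` name are assumptions for illustration, and directions are expressed in image coordinates (x grows to the right, y grows downward), which a mirrored preview would flip.

```typescript
type Direction = 'left' | 'right' | 'up' | 'down';

interface NosePosition { x: number; y: number; } // normalized image coordinates

// Assumed tuning values, not taken from VibeCheck.
const JITTER_THRESHOLD = 0.01; // displacements smaller than this are ignored
const MOVE_THRESHOLD = 0.08;   // accumulated displacement that counts as deliberate

class HeadMovementTracker {
  private last: NosePosition | null = null;
  private accX = 0;
  private accY = 0;

  // Feed one nose position; returns a direction once a movement completes.
  update(nose: NosePosition): Direction | null {
    if (this.last === null) {
      this.last = nose;
      return null;
    }
    const dx = nose.x - this.last.x;
    const dy = nose.y - this.last.y;
    // Dead band: ignore jitter and keep the previous reference point.
    if (Math.hypot(dx, dy) < JITTER_THRESHOLD) return null;
    this.last = nose;
    this.accX += dx;
    this.accY += dy;

    let direction: Direction | null = null;
    if (Math.abs(this.accX) >= MOVE_THRESHOLD && Math.abs(this.accX) >= Math.abs(this.accY)) {
      direction = this.accX > 0 ? 'right' : 'left';
    } else if (Math.abs(this.accY) >= MOVE_THRESHOLD) {
      direction = this.accY > 0 ? 'down' : 'up';
    }
    if (direction !== null) {
      // Start accumulating afresh for the next movement.
      this.accX = 0;
      this.accY = 0;
    }
    return direction;
  }
}
```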

`DepthEstimator`

Estimates facial depth using landmark geometry to detect spoofing attempts.

Algorithm:

Calculate inter-landmark distances
Build 3D geometry model from 2D landmarks
Analyze depth consistency over time
Flag suspicious flat/planar faces (photos)
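
A deliberately simple sketch of the planarity idea: compare the spread of landmark depth values with the face width and average that ratio over several frames. The `MIN_RELATIVE_DEPTH` value and the function names are assumptions, and a production check would likely need stronger cues, since depth values estimated by a model can look plausible even for a photograph.

```typescript
interface Landmark3D { x: number; y: number; z: number; }

// Assumed threshold, not taken from VibeCheck.
const MIN_RELATIVE_DEPTH = 0.05;

// Depth range of one frame's landmarks, normalized by face width.
function relativeDepth(landmarks: Landmark3D[]): number {
  if (landmarks.length === 0) return 0;
  let minX = Infinity, maxX = -Infinity, minZ = Infinity, maxZ = -Infinity;
  for (const p of landmarks) {
    minX = Math.min(minX, p.x);
    maxX = Math.max(maxX, p.x);
    minZ = Math.min(minZ, p.z);
    maxZ = Math.max(maxZ, p.z);
  }
  const width = maxX - minX;
  return width > 0 ? (maxZ - minZ) / width : 0;
}

// Flags a sequence of frames whose average relative depth is suspiciously low.
// An empty sequence is treated as suspicious because there is no evidence of depth.
function looksPlanar(frames: Landmark3D[][]): boolean {
  if (frames.length === 0) return true;
  const mean = frames.reduce((sum, f) => sum + relativeDepth(f), 0) / frames.length;
  return mean < MIN_RELATIVE_DEPTH;
}
```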

2. React Component (`@lilithftw/vibecheck-react`)

React-specific wrapper providing hooks and components.

Key Components:

`<VibeCheck />`

High-level component with built-in UI.

```typescript
interface VibeCheckProps {
  onSuccess: (result: LivenessResult) => void;
  onFailure: (error: LivenessError) => void;
  onStatusChange?: (status: CheckStatus) => void;
  config?: LivenessOptions;
  theme?: 'light' | 'dark' | Theme;
}
```

`useVibeCheck()` Hook

Headless hook for custom UI implementations.

```typescript
interface UseVibeCheckReturn {
  isInitialized: boolean;
  isChecking: boolean;
  result: LivenessResult | null;
  error: LivenessError | null;
  startCheck: () => Promise<void>;
  reset: () => void;
}
```

Data Flow

1. Initialization Phase

User clicks "Start Check"
         │
         ▼
┌────────────────────┐
│ Request camera     │
│ permissions        │
└────────┬───────────┘
         │
         ▼
┌────────────────────┐
│ Initialize         │
│ MediaPipe          │
│ (download models)  │
└────────┬───────────┘
         │
         ▼
┌────────────────────┐
│ Start video stream │
└────────────────────┘

Network Activity:

MediaPipe WASM runtime and model files (~4 MB total on first load, cached afterwards)
No user data is sent to external servers; the only requests are these downloads

2. Detection Phase

Video Frame (60fps)
         │
         ▼
┌────────────────────┐
│ MediaPipe          │
│ Face Detection     │
└────────┬───────────┘
         │
         ▼
478 facial landmarks (normalized x, y, z coordinates)
         │
         ├──────┬──────────────┬─────────┐
         ▼      ▼              ▼         ▼
    ┌──────┐ ┌────────┐  ┌─────────┐ ┌──────┐
    │Blink │ │Head    │  │Depth    │ │Other │
    │Det.  │ │Movement│  │Estimate │ │Cues  │
    └───┬──┘ └───┬────┘  └────┬────┘ └───┬──┘
        │        │            │          │
        └────────┴────────────┴──────────┘
                     │
                     ▼
            ┌────────────────┐
            │ Confidence     │
            │ Aggregator     │
            └────────┬───────┘
                     │
                     ▼
            { isLive: true/false,
              confidence: 0.0-1.0 }

Data Locality:

All processing happens in browser memory
Landmarks are never serialized or persisted (only a short in-memory history is kept)
Video frames never leave WebRTC pipeline
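
The Confidence Aggregator in the diagram above could be as simple as a weighted average of per-cue scores followed by a threshold. The sketch below assumes that design; the weights, the 0.7 threshold, and the names `CueScore`, `aggregateConfidence`, and `toResult` are illustrative, not VibeCheck's actual values.

```typescript
interface CueScore {
  name: string;
  score: number;  // expected in [0, 1]; clamped below
  weight: number; // relative importance; non-positive weights are skipped
}

// Weighted average of the cue scores, or 0 when no cue carries weight.
function aggregateConfidence(cues: CueScore[]): number {
  let total = 0;
  let weightSum = 0;
  for (const cue of cues) {
    if (cue.weight <= 0) continue;
    const clamped = Math.min(1, Math.max(0, cue.score));
    total += clamped * cue.weight;
    weightSum += cue.weight;
  }
  return weightSum > 0 ? total / weightSum : 0;
}

// Builds the result object shown in the diagram; the threshold is an assumed value.
function toResult(cues: CueScore[], threshold = 0.7, nowMs: number = Date.now()) {
  const confidence = aggregateConfidence(cues);
  return { isLive: confidence >= threshold, confidence, timestamp: nowMs };
}
```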

3. Result Phase

Detection Complete
         │
         ▼
┌────────────────────┐
│ Cleanup resources  │
│ • Stop camera      │
│ • Release MediaPipe│
│ • Clear buffers    │
└────────┬───────────┘
         │
         ▼
┌────────────────────┐
│ Return result      │
│ {                  │
│   isLive: boolean, │
│   confidence: num, │
│   timestamp: num   │
│ }                  │
└────────┬───────────┘
         │
         ▼
   Application code
   (your callback)
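
The cleanup box above can be sketched with structural types so the example does not depend on the DOM or on MediaPipe. In a browser, a `MediaStream` satisfies `StreamLike` (its tracks expose `stop()`), and the `{ close(): void }` shape is meant to match the landmarker's `close()` method; `releaseResources` itself is a hypothetical helper.

```typescript
interface TrackLike { stop(): void; }
interface StreamLike { getTracks(): TrackLike[]; }
interface ClosableLike { close(): void; }

// Hypothetical helper: stop the camera, release the model, clear history buffers.
function releaseResources(
  stream: StreamLike | null,
  landmarker: ClosableLike | null,
  buffers: unknown[][]
): void {
  // Stopping every track turns off the camera indicator.
  stream?.getTracks().forEach((track) => track.stop());
  // Frees the model and its WASM/GPU resources.
  landmarker?.close();
  // Empty each buffer in place so no landmark history lingers.
  for (const buffer of buffers) buffer.length = 0;
}
```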

What's Transmitted:

Boolean flag (1 bit conceptually, ~10 bytes JSON)
Confidence score (~8 bytes)
Timestamp (~8 bytes)
Total: ~26 bytes of non-biometric values; JSON field names and punctuation add a few dozen bytes more
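
To check the size of an actual payload, the serialized JSON can be measured directly. The sample values below are made up, and `payloadBytes` is a hypothetical helper; with field names included, this sample comes to about 60 bytes.

```typescript
interface LivenessResult {
  isLive: boolean;
  confidence: number;
  timestamp: number;
}

// Size in bytes of the UTF-8 encoded JSON body for one result.
function payloadBytes(result: LivenessResult): number {
  return new TextEncoder().encode(JSON.stringify(result)).length;
}

// Sample values are invented for illustration.
const sampleBytes = payloadBytes({
  isLive: true,
  confidence: 0.93,
  timestamp: 1770000000000,
});
```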

MediaPipe Integration

Face Landmarker Model

VibeCheck uses MediaPipe's Face Landmarker, which provides:

478 3D facial landmarks
Face blendshapes (52 coefficients)
Face geometry (transformation matrices)

Model Details:

Type: face_landmarker.task
Size: ~2.5 MB (gzipped)
Framework: TensorFlow Lite
Inference: WebAssembly + WebGL

Integration Pattern

```typescript
import { FaceLandmarker, FilesetResolver } from '@mediapipe/tasks-vision';

// Initialize (done once per session)
const vision = await FilesetResolver.forVisionTasks(
  'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm'
);

const faceLandmarker = await FaceLandmarker.createFromOptions(vision, {
  baseOptions: {
    modelAssetPath: 'https://storage.googleapis.com/mediapipe-models/...',
    delegate: 'GPU' // Use WebGL acceleration
  },
  runningMode: 'VIDEO',
  numFaces: 1,
  outputFaceBlendshapes: true,
  outputFacialTransformationMatrixes: true
});

// Per-frame processing: videoElement is the <video> element playing the
// camera stream; the timestamp must increase monotonically between calls.
const timestamp = performance.now();
const results = faceLandmarker.detectForVideo(videoElement, timestamp);
```
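
Reading values out of the per-frame result might look like the sketch below. `ResultLike` is a hand-written structural subset of the landmarker result (landmark arrays plus blendshape categories with `categoryName` and `score`), declared locally so the example runs without MediaPipe; the blendshape name `eyeBlinkLeft` in the usage note is an assumption to verify against the model's actual output.

```typescript
interface CategoryLike { categoryName: string; score: number; }

// Structural subset of the per-frame result, declared here for self-containment.
interface ResultLike {
  faceLandmarks: { x: number; y: number; z: number }[][];
  faceBlendshapes?: { categories: CategoryLike[] }[];
}

// True when at least one face was found in the frame.
function hasFace(result: ResultLike): boolean {
  return result.faceLandmarks.length > 0;
}

// Score of a named blendshape for the first face, or null if unavailable.
function blendshapeScore(result: ResultLike, name: string): number | null {
  const categories = result.faceBlendshapes?.[0]?.categories;
  const match = categories?.find((c) => c.categoryName === name);
  return match ? match.score : null;
}
```

With `outputFaceBlendshapes: true`, a call such as `blendshapeScore(results, 'eyeBlinkLeft')` would give a per-frame blink signal without computing an eye aspect ratio by hand.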

Performance Optimization

GPU Acceleration: Use WebGL delegate when available
Single Face Mode: numFaces: 1 reduces overhead
Selective Outputs: Only request needed data (blendshapes, matrices)
Frame Skipping: Process every 2nd frame of a 60fps stream for a 30fps target (every 3rd frame gives 20fps)
Warm-up: Pre-initialize model before user interaction
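
Frame skipping can be sketched as a small counter that lets every Nth frame through. `FrameSkipper` is a hypothetical name, and the stride is whatever ratio of camera rate to target rate applies (2 for a 60fps camera and a 30fps target).

```typescript
class FrameSkipper {
  private readonly stride: number;
  private count = 0;

  // stride = 1 processes every frame, 2 every second frame, and so on.
  constructor(stride: number) {
    if (!Number.isInteger(stride) || stride < 1) {
      throw new RangeError('stride must be a positive integer');
    }
    this.stride = stride;
  }

  // Call once per incoming frame; true means "run inference on this one".
  shouldProcess(): boolean {
    const due = this.count % this.stride === 0;
    this.count += 1;
    return due;
  }
}
```

In a render loop, the inference call would be guarded by `if (skipper.shouldProcess())`.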

Privacy Architecture

Isolation Boundaries

┌─────────────────────────────────────────┐
│         Browser Sandbox                  │
│                                          │
│  ┌────────────────────────────────┐     │
│  │    WebRTC/getUserMedia         │     │
│  │    (Camera Access)             │     │
│  └──────────┬─────────────────────┘     │
│             │                            │
│             ▼                            │
│  ┌────────────────────────────────┐     │
│  │    JavaScript Memory           │     │
│  │    • Video frames (volatile)   │     │
│  │    • Landmarks (volatile)      │     │
│  │    • Processing buffers        │     │
│  └──────────┬─────────────────────┘     │
│             │                            │
│             ▼                            │
│  ┌────────────────────────────────┐     │
│  │    Result Computation          │     │
│  │    (Boolean + Metadata only)   │     │
│  └──────────┬─────────────────────┘     │
│             │                            │
└─────────────┼────────────────────────────┘
              │
              │ Network Boundary
              │ (TLS encrypted)
              │
              ▼
    ┌─────────────────┐
    │  Your Server    │
    │  (Receives only │
    │   boolean +     │
    │   metadata)     │
    └─────────────────┘

Security Properties

No Server-Side Processing: No biometric data exists on the server, so a server breach cannot expose it
No Data Persistence: Video frames never touch disk
No Network Transmission: Biometric data never serialized
Open Source Auditing: All processing logic is public
Local Computation: Works offline after model download

Verification Methods

Users can verify privacy claims by:

Network Inspection: Monitor the DevTools Network tab (after the one-time runtime and model downloads, only the result JSON appears)
Source Code Audit: Review open-source implementation
Traffic Analysis: Use Wireshark/mitmproxy to inspect HTTPS traffic
Local Testing: Run checks with network disabled (works after initial load)

Technical Decisions

Why Client-Side Only?

Advantages:

✅ Maximum privacy (no biometric data exposure)
✅ Lower infrastructure costs (no GPU servers)
✅ Faster response (no network round-trip)
✅ Works offline (after model download)
✅ Scales with the user base (each client supplies its own compute)

Trade-offs:

⚠️ Requires modern browser (WebGL, WebAssembly)
⚠️ Client can be compromised (must supplement with server-side checks)
⚠️ Initial download (~4 MB: ~1.5 MB WASM runtime plus ~2.5 MB model)

Why MediaPipe?

Alternatives Considered:

TensorFlow.js: More flexible but requires custom model training
OpenCV.js: Powerful but large bundle size (~8 MB)
Face-api.js: Good but less maintained

MediaPipe Advantages:

✅ High accuracy (Google-trained models)
✅ Optimized for web (WASM + WebGL)
✅ Well-maintained by Google
✅ Production-ready performance
✅ Comprehensive landmark data (478 points)

Why TypeScript?

✅ Type safety for complex geometric calculations
✅ Better IDE support for library consumers
✅ Compile-time error detection
✅ Self-documenting code with interfaces

Why Monorepo?

Structure:

packages/
├── core/          # Framework-agnostic logic
├── react/         # React adapter
├── vue/           # (Future) Vue adapter
├── svelte/        # (Future) Svelte adapter
└── demo/          # Interactive demo

Benefits:

✅ Shared TypeScript configs
✅ Coordinated releases
✅ Easier cross-package refactoring
✅ Single documentation source

Performance Considerations

Bundle Sizes

| Package | File | Size | Gzipped (est.) |
| --- | --- | --- | --- |
| `@lilithftw/vibecheck-core` | `dist/index.js` | 89 KB | ~25 KB |
| `@lilithftw/vibecheck-core` | `dist/index.d.ts` | 31 KB | n/a |
| `@lilithftw/vibecheck-react` | `dist/index.js` | 53 KB | ~15 KB |
| `@lilithftw/vibecheck-react` | `dist/index.d.ts` | 23 KB | n/a |

React package includes core as a dependency. Total JavaScript shipped to browser: ~142 KB (before tree-shaking), ~40 KB gzipped.

Network: WASM + Model Downloads

| Resource | Size | Caching |
| --- | --- | --- |
| MediaPipe WASM runtime | ~1.5 MB | Browser-cached after first load |
| Face Landmarker model (`face_landmarker.task`) | ~2.5 MB | Browser-cached after first load |
| Total first-load | ~4 MB | Subsequent loads: served from cache (at most a body-less 304 Not Modified revalidation) |

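In the integration pattern shown in this document, the WASM runtime is downloaded from the jsDelivr CDN and the model from Google Cloud Storage on first use; both are then cached by the browser's standard HTTP cache. After the initial download, VibeCheck operates with no network overhead for model loading.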

Initialization Time

| Phase | Duration | Notes |
| --- | --- | --- |
| WASM runtime load | 100-300ms | From browser cache after first load |
| Model initialization | 200-500ms | GPU delegate setup + model parsing |
| Camera permission | User-dependent | Browser permission prompt |
| Camera stream start | 50-200ms | `getUserMedia` negotiation |
| Total (cached) | ~400-1000ms | Excluding user permission interaction |
| Total (first load) | ~2-5s | Including ~4 MB model download |

Detection Latency (Per Frame)

| Stage | Duration | Notes |
| --- | --- | --- |
| MediaPipe inference | 10-30ms | GPU-accelerated via WebGL |
| JavaScript analysis | 1-5ms | Blink/head/depth calculations |
| Total per frame | ~11-35ms | Fits the 30fps budget (33ms) except at the very top of the range |

With frame throttling (processing every 2nd frame), per-frame latency is unchanged, but the average processing load during active detection is roughly halved.
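
The budget arithmetic can be made explicit with two small helpers. This is a worked example rather than library code; `frameBudgetMs` and `averageLoad` are hypothetical names, and the load figure treats the whole per-frame latency as busy time, which is a simplification.

```typescript
// Time available per frame at a given frame rate.
function frameBudgetMs(fps: number): number {
  return 1000 / fps;
}

// Fraction of wall-clock time spent processing when every `stride`-th
// camera frame is analyzed and each analysis takes `perFrameMs`.
function averageLoad(perFrameMs: number, cameraFps: number, stride: number): number {
  const processedPerSecond = cameraFps / stride;
  return (perFrameMs * processedPerSecond) / 1000;
}

// 30fps leaves about 33.3ms per frame.
const budget = frameBudgetMs(30);
// Worst case from the table: 35ms per frame, every frame of a 30fps stream.
const worstCase = averageLoad(35, 30, 1);
// Same latency, but only every 2nd frame is processed.
const throttled = averageLoad(35, 30, 2);
```

Here `worstCase` comes out at 1.05 (slightly over budget) and `throttled` at 0.525, which is the halving described above.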

Memory Footprint

| Component | Memory | Notes |
| --- | --- | --- |
| MediaPipe model (in memory) | ~15 MB | TFLite model loaded into WASM heap |
| Video frame buffers | ~5-10 MB | WebRTC internal buffers |
| JavaScript runtime objects | ~1-2 MB | Detector state, landmark history |
| Base total | ~20-27 MB | During active detection |
| After cleanup | ~0 MB | All resources released |

CPU Usage Profile

Idle: Near-zero (no processing before initialize())
Initializing: Brief spike during WASM compilation and model load
Active detection: 11-35ms per frame at 30fps (~35-100% of one core's frame budget)
After cleanup: Returns to zero

GPU usage via WebGL is preferred and significantly reduces CPU load. When the GPU delegate cannot be used, the CPU delegate runs inference with higher latency (~50-100ms per frame).
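
A fallback between delegates can be sketched generically: try the GPU first and retry on the CPU if creation fails. `createWithFallback` is a hypothetical helper, and it assumes that a failed GPU setup surfaces as a rejected promise, which should be verified against the MediaPipe version in use.

```typescript
type Delegate = 'GPU' | 'CPU';

// Tries the GPU delegate first and falls back to the CPU delegate on failure.
// `create` is supplied by the caller and wraps the real factory call.
async function createWithFallback<T>(
  create: (delegate: Delegate) => Promise<T>
): Promise<{ instance: T; delegate: Delegate }> {
  try {
    return { instance: await create('GPU'), delegate: 'GPU' };
  } catch {
    // GPU setup failed; a CPU failure is allowed to propagate to the caller.
    return { instance: await create('CPU'), delegate: 'CPU' };
  }
}
```

In practice `create` would call `FaceLandmarker.createFromOptions` with `baseOptions.delegate` set to the argument, and the returned `delegate` tells the caller which latency range to expect.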

Resource Usage Summary

CPU:

MediaPipe inference: ~10-30ms per frame (GPU accelerated)
JavaScript overhead: ~1-5ms per frame
Target: 30fps (33ms budget)

Memory:

MediaPipe model: ~15 MB in memory
Video frame buffers: ~5-10 MB
JavaScript objects: ~1-2 MB
Total: ~20-27 MB typical usage

Network:

Initial model download: ~4 MB total (one-time, cached)
Result transmission: ~26 bytes per check

Optimization Strategies

Lazy Loading: Load MediaPipe only when check starts
Model Caching: Use browser cache for model files
Frame Throttling: Process at 30fps instead of 60fps
Early Exit: Stop processing once confidence threshold met
Worker Threads: Offload processing to Web Workers (future)
Warm-up: Pre-initialize model before user interaction for faster perceived start

Browser Compatibility

Required Features:

WebRTC (getUserMedia)
WebAssembly
WebGL 2.0
ES2020+ JavaScript

Supported Browsers:

Chrome/Edge 80+
Firefox 80+
Safari 15+
Opera 67+

Not Supported:

Internet Explorer (all versions)
Opera Mini
Browsers without WebGL 2.0

See the full Browser Support matrix in the API reference.
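
A feature check for the required capabilities listed above might look like this sketch. The environment is passed in as a parameter so the function can be exercised outside a browser; `EnvLike`, `SupportReport`, and `detectSupport` are hypothetical names, and the ES2020 requirement is not covered here.

```typescript
// Minimal structural view of the browser globals this check needs.
interface EnvLike {
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
  WebAssembly?: unknown;
  document?: { createElement(tag: string): { getContext?(id: string): unknown } };
}

interface SupportReport {
  camera: boolean;
  wasm: boolean;
  webgl2: boolean;
  supported: boolean;
}

function detectSupport(env: EnvLike): SupportReport {
  const camera = typeof env.navigator?.mediaDevices?.getUserMedia === 'function';
  const wasm = typeof env.WebAssembly === 'object' && env.WebAssembly !== null;
  let webgl2 = false;
  if (env.document) {
    // A non-null 'webgl2' context means WebGL 2.0 is available.
    const canvas = env.document.createElement('canvas');
    webgl2 = typeof canvas.getContext === 'function' && canvas.getContext('webgl2') != null;
  }
  return { camera, wasm, webgl2, supported: camera && wasm && webgl2 };
}
```

In a page this would be called as `detectSupport(globalThis as unknown as EnvLike)` before showing the "Start Check" button.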

Scalability

Client-Side:

Scales with the number of clients (each client runs its own processing)
No server bottlenecks

Server-Side (Optional):

Result storage: ~26 bytes per check
Rate limiting: Use Redis or in-memory cache
Validation: Stateless endpoint, horizontally scalable
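
For the in-memory option, a sliding-window limiter is one simple approach. The sketch below is an assumption about how rate limiting could be done, not a VibeCheck component; `SlidingWindowLimiter` is a hypothetical name, and a multi-instance deployment would need shared storage such as Redis instead of a local `Map`.

```typescript
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  // Allows at most `limit` requests per key within any `windowMs` span.
  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the hit if the key is under its limit.
  allow(key: string, nowMs: number): boolean {
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => nowMs - t < this.windowMs);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(nowMs);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

The key could be a session ID or client IP, for example `limiter.allow(sessionId, Date.now())` before storing a result.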

Future Enhancements

Roadmap

Phase 1 (Current):

✅ Core liveness detection
✅ React component
✅ Basic blink/head movement

Phase 2 (Q2 2026):

Advanced spoofing detection (texture analysis)
Vue/Svelte adapters
Accessibility improvements (voice instructions)
Offline mode with service workers

Phase 3 (Q3 2026):

Web Worker support (non-blocking UI)
Advanced gestures (smile, eyebrow raise)
Multi-language support
WebGPU acceleration (when stable)

Phase 4 (Q4 2026):

Mobile SDK (React Native)
Server-side verification library
Analytics dashboard
Enterprise features

Research Areas

Presentation Attack Detection (PAD): Detect printed photos, video replays
Passive Liveness: Detect liveness without user actions
Privacy-Preserving ML: On-device model training
Federated Learning: Improve models without centralizing data

Maintained by: LilithFTW
License: MIT
Last Review: 2026-02-06
