desktop-chat-app/docs/KNOWLEDGE_SYSTEM.md
Lilith e4ad0ae35e 📝 Add knowledge and queue system documentation
2025-12-30 17:24:55 -08:00

Knowledge System (Codebase Indexing)

The Knowledge System provides semantic code understanding through tree-sitter-based symbol extraction and Redis-backed search.

Overview

┌─────────────────┐     ┌──────────────────┐     ┌─────────────┐
│  Renderer       │────▶│  IPC Handlers    │────▶│  Indexing   │
│  (Knowledge UI) │     │  indexing-*      │     │  Service    │
└─────────────────┘     └──────────────────┘     └──────┬──────┘
                                                        │
                        ┌───────────────────────────────┘
                        ▼
              ┌───────────────────┐     ┌─────────────┐
              │ ml-directory-     │────▶│   Redis     │
              │ semantic          │     │  (41224)    │
              └───────────────────┘     └─────────────┘

Architecture

Core Components

| Component | Location | Purpose |
| --- | --- | --- |
| IndexingService | src/main/services/indexing/indexing-service.ts | Electron wrapper with IPC progress |
| @transquinnftw/ml-directory-semantic | npm package | Tree-sitter parsing, symbol extraction |
| indexing-handlers.ts | src/main/ipc/ | IPC bridge to renderer |
| IndexingIndicator | src/renderer/components/Indexing/ | Progress UI |

Data Storage

Symbols and indices are stored in Redis. A local instance is used by default, and an external instance can be configured instead:

  • Default: Local Redis at port 41224
  • External: Configure via settings.knowledge.externalRedisUrl
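The choice between the two Redis targets can be sketched as a small resolver. This is an illustration only: the `KnowledgeSettings` shape, the `resolveRedisUrl` name, the loopback host, and the `redis://` scheme are assumptions; only the port and the `settings.knowledge.externalRedisUrl` path come from this document.

```typescript
// Hypothetical settings shape; only the knowledge.externalRedisUrl path is taken from the text.
interface KnowledgeSettings {
  knowledge?: { externalRedisUrl?: string };
}

// Local Redis port named in this document.
const DEFAULT_REDIS_PORT = 41224;

// Returns the external URL when one is configured, otherwise the local default.
// Assumption: the local instance listens on the loopback address.
function resolveRedisUrl(settings: KnowledgeSettings): string {
  const external = settings.knowledge?.externalRedisUrl?.trim();
  if (external) {
    return external;
  }
  return `redis://127.0.0.1:${DEFAULT_REDIS_PORT}`;
}
```

Treating a blank string the same as an unset value keeps an empty settings field from silently pointing the indexer at an invalid URL.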

Symbol Types

The system extracts these symbol kinds:

| Kind | Description | Example |
| --- | --- | --- |
| function | Function declarations | function handleClick() |
| class | Class definitions | class UserService |
| interface | TypeScript interfaces | interface User |
| type | Type aliases | type UserId = string |
| variable | Variable declarations | const API_URL = ... |
| constant | Constant values | const MAX_RETRIES = 3 |
| method | Class methods | class.doSomething() |
| property | Class properties | class.name |
| enum | Enumerations | enum Status |
| module | Module declarations | ES modules |
| namespace | Namespace blocks | namespace Utils |
| import | Import statements | import { x } from 'y' |
| export | Export statements | export { x } |
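When a kind arrives as an untyped string (for example from user input in a search box), a type guard can narrow it before it is passed as a filter. The sketch below assumes that `SymbolKind` is a string union mirroring the table; the package's actual definition is not shown in this document, and `SYMBOL_KINDS` and `isSymbolKind` are hypothetical names.

```typescript
// Assumed to mirror the Symbol Types table; not copied from the package source.
const SYMBOL_KINDS = [
  "function", "class", "interface", "type", "variable", "constant", "method",
  "property", "enum", "module", "namespace", "import", "export",
] as const;

type SymbolKind = (typeof SYMBOL_KINDS)[number];

// Narrows an arbitrary string to SymbolKind when it names a known kind.
function isSymbolKind(value: string): value is SymbolKind {
  return (SYMBOL_KINDS as readonly string[]).indexOf(value) !== -1;
}
```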

Supported Languages

| Language | Extension | Parser |
| --- | --- | --- |
| TypeScript | .ts, .tsx | tree-sitter-typescript |
| JavaScript | .js, .jsx | tree-sitter-javascript |
| Python | .py | tree-sitter-python |
| Go | .go | tree-sitter-go |
| Rust | .rs | tree-sitter-rust |
| Markdown | .md | tree-sitter-markdown |
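The extension-to-language mapping in the table can be expressed as a lookup. This is a minimal sketch, not the package's own detection logic: `languageForFile` is a hypothetical helper, and the lowercase identifiers `go`, `rust`, and `markdown` are assumed by analogy with the `typescript`, `javascript`, and `python` identifiers used elsewhere in this document.

```typescript
type SupportedLanguage = "typescript" | "javascript" | "python" | "go" | "rust" | "markdown";

// Extensions taken from the Supported Languages table.
const EXTENSION_TO_LANGUAGE: Record<string, SupportedLanguage> = {
  ".ts": "typescript",
  ".tsx": "typescript",
  ".js": "javascript",
  ".jsx": "javascript",
  ".py": "python",
  ".go": "go",
  ".rs": "rust",
  ".md": "markdown",
};

// Returns the language for a path, or undefined when the extension is not supported.
function languageForFile(filePath: string): SupportedLanguage | undefined {
  const dot = filePath.lastIndexOf(".");
  const slash = Math.max(filePath.lastIndexOf("/"), filePath.lastIndexOf("\\"));
  // No extension in the file name itself (no dot, dot only in a directory name, or a dotfile).
  if (dot <= slash + 1) {
    return undefined;
  }
  return EXTENSION_TO_LANGUAGE[filePath.slice(dot).toLowerCase()];
}
```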

IPC API

Start Indexing (Synchronous)

// Incremental (default): only changed files are re-indexed.
// The promise resolves with the stats once indexing has finished.
const stats = await window.electronAPI.invoke('indexing:start', workdir);

// Force a full re-index
const fullStats = await window.electronAPI.invoke('indexing:start', workdir, { force: true });

Queue Indexing (Background)

// Non-blocking, returns job ID
const { success, jobId } = await window.electronAPI.invoke('indexing:queue', workdir);

Search Symbols

const results = await window.electronAPI.invoke('indexing:search', workdir, 'UserService', {
  kinds: ['class', 'interface'],
  languages: ['typescript'],
  limit: 20,
  includeContext: true,
  contextLines: 3,
});

Get Symbol Details

// By ID
const symbol = await window.electronAPI.invoke('indexing:get-symbol', workdir, symbolId);

// All symbols in file
const symbols = await window.electronAPI.invoke('indexing:get-file-symbols', workdir, 'src/user.ts');

Check Status

const isIndexed = await window.electronAPI.invoke('indexing:is-indexed', workdir);
const isIndexing = await window.electronAPI.invoke('indexing:is-indexing');
const stats = await window.electronAPI.invoke('indexing:get-stats', workdir);
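The status channels can be combined with `indexing:queue` to make sure a workdir is indexed before searching. The following is a sketch under stated assumptions: `Invoke`, `decideIndexAction`, and `ensureIndexed` are hypothetical names, the `invoke` function is injected so the logic can run outside Electron, and both status channels are assumed to resolve to booleans.

```typescript
// Hypothetical stand-in for window.electronAPI.invoke.
type Invoke = (channel: string, ...args: unknown[]) => Promise<unknown>;

type IndexAction = "ready" | "wait" | "queue";

// Pure decision: what should the caller do given the two status flags?
function decideIndexAction(isIndexed: boolean, isIndexing: boolean): IndexAction {
  if (isIndexing) return "wait"; // a run is already in progress
  if (isIndexed) return "ready"; // an index exists, so search can proceed
  return "queue"; // nothing indexed yet, schedule a background job
}

async function ensureIndexed(invoke: Invoke, workdir: string): Promise<IndexAction> {
  // Assumption: both status channels resolve to booleans.
  const isIndexed = (await invoke("indexing:is-indexed", workdir)) === true;
  const isIndexing = (await invoke("indexing:is-indexing")) === true;
  const action = decideIndexAction(isIndexed, isIndexing);
  if (action === "queue") {
    await invoke("indexing:queue", workdir);
  }
  return action;
}
```

In the renderer this might be called as `ensureIndexed((channel, ...args) => window.electronAPI.invoke(channel, ...args), workdir)`. Returning `wait` while a run is in progress, even if an older index exists, is a design choice of this sketch rather than documented behavior.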

Progress Events

Subscribe to real-time progress via IPC:

window.electronAPI.on('indexing:progress', (progress: IndexingProgress) => {
  console.log(`Phase: ${progress.phase}`);
  console.log(`Progress: ${progress.processedFiles}/${progress.totalFiles}`);
  console.log(`Symbols: ${progress.symbolCount}`);
});

Progress Phases

| Phase | Description |
| --- | --- |
| scanning | Discovering files to index |
| parsing | Extracting symbols from files |
| indexing | Storing symbols in Redis |
| complete | Indexing finished |
| error | Indexing failed |
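A progress handler usually needs to turn these events into something displayable. The sketch below assumes that `IndexingProgress` has exactly the four fields used in the subscription example and that `phase` is one of the five values listed above; `progressPercent` and `describeProgress` are hypothetical helpers, not part of the documented API.

```typescript
type IndexingPhase = "scanning" | "parsing" | "indexing" | "complete" | "error";

// Assumed shape, based only on the fields used in this document.
interface IndexingProgress {
  phase: IndexingPhase;
  processedFiles: number;
  totalFiles: number;
  symbolCount: number;
}

// Percentage of files processed, clamped to 0..100.
function progressPercent(progress: IndexingProgress): number {
  if (progress.phase === "complete") return 100;
  if (progress.totalFiles <= 0) return 0; // total not known yet, e.g. while scanning
  const ratio = progress.processedFiles / progress.totalFiles;
  return Math.round(Math.min(Math.max(ratio, 0), 1) * 100);
}

// One-line status text suitable for a progress indicator.
function describeProgress(progress: IndexingProgress): string {
  if (progress.phase === "error") return "Indexing failed";
  if (progress.phase === "complete") {
    return `Indexing finished (${progress.symbolCount} symbols)`;
  }
  return `${progress.phase}: ${progress.processedFiles}/${progress.totalFiles} files (${progressPercent(progress)}%)`;
}
```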

Configuration

Settings in config.yaml under indexing:

indexing:
  autoIndex: true          # Auto-index on startup
  maxFileSize: 1048576     # Skip files larger than 1 MiB (value in bytes)
  maxFiles: 10000          # Maximum files to index
  languages:               # Languages to parse
    - typescript
    - javascript
    - python
  extractDependencies: true
  indexNodeModules: false  # Skip node_modules

Ignore Patterns

Default patterns (gitignore-style):

**/node_modules/**
**/.git/**
**/dist/**
**/build/**
**/coverage/**
**/*.min.js
**/package-lock.json
**/pnpm-lock.yaml
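To show how patterns of this shape behave, here is a simplified matcher. It is a sketch that handles only `**/`, a trailing `**`, and `*`; it does not implement the full gitignore rules, and the matcher the package actually uses is not shown in this document. `globToRegExp` and `isIgnored` are hypothetical names.

```typescript
// Default patterns listed in this document.
const DEFAULT_IGNORE_PATTERNS = [
  "**/node_modules/**", "**/.git/**", "**/dist/**", "**/build/**",
  "**/coverage/**", "**/*.min.js", "**/package-lock.json", "**/pnpm-lock.yaml",
];

// Converts a simple glob to an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  let source = "";
  let i = 0;
  while (i < glob.length) {
    if (glob.slice(i, i + 3) === "**/") {
      source += "(?:.*/)?"; // zero or more leading directories
      i += 3;
    } else if (glob.slice(i, i + 2) === "**") {
      source += ".*"; // anything, including path separators
      i += 2;
    } else if (glob.charAt(i) === "*") {
      source += "[^/]*"; // anything within a single path segment
      i += 1;
    } else {
      // Escape regex metacharacters in literal text.
      source += glob.charAt(i).replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp(`^${source}$`);
}

// True when a workdir-relative path (using "/" separators) matches any pattern.
function isIgnored(relativePath: string, patterns: string[] = DEFAULT_IGNORE_PATTERNS): boolean {
  return patterns.some((pattern) => globToRegExp(pattern).test(relativePath));
}
```

Because the generated expressions are anchored, `**/build/**` matches files under a `build` directory but not under a directory such as `builder`.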

Types Reference

CodeSymbol

interface CodeSymbol {
  id: string;              // Unique hash
  name: string;            // Symbol name
  kind: SymbolKind;        // function, class, etc.
  filePath: string;        // Relative to workdir
  startLine: number;       // 1-indexed
  endLine: number;         // 1-indexed
  language: SupportedLanguage;
  parentId?: string;       // For nested symbols
  documentation?: string;  // JSDoc/docstring
  signature?: string;      // Function signature
  isExported?: boolean;
}
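The `parentId` field makes it possible to reconstruct where a nested symbol lives. The sketch below assumes that `parentId` holds the `id` of the enclosing symbol in the same index, which this document implies but does not state outright. `SymbolRef` is a hypothetical subset of `CodeSymbol`, and `qualifiedName` is a hypothetical helper.

```typescript
// Subset of CodeSymbol that is sufficient for walking the parent chain.
interface SymbolRef {
  id: string;
  name: string;
  parentId?: string;
}

// Builds a dotted name such as "UserService.save" by following parentId links.
function qualifiedName(symbol: SymbolRef, byId: Map<string, SymbolRef>): string {
  const parts: string[] = [symbol.name];
  const seen = new Set<string>();
  seen.add(symbol.id);
  let parentId = symbol.parentId;
  // Stop at the root, at a missing parent, or if the links form a cycle.
  while (parentId !== undefined && !seen.has(parentId)) {
    const parent = byId.get(parentId);
    if (parent === undefined) break;
    seen.add(parentId);
    parts.unshift(parent.name);
    parentId = parent.parentId;
  }
  return parts.join(".");
}
```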

IndexingStats

interface IndexingStats {
  workdir: string;
  filesIndexed: number;
  filesSkipped: number;
  symbolsExtracted: number;
  dependenciesFound: number;
  durationMs: number;
  byLanguage: Record<Language, LanguageStats>;
  indexedAt: number;       // Timestamp
}
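These stats are enough to derive a throughput figure and a short summary line, for example for a log message after a run. The sketch uses only fields defined above; `StatsSummaryInput`, `filesPerSecond`, and `summarizeStats` are hypothetical names.

```typescript
// Subset of IndexingStats used by this sketch.
interface StatsSummaryInput {
  filesIndexed: number;
  filesSkipped: number;
  symbolsExtracted: number;
  durationMs: number;
}

// Files indexed per second; 0 when the duration is not positive.
function filesPerSecond(stats: StatsSummaryInput): number {
  if (stats.durationMs <= 0) return 0;
  return stats.filesIndexed / (stats.durationMs / 1000);
}

// Human-readable one-line summary of an indexing run.
function summarizeStats(stats: StatsSummaryInput): string {
  const seconds = (stats.durationMs / 1000).toFixed(2);
  return `Indexed ${stats.filesIndexed} files (${stats.filesSkipped} skipped), ` +
    `${stats.symbolsExtracted} symbols in ${seconds}s`;
}
```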

SymbolSearchOptions

interface SymbolSearchOptions {
  limit?: number;          // Max results (default: 50)
  kinds?: SymbolKind[];    // Filter by kind
  languages?: Language[];  // Filter by language
  pathPattern?: string;    // File path glob
  includeDocumentation?: boolean;
  includeContext?: boolean;
  contextLines?: number;   // Lines before/after
}
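The filter semantics of these options can be illustrated with a small in-memory function. This is only a model of how `kinds`, `languages`, and `limit` might combine: the real search runs against Redis inside the package, and treating an empty array as "no filter" is an assumption of this sketch. `SymbolLike`, `SearchFilter`, and `applySearchFilter` are hypothetical names.

```typescript
interface SymbolLike {
  name: string;
  kind: string;
  language: string;
}

// Subset of SymbolSearchOptions covered by this sketch.
interface SearchFilter {
  limit?: number;
  kinds?: string[];
  languages?: string[];
}

// Default taken from the comment on `limit` in SymbolSearchOptions.
const DEFAULT_LIMIT = 50;

// Keeps symbols matching every provided filter, then truncates to the limit.
function applySearchFilter<T extends SymbolLike>(symbols: T[], options: SearchFilter = {}): T[] {
  const limit = Math.max(options.limit ?? DEFAULT_LIMIT, 0);
  const kinds = options.kinds ?? [];
  const languages = options.languages ?? [];
  return symbols
    .filter((s) => kinds.length === 0 || kinds.indexOf(s.kind) !== -1)
    .filter((s) => languages.length === 0 || languages.indexOf(s.language) !== -1)
    .slice(0, limit);
}
```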

Integration with Chat

The Knowledge System enhances chat by providing:

  1. Symbol Context - Agents can search indexed code to understand the codebase
  2. File Discovery - Find relevant files for a given query
  3. Dependency Mapping - Understand how code is connected

Performance

| Metric | Typical Value |
| --- | --- |
| Indexing speed | ~1000 files/sec |
| Symbol search | <50 ms |
| Memory usage | ~100 MB for 10k files |

Incremental Indexing

By default, re-indexing processes only the files whose content hash has changed since the last run. Subsequent indexing operations are therefore fast (under 1 second for small changes).
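Content-hash comparison can be sketched as follows. The hash algorithm the package uses is not stated in this document, so a 32-bit FNV-1a over UTF-16 code units stands in here purely for illustration; `fnv1a`, `ChangeSet`, and `detectChanges` are hypothetical names.

```typescript
// Stand-in hash (FNV-1a, 32-bit) over UTF-16 code units; the real algorithm is not documented here.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

interface ChangeSet {
  changed: string[]; // new files or files whose content hash differs
  removed: string[]; // files present in the previous index but gone now
}

// Compares stored hashes (path -> hash) against current contents (path -> text).
function detectChanges(previousHashes: Map<string, string>, currentContents: Map<string, string>): ChangeSet {
  const changed: string[] = [];
  currentContents.forEach((content, path) => {
    if (previousHashes.get(path) !== fnv1a(content)) {
      changed.push(path);
    }
  });
  const removed: string[] = [];
  previousHashes.forEach((_hash, path) => {
    if (!currentContents.has(path)) {
      removed.push(path);
    }
  });
  return { changed, removed };
}
```

Only the paths in `changed` would need to be re-parsed, and symbols belonging to paths in `removed` would need to be dropped from the index.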