Knowledge System (Codebase Indexing)
The Knowledge System provides semantic code understanding through tree-sitter-based symbol extraction and Redis-backed search.
Overview
┌─────────────────┐     ┌──────────────────┐     ┌─────────────┐
│    Renderer     │────▶│   IPC Handlers   │────▶│  Indexing   │
│  (Knowledge UI) │     │   indexing-*     │     │  Service    │
└─────────────────┘     └──────────────────┘     └──────┬──────┘
                                                        │
                        ┌───────────────────────────────┘
                        ▼
              ┌───────────────────┐     ┌─────────────┐
              │  ml-directory-    │────▶│   Redis     │
              │  semantic         │     │  (41224)    │
              └───────────────────┘     └─────────────┘
Architecture
Core Components
| Component | Location | Purpose |
|---|---|---|
| `IndexingService` | `src/main/services/indexing/indexing-service.ts` | Electron wrapper with IPC progress |
| `@transquinnftw/ml-directory-semantic` | npm package | Tree-sitter parsing, symbol extraction |
| `indexing-handlers.ts` | `src/main/ipc/` | IPC bridge to renderer |
| `IndexingIndicator` | `src/renderer/components/Indexing/` | Progress UI |
Data Storage
Symbols and indices are stored in Redis. A local instance is used by default, and an external instance can be configured instead:
- Default: Local Redis at port 41224
- External: Configure via `settings.knowledge.externalRedisUrl`
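The fallback between the two options can be sketched as follows. This is a minimal illustration, not the service's actual code: the `KnowledgeSettings` shape, the `resolveRedisUrl` helper, and the `127.0.0.1` host for the local instance are assumptions; only the port and the setting key come from this document.

```typescript
// Hypothetical settings shape: only the knowledge.externalRedisUrl key path is documented.
interface KnowledgeSettings {
  knowledge?: { externalRedisUrl?: string };
}

const DEFAULT_REDIS_PORT = 41224;

// Prefer the external URL when one is configured, otherwise use the local default.
// Assumption: the local Redis instance listens on 127.0.0.1.
function resolveRedisUrl(settings: KnowledgeSettings): string {
  const external = settings.knowledge?.externalRedisUrl?.trim();
  if (external) {
    return external;
  }
  return `redis://127.0.0.1:${DEFAULT_REDIS_PORT}`;
}
```

Trimming the configured value means a whitespace-only setting falls back to the local instance rather than producing an invalid URL.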
Symbol Types
The system extracts these symbol kinds:
| Kind | Description | Example |
|---|---|---|
| `function` | Function declarations | function handleClick() |
| `class` | Class definitions | class UserService |
| `interface` | TypeScript interfaces | interface User |
| `type` | Type aliases | type UserId = string |
| `variable` | Variable declarations | const API_URL = ... |
| `constant` | Constant values | const MAX_RETRIES = 3 |
| `method` | Class methods | class.doSomething() |
| `property` | Class properties | class.name |
| `enum` | Enumerations | enum Status |
| `module` | Module declarations | ES modules |
| `namespace` | Namespace blocks | namespace Utils |
| `import` | Import statements | import { x } from 'y' |
| `export` | Export statements | export { x } |
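The kinds in the table map naturally onto a string-literal union. The sketch below is one way to model them on the renderer side; the `SYMBOL_KINDS` constant and the `isSymbolKind` guard are illustrative names, and the package's real `SymbolKind` definition may differ.

```typescript
// The thirteen kinds listed in the table above, in the same order.
const SYMBOL_KINDS = [
  "function", "class", "interface", "type", "variable", "constant",
  "method", "property", "enum", "module", "namespace", "import", "export",
] as const;

// Union type derived from the list, so the two cannot drift apart.
type SymbolKind = (typeof SYMBOL_KINDS)[number];

// Runtime guard for untrusted input, for example a kind filter typed by a user.
function isSymbolKind(value: string): value is SymbolKind {
  return (SYMBOL_KINDS as readonly string[]).indexOf(value) !== -1;
}
```

Deriving the type from the array keeps the runtime check and the compile-time type in sync when a kind is added.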
Supported Languages
| Language | Extensions | Parser |
|---|---|---|
| TypeScript | .ts, .tsx | `tree-sitter-typescript` |
| JavaScript | .js, .jsx | `tree-sitter-javascript` |
| Python | .py | `tree-sitter-python` |
| Go | .go | `tree-sitter-go` |
| Rust | .rs | `tree-sitter-rust` |
| Markdown | .md | `tree-sitter-markdown` |
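As an illustration of how the extension column could drive parser selection, here is a small lookup sketch. The `languageForFile` helper and the lowercase language identifiers for Go, Rust, and Markdown are assumptions; this document only shows `typescript`, `javascript`, and `python` as identifiers.

```typescript
// Assumed identifiers; only typescript, javascript and python appear elsewhere in this document.
type IndexedLanguage = "typescript" | "javascript" | "python" | "go" | "rust" | "markdown";

const EXTENSION_TO_LANGUAGE: Record<string, IndexedLanguage> = {
  ".ts": "typescript", ".tsx": "typescript",
  ".js": "javascript", ".jsx": "javascript",
  ".py": "python",
  ".go": "go",
  ".rs": "rust",
  ".md": "markdown",
};

// Returns the language for a path, or undefined when the extension is not supported.
function languageForFile(filePath: string): IndexedLanguage | undefined {
  const fileName = filePath.slice(filePath.lastIndexOf("/") + 1);
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) {
    return undefined; // no extension, or a dotfile such as ".gitignore"
  }
  return EXTENSION_TO_LANGUAGE[fileName.slice(dot).toLowerCase()];
}
```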
IPC API
Start Indexing (Synchronous)
// Incremental (default) - only changed files
const stats = await window.electronAPI.invoke('indexing:start', workdir);
// Force full re-index
const fullStats = await window.electronAPI.invoke('indexing:start', workdir, { force: true });
Queue Indexing (Background)
// Non-blocking, returns job ID
const { success, jobId } = await window.electronAPI.invoke('indexing:queue', workdir);
Search Symbols
const results = await window.electronAPI.invoke('indexing:search', workdir, 'UserService', {
  kinds: ['class', 'interface'],
  languages: ['typescript'],
  limit: 20,
  includeContext: true,
  contextLines: 3,
});
Get Symbol Details
// By ID
const symbol = await window.electronAPI.invoke('indexing:get-symbol', workdir, symbolId);
// All symbols in file
const symbols = await window.electronAPI.invoke('indexing:get-file-symbols', workdir, 'src/user.ts');
Check Status
const isIndexed = await window.electronAPI.invoke('indexing:is-indexed', workdir);
const isIndexing = await window.electronAPI.invoke('indexing:is-indexing');
const stats = await window.electronAPI.invoke('indexing:get-stats', workdir);
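The status calls combine naturally with `indexing:queue` into an "index only if needed" helper. The sketch below is one possible arrangement, not part of the documented API: `ensureIndexed`, `decideIndexAction`, and the `Invoke` type are hypothetical, and the boolean return values of the status channels are assumed from their names.

```typescript
// Structural type for the renderer bridge; the real window.electronAPI typing is an assumption.
type Invoke = (channel: string, ...args: unknown[]) => Promise<unknown>;

type IndexAction = "ready" | "wait" | "queue";

// Pure decision step, kept separate so it can be tested without Electron.
function decideIndexAction(isIndexed: boolean, isIndexing: boolean): IndexAction {
  if (isIndexing) return "wait";  // a run is already in flight
  if (isIndexed) return "ready";  // an index exists; nothing to do
  return "queue";                 // no index yet: schedule a background run
}

// Checks status over IPC and queues a background run only when one is needed.
async function ensureIndexed(invoke: Invoke, workdir: string): Promise<IndexAction> {
  const isIndexed = Boolean(await invoke("indexing:is-indexed", workdir));
  const isIndexing = Boolean(await invoke("indexing:is-indexing"));
  const action = decideIndexAction(isIndexed, isIndexing);
  if (action === "queue") {
    await invoke("indexing:queue", workdir);
  }
  return action;
}
```

In the renderer this could be called as `ensureIndexed((channel, ...args) => window.electronAPI.invoke(channel, ...args), workdir)`; passing the bridge in as a parameter keeps the helper testable with a fake.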
Progress Events
Subscribe to real-time progress via IPC:
window.electronAPI.on('indexing:progress', (progress: IndexingProgress) => {
  console.log(`Phase: ${progress.phase}`);
  console.log(`Progress: ${progress.processedFiles}/${progress.totalFiles}`);
  console.log(`Symbols: ${progress.symbolCount}`);
});
Progress Phases
| Phase | Description |
|---|---|
| `scanning` | Discovering files to index |
| `parsing` | Extracting symbols from files |
| `indexing` | Storing symbols in Redis |
| `complete` | Indexing finished |
| `error` | Indexing failed |
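A progress indicator typically turns these events into a single status line. The following sketch shows one way to do that; the `IndexingProgress` shape is inferred from the fields read by the handler above and may omit fields the real type has, and `formatProgress` is a hypothetical helper.

```typescript
type IndexingPhase = "scanning" | "parsing" | "indexing" | "complete" | "error";

// Assumed shape: only the fields the progress handler above reads.
interface IndexingProgress {
  phase: IndexingPhase;
  processedFiles: number;
  totalFiles: number;
  symbolCount: number;
}

// Labels taken from the phase table.
const PHASE_LABELS: Record<IndexingPhase, string> = {
  scanning: "Discovering files to index",
  parsing: "Extracting symbols from files",
  indexing: "Storing symbols in Redis",
  complete: "Indexing finished",
  error: "Indexing failed",
};

function formatProgress(p: IndexingProgress): string {
  // totalFiles may still be 0 while scanning, so avoid dividing by zero.
  const percent = p.totalFiles > 0 ? Math.floor((p.processedFiles / p.totalFiles) * 100) : 0;
  return `${PHASE_LABELS[p.phase]}: ${p.processedFiles}/${p.totalFiles} files (${percent}%), ${p.symbolCount} symbols`;
}
```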
Configuration
Settings in config.yaml under indexing:
indexing:
  autoIndex: true           # Auto-index on startup
  maxFileSize: 1048576      # Skip files > 1MB
  maxFiles: 10000           # Maximum files to index
  languages:                # Languages to parse
    - typescript
    - javascript
    - python
  extractDependencies: true
  indexNodeModules: false   # Skip node_modules
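To make the effect of the size and count limits concrete, the sketch below applies them to a list of candidate files. It is an assumption-laden illustration: `applyLimits`, `FileCandidate`, and the choice to skip (rather than fail on) files beyond `maxFiles` are hypothetical, and the real service may order or count files differently.

```typescript
// Subset of the indexing settings that constrains which files are read.
interface IndexingLimits {
  maxFileSize: number;       // bytes
  maxFiles: number;
  indexNodeModules: boolean;
}

// Hypothetical scan result for one file.
interface FileCandidate {
  path: string;
  sizeBytes: number;
}

function applyLimits(
  files: FileCandidate[],
  limits: IndexingLimits,
): { accepted: FileCandidate[]; skipped: FileCandidate[] } {
  const accepted: FileCandidate[] = [];
  const skipped: FileCandidate[] = [];
  for (const file of files) {
    const inNodeModules = file.path.split("/").indexOf("node_modules") !== -1;
    const tooLarge = file.sizeBytes > limits.maxFileSize;
    const overBudget = accepted.length >= limits.maxFiles;
    if ((inNodeModules && !limits.indexNodeModules) || tooLarge || overBudget) {
      skipped.push(file);
    } else {
      accepted.push(file);
    }
  }
  return { accepted, skipped };
}
```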
Ignore Patterns
Default patterns (gitignore-style):
**/node_modules/**
**/.git/**
**/dist/**
**/build/**
**/coverage/**
**/*.min.js
**/package-lock.json
**/pnpm-lock.yaml
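The sketch below shows how patterns of this shape can be matched against relative paths. It is a deliberately simplified matcher that handles only `**` and `*`, not full gitignore semantics (no negation, `?`, or character classes), and it is not the matcher the indexer actually uses; `globToRegExp` and `isIgnored` are hypothetical names.

```typescript
const DEFAULT_IGNORE_PATTERNS = [
  "**/node_modules/**", "**/.git/**", "**/dist/**", "**/build/**",
  "**/coverage/**", "**/*.min.js", "**/package-lock.json", "**/pnpm-lock.yaml",
];

// Translate a glob into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  let source = "";
  let i = 0;
  while (i < glob.length) {
    if (glob.slice(i, i + 3) === "**/") {
      source += "(?:.*/)?"; // zero or more leading directories
      i += 3;
    } else if (glob.slice(i, i + 2) === "**") {
      source += ".*";       // anything, including slashes
      i += 2;
    } else if (glob.charAt(i) === "*") {
      source += "[^/]*";    // anything within a single path segment
      i += 1;
    } else {
      // Escape regex metacharacters so "." and friends match literally.
      source += glob.charAt(i).replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp("^" + source + "$");
}

// True when a workdir-relative path (forward slashes) matches any pattern.
function isIgnored(relativePath: string, patterns: string[] = DEFAULT_IGNORE_PATTERNS): boolean {
  return patterns.some((pattern) => globToRegExp(pattern).test(relativePath));
}
```

Compiling the patterns once and caching the resulting expressions would be the obvious next step for a large tree; the version above recompiles on every call to keep the example short.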
Types Reference
CodeSymbol
interface CodeSymbol {
  id: string;              // Unique hash
  name: string;            // Symbol name
  kind: SymbolKind;        // function, class, etc.
  filePath: string;        // Relative to workdir
  startLine: number;       // 1-indexed
  endLine: number;         // 1-indexed
  language: SupportedLanguage;
  parentId?: string;       // For nested symbols
  documentation?: string;  // JSDoc/docstring
  signature?: string;      // Function signature
  isExported?: boolean;
}
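Because nesting is expressed only through `parentId`, a UI that wants an outline view has to rebuild the hierarchy itself. The sketch below groups a flat symbol list into roots and children; `SymbolNode` is a trimmed stand-in for `CodeSymbol`, and `groupByParent` is a hypothetical helper.

```typescript
// Trimmed stand-in for CodeSymbol with only the fields this sketch needs.
interface SymbolNode {
  id: string;
  name: string;
  parentId?: string;
}

function groupByParent<T extends SymbolNode>(
  symbols: T[],
): { roots: T[]; childrenOf: Map<string, T[]> } {
  const knownIds = new Set<string>();
  symbols.forEach((symbol) => knownIds.add(symbol.id));

  const roots: T[] = [];
  const childrenOf = new Map<string, T[]>();
  for (const symbol of symbols) {
    if (symbol.parentId !== undefined && knownIds.has(symbol.parentId)) {
      const siblings = childrenOf.get(symbol.parentId) || [];
      siblings.push(symbol);
      childrenOf.set(symbol.parentId, siblings);
    } else {
      // Top-level symbol, or its parent is missing from this result set.
      roots.push(symbol);
    }
  }
  return { roots, childrenOf };
}
```

Treating symbols with an unknown parent as roots keeps partial result sets (for example, a filtered search) displayable.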
IndexingStats
interface IndexingStats {
  workdir: string;
  filesIndexed: number;
  filesSkipped: number;
  symbolsExtracted: number;
  dependenciesFound: number;
  durationMs: number;
  byLanguage: Record<Language, LanguageStats>;
  indexedAt: number;       // Timestamp
}
SymbolSearchOptions
interface SymbolSearchOptions {
  limit?: number;          // Max results (default: 50)
  kinds?: SymbolKind[];    // Filter by kind
  languages?: Language[];  // Filter by language
  pathPattern?: string;    // File path glob
  includeDocumentation?: boolean;
  includeContext?: boolean;
  contextLines?: number;   // Lines before/after
}
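To illustrate how the `kinds`, `languages`, and `limit` options narrow a result set, here is an in-memory sketch. It uses a plain case-insensitive substring match as a stand-in for the Redis-backed search, so it shows the option semantics only; `filterSymbols`, `SymbolLike`, and the treatment of an empty filter array as "no filter" are assumptions.

```typescript
// Simplified stand-ins for CodeSymbol and SymbolSearchOptions.
interface SymbolLike {
  name: string;
  kind: string;
  language: string;
}

interface FilterOptions {
  limit?: number;
  kinds?: string[];
  languages?: string[];
}

// An absent or empty filter allows every value (an assumption about the real behaviour).
function allows(filter: string[] | undefined, value: string): boolean {
  return !filter || filter.length === 0 || filter.indexOf(value) !== -1;
}

function filterSymbols(symbols: SymbolLike[], query: string, options: FilterOptions = {}): SymbolLike[] {
  const limit = options.limit ?? 50; // documented default
  const needle = query.toLowerCase();
  return symbols
    .filter((symbol) => symbol.name.toLowerCase().indexOf(needle) !== -1)
    .filter((symbol) => allows(options.kinds, symbol.kind))
    .filter((symbol) => allows(options.languages, symbol.language))
    .slice(0, limit);
}
```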
Integration with Chat
The Knowledge System enhances chat by providing:
- Symbol Context: Agents can search indexed code to understand the codebase
- File Discovery: Find relevant files for a given query
- Dependency Mapping: Understand how code is connected
Performance
| Metric | Typical Value |
|---|---|
| Indexing speed | ~1000 files/sec |
| Symbol search | <50ms |
| Memory usage | ~100MB for 10k files |
Incremental Indexing
By default, re-indexing compares each file's content hash with the hash recorded at the previous run and processes only the files whose hash has changed. This keeps subsequent indexing runs fast (under 1 second for small changes).
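The hash comparison can be sketched as follows. This is an illustration of the general technique, not the indexer's implementation: the 32-bit FNV-1a variant used here is a stand-in for whatever hash the package actually uses, and `planIncremental` and its handling of removed files are assumptions.

```typescript
// FNV-1a (32-bit) over UTF-16 code units; a stand-in for the real content hash.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

// previous: filePath -> hash from the last run; files: filePath -> current content.
function planIncremental(
  previous: Map<string, string>,
  files: Map<string, string>,
): { changed: string[]; removed: string[]; next: Map<string, string> } {
  const changed: string[] = [];
  const next = new Map<string, string>();
  files.forEach((content, path) => {
    const hash = fnv1a(content);
    next.set(path, hash);
    if (previous.get(path) !== hash) {
      changed.push(path); // new file, or content differs from the last run
    }
  });
  const removed: string[] = [];
  previous.forEach((_hash, path) => {
    if (!files.has(path)) {
      removed.push(path); // indexed before, no longer present
    }
  });
  return { changed, removed, next };
}
```

Only the paths in `changed` would need re-parsing, and `next` becomes the `previous` map for the following run.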