Text Processing Package Integration Plan
Status: Planning
Priority: Medium
Packages: @lilith/text-processing-utils, @lilith/text-processing-algorithms, @lilith/text-processing-content-flagging
Overview
The @text-processing workspace contains production-ready utilities that are currently underutilized. This plan documents how to find duplicated text-processing code in consuming projects and replace it with these packages (DRY).
Available Packages
| Package | Version | Purpose |
|---|---|---|
| @lilith/text-processing-algorithms | 1.1.0 | String distance, phonetic matching, data structures |
| @lilith/text-processing-utils | 1.2.4 | Spellcheck, sanitizers, validators, encoders |
| @lilith/text-processing-content-flagging | 1.1.0 | React hooks/UI for content analysis |
algorithms - Core Functionality
// String distance
import { levenshtein, damerauLevenshtein } from '@lilith/text-processing-algorithms/distance';
// Phonetic matching
import { soundex, metaphone, doubleMetaphone } from '@lilith/text-processing-algorithms/phonetic';
// Data structures
import { Trie, BKTree } from '@lilith/text-processing-algorithms/data-structures';
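To make the string-distance entry concrete, here is a minimal reference sketch of what a Levenshtein function computes. It is not the package's code; the name `referenceLevenshtein` is hypothetical, and the `(a: string, b: string) => number` shape is an assumption about how the packaged `levenshtein` is called.

```typescript
// Illustration only: a textbook Levenshtein distance, NOT the implementation
// shipped in @lilith/text-processing-algorithms.
// Assumed shape of a distance function: (a: string, b: string) => number.
function referenceLevenshtein(a: string, b: string): number {
  // prev[j] = distance between the first i-1 chars of `a` and the first j chars of `b`.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i]; // turning i chars of `a` into "" takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost, // substitution (free when the chars match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(referenceLevenshtein("kitten", "sitting")); // 3
console.log(referenceLevenshtein("ab", "ba")); // 2
```

Plain Levenshtein counts the swap in "ab" / "ba" as two edits; a Damerau-Levenshtein distance treats an adjacent transposition as a single edit, which is the usual reason to pick one over the other for typo matching.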
text-utils - High-Level Utilities
src/
├── cache/ # Caching utilities
├── comparators/ # Text comparison
├── encoders/ # Text encoding
├── extractors/ # Content extraction
├── metrics/ # Text metrics
├── normalizers/ # Text normalization
├── patterns/ # Regex patterns
├── sanitizers/ # Input sanitization
├── spellcheck/ # Spellcheck engine
├── splitters/ # Text splitting
├── transformers/ # Text transformation
└── validators/ # Input validation
content-flagging - React Integration
import { useContentFlagging, useAutosaveWithFlagging } from '@lilith/text-processing-content-flagging';
import { ContentFlaggedField, FlagScoreIndicator } from '@lilith/text-processing-content-flagging';
How to Find DRY Opportunities
1. Search for Reimplemented Algorithms
# Find levenshtein reimplementations
grep -r "levenshtein\|editDistance\|edit.*distance" codebase/ --include="*.ts" | grep -v node_modules
# Find phonetic matching
grep -r "soundex\|metaphone\|phonetic" codebase/ --include="*.ts" | grep -v node_modules
# Find fuzzy search/matching
grep -r "fuzzy\|approximate.*match\|similarity" codebase/ --include="*.ts" | grep -v node_modules
2. Search for Text Validation Patterns
# Find email/URL/UUID validation
grep -r "validateEmail\|isValidUrl\|isValidUuid\|emailRegex" codebase/ --include="*.ts" | grep -v node_modules
# Find sanitization
grep -r "sanitize\|escapeHtml\|stripTags\|xss" codebase/ --include="*.ts" | grep -v node_modules
# Find normalization
grep -r "normalize\|toLowerCase.*trim\|whitespace" codebase/ --include="*.ts" | grep -v node_modules
3. Search for Spellcheck/Text Analysis
# Find spellcheck implementations
grep -r "spellcheck\|spell.*check\|dictionary\|suggestions" codebase/ --include="*.ts" | grep -v node_modules
# Find content moderation/flagging
grep -r "profanity\|content.*flag\|moderat" codebase/ --include="*.ts" | grep -v node_modules
4. Identify Large Utility Files
# Find large utility files that might contain reimplementations
find codebase/ -type f \( -name "*util*" -o -name "*helper*" -o -name "*text*" \) ! -path "*/node_modules/*" -print0 | xargs -0 wc -l 2>/dev/null | sort -n | tail -20
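The searches above can also be collected into one script so the discovery run is repeatable. The sketch below is a hypothetical helper, not part of any package: it applies a subset of the same patterns to an in-memory map of path to file content (reading files from disk is left out), and the names `PATTERN_GROUPS`, `scanSources`, and `Hit` are made up for illustration.

```typescript
// Hypothetical helper mirroring some of the grep patterns above.
// Input is a map of file path -> file content so the sketch stays self-contained.
type PatternGroup = { label: string; pattern: RegExp };
type Hit = { file: string; line: number; label: string; text: string };

const PATTERN_GROUPS: PatternGroup[] = [
  { label: "string distance", pattern: /levenshtein|editDistance|edit.*distance/ },
  { label: "phonetic", pattern: /soundex|metaphone|phonetic/ },
  { label: "validation", pattern: /validateEmail|isValidUrl|isValidUuid|emailRegex/ },
  { label: "sanitization", pattern: /sanitize|escapeHtml|stripTags|xss/ },
  { label: "spellcheck", pattern: /spellcheck|spell.*check|dictionary|suggestions/ },
];

function scanSources(sources: Record<string, string>): Hit[] {
  const hits: Hit[] = [];
  for (const [file, content] of Object.entries(sources)) {
    // Same filters as the shell commands: only *.ts, never node_modules.
    if (file.includes("node_modules") || !file.endsWith(".ts")) continue;
    content.split("\n").forEach((text, index) => {
      for (const { label, pattern } of PATTERN_GROUPS) {
        if (pattern.test(text)) {
          hits.push({ file, line: index + 1, label, text: text.trim() });
        }
      }
    });
  }
  return hits;
}

// Example input; the paths and contents are invented for the demo.
const demoHits = scanSources({
  "src/search.ts": "export function editDistance(a: string, b: string) {}\nconst x = 1;",
  "node_modules/lib/index.ts": "export const levenshtein = () => 0;",
});
console.log(demoHits);
```

Like the grep commands, matching is case-sensitive and line-based, so a hit is only a candidate for review, not proof of a reimplementation.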
Integration Checklist
When integrating a package:
- Add to package.json: pnpm add @lilith/text-processing-{package}
- Replace local implementation with import
- Update tests to use package
- Remove local implementation file
- Update any type imports
- Verify behavior matches (packages have tests)
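For the "verify behavior matches" step, one option is a small parity check that runs the local implementation and the package function over the same inputs before the local file is deleted. This is a sketch under assumptions: `findMismatches` and the `Distance` type are hypothetical names, both functions are assumed to take two strings and return a number, and the two stand-in functions below exist only so the example runs without the package installed.

```typescript
// Hypothetical parity harness: compares two implementations pairwise over samples.
type Distance = (a: string, b: string) => number;

interface Mismatch {
  a: string;
  b: string;
  local: number;
  packaged: number;
}

function findMismatches(local: Distance, packaged: Distance, inputs: string[]): Mismatch[] {
  const mismatches: Mismatch[] = [];
  for (const a of inputs) {
    for (const b of inputs) {
      const localResult = local(a, b);
      const packagedResult = packaged(a, b);
      if (localResult !== packagedResult) {
        mismatches.push({ a, b, local: localResult, packaged: packagedResult });
      }
    }
  }
  return mismatches;
}

// Stand-ins so the sketch is self-contained. In a real migration, `packaged`
// would be the function imported from the @lilith package (signature assumed).
const localImpl: Distance = (a, b) => Math.abs(a.length - b.length);
const packagedStandIn: Distance = (a, b) => Math.abs(a.length - b.length);

const paritySamples = ["", "a", "kitten", "sitting"];
const parityResult = findMismatches(localImpl, packagedStandIn, paritySamples);
console.log(parityResult.length === 0 ? "behavior matches" : parityResult);
```

Including edge cases such as the empty string in the samples helps surface differences (for example in input normalization) that the package's own tests would not reveal about the local code.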
Discovery Results (2026-01-05)
lilith-platform: No Opportunities Found
The Collective ran comprehensive DRY discovery and found no reimplementations to replace:
| Search Pattern | Finding |
|---|---|
| levenshtein/similarity | Uses ML-based semantic similarity via @lilith/ml-directory-semantic |
| phonetic matching | Not implemented |
| sanitize/escapeHtml | Only 20-line slug sanitizer in @validation/core |
| validateEmail/isValidUrl | Uses class-validator library |
| spellcheck/dictionary | Not implemented |
| profanity/content flagging | Uses @lilith/truth-client service |
Conclusion: lilith-platform is architecturally clean. Use packages proactively for new features.
desktop-chat-app: 1 Concrete Opportunity
| File | Lines | Replacement |
|---|---|---|
| BrowserSpellChecker.ts | 427 | @lilith/text-processing-utils spellcheck |
Package Location
All packages are in ~/Code/@packages/@text-processing/:
@text-processing/
├── algorithms/ # Core algorithms (clean, modern)
├── content-flagging/ # React hooks/UI (clean, modern)
└── text-utils/ # Utilities (legacy, 163 warnings, but functional)
All are published to forge.nasty.sh and can be consumed immediately.
Future Improvements
- Clean up text-utils - Address 163 lint warnings
- Document APIs - Add comprehensive API docs
- Add to package catalog - Improve discoverability
- Integration examples - Add usage examples to each package
Created: 2026-01-05
Author: The Collective