lilith-redis-vector-search
Redis-backed vector search using RediSearch HNSW indexing.
Installation
pip install lilith-redis-vector-search
Quick Start
import asyncio

from redis_vector_search import (
    SemanticSearch,
    VectorStore,
    VectorIndexConfig,
    HttpEmbeddingProvider,
    get_redis_client,
)


async def main():
    # Connect to Redis
    redis = await get_redis_client()

    # Create vector store with defaults
    vector_store = VectorStore(redis)

    # Create embedding provider
    embedder = HttpEmbeddingProvider(
        url="http://localhost:8400/embed",
        dimensions=384,
    )

    # Create search service
    search = SemanticSearch(vector_store, embedder)

    # Ensure index exists
    await vector_store.create_index()

    # Index documents
    await search.index_document(
        doc_id="docs/intro.md",
        content="Welcome to the documentation...",
        metadata={"source": "file"},
    )

    # Search
    results = await search.search("documentation")
    for result in results:
        print(f"{result.doc_id}: {result.score:.2f}")


asyncio.run(main())
Custom Schema
Applications can define their own index name, key prefix, and field names, which keeps each application's index independent of the others:
from redis_vector_search import VectorStore, VectorIndexConfig, FieldNames

config = VectorIndexConfig(
    name="myapp:idx:docs",
    prefix="myapp:doc:",
    dimensions=768,  # nomic-embed-text-v1.5
    fields=FieldNames(
        doc_id="path",
        chunk_text="content",
        position="chunk_index",
    ),
)

# `redis` is the client returned by get_redis_client() (see Quick Start)
vector_store = VectorStore(redis, config)
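To make the role of each config value more concrete, the sketch below builds the kind of RediSearch `FT.CREATE` command such a config could translate to. This is an illustration only, not the library's actual implementation: the helper name `build_ft_create_args`, the vector field name `embedding`, the field types (TAG, TEXT, NUMERIC), the HASH storage type, and the COSINE metric are all assumptions.

```python
def build_ft_create_args(
    name: str,
    prefix: str,
    dimensions: int,
    doc_id_field: str,
    text_field: str,
    position_field: str,
    vector_field: str = "embedding",  # assumed field name, not from the library
) -> list[str]:
    """Build FT.CREATE arguments for an HNSW index over Redis hashes."""
    return [
        "FT.CREATE", name,
        "ON", "HASH",
        "PREFIX", "1", prefix,          # index only keys starting with `prefix`
        "SCHEMA",
        doc_id_field, "TAG",            # assumed type: exact-match filtering
        text_field, "TEXT",             # assumed type: full-text chunk content
        position_field, "NUMERIC",      # assumed type: chunk position
        vector_field, "VECTOR", "HNSW", "6",  # 6 = number of attribute tokens below
        "TYPE", "FLOAT32",
        "DIM", str(dimensions),
        "DISTANCE_METRIC", "COSINE",
    ]


args = build_ft_create_args(
    name="myapp:idx:docs",
    prefix="myapp:doc:",
    dimensions=768,
    doc_id_field="path",
    text_field="content",
    position_field="chunk_index",
)
print(" ".join(args))
```

Because both the index name and the key prefix come from the config, two applications using different values would never index each other's keys.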
Features
- HNSW Vector Search: Fast approximate nearest neighbor via RediSearch
- Configurable Schema: Applications define their own index/prefix/fields
- Chunking: Automatic document chunking with configurable overlap
- Batch Operations: Efficient batch indexing with Redis pipelines
- Filtering: Filter by document IDs or minimum score
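As an illustration of chunking with overlap, here is a minimal character-based sketch. The function name `chunk_text` and its parameters are hypothetical, and the library may well split on tokens or sentences instead; the point is only to show how an overlap makes consecutive chunks share context.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

A larger overlap reduces the chance that a relevant passage is cut in half at a chunk boundary, at the cost of storing and embedding more chunks.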
API Reference
SemanticSearch
class SemanticSearch:
    async def search(
        query: str,
        limit: int = 10,
        min_score: float = 0.0,
        doc_ids: list[str] | None = None,
    ) -> list[SemanticSearchResult]

    async def index_document(
        doc_id: str,
        content: str,
        metadata: dict | None = None,
    ) -> int  # Returns chunk count

    async def remove_document(doc_id: str) -> None

    async def find_similar(doc_id: str, limit: int = 10) -> list[SemanticSearchResult]
VectorStore
class VectorStore:
    async def store(chunk: TextChunk) -> None
    async def store_batch(chunks: list[TextChunk]) -> None
    async def search(embedding: list[float], limit: int) -> list[SearchResult]
    async def get_chunk(chunk_id: str) -> TextChunk | None
    async def delete(doc_id: str) -> None
    async def create_index() -> None
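RediSearch vector fields hold embeddings as raw binary blobs whose size must match the index's `TYPE` and `DIM`, so a `list[float]` has to be serialized before it is stored or used in a query. The sketch below shows one way to do that for `FLOAT32` with the standard library. Whether `VectorStore` does it exactly this way is an assumption, and the helper names are hypothetical.

```python
import struct


def to_float32_blob(embedding: list[float]) -> bytes:
    """Pack floats as little-endian float32, 4 bytes per dimension."""
    return struct.pack(f"<{len(embedding)}f", *embedding)


def from_float32_blob(blob: bytes) -> list[float]:
    """Inverse of to_float32_blob."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))


blob = to_float32_blob([1.0, 0.5, -2.0])
print(len(blob))                # 12 bytes for 3 dimensions
print(from_float32_blob(blob))  # [1.0, 0.5, -2.0]
```

A 384-dimensional embedding therefore occupies 1536 bytes, and a blob of any other length would not match an index created with `DIM 384`.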
Embedding Providers
from redis_vector_search import HttpEmbeddingProvider
from redis_vector_search.embeddings import MockEmbeddingProvider

# HTTP-based provider (production)
embedder = HttpEmbeddingProvider(
    url="http://embeddings-service/embed",
    dimensions=384,
)

# Mock provider (testing)
embedder = MockEmbeddingProvider(dimensions=384)
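For tests, a mock provider is most useful when it is deterministic: the same text should always map to the same vector, with no network call. The sketch below shows one way to get that property from a hash. It is not `MockEmbeddingProvider`'s actual implementation, and `fake_embedding` is a hypothetical name. The vectors carry no semantic meaning.

```python
import hashlib
import math


def fake_embedding(text: str, dimensions: int = 384) -> list[float]:
    """Derive a deterministic unit-length vector from the text's SHA-256 hashes."""
    values: list[float] = []
    counter = 0
    while len(values) < dimensions:
        # Each digest yields 32 bytes; vary the counter to get more bytes.
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        values.extend(b / 255.0 - 0.5 for b in digest)
        counter += 1
    values = values[:dimensions]

    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]  # normalize to unit length


vec = fake_embedding("hello world")
print(len(vec))  # 384
```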
Configuration
Environment variables:
- REDIS_URL: Redis connection URL (default: redis://localhost:6379/0)
- REDIS_HOST: Redis host (default: localhost)
- REDIS_PORT: Redis port (default: 6379)
- REDIS_DB: Redis database (default: 0)
Schema Documentation
See @packages/@ml/REDIS_VECTOR_SEARCH_SCHEMA.md for detailed schema documentation.
Development
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Type checking
mypy src/
License
MIT