ml-memory-store
Redis VSS-backed conversation memory with embeddings for RAG applications.
Overview
ml-memory-store provides semantic search over conversation memories, enabling AI agents to recall relevant past conversations for context-aware responses. It uses Redis Vector Similarity Search (VSS) with nomic-embed embeddings (768 dimensions) via llama.cpp.
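To illustrate what "vector similarity search" means here, the sketch below ranks a few toy vectors by cosine similarity against a query vector. It is a pure-Python illustration of the idea only: the helper names `cosine_similarity` and `top_k` are made up for this example, the 3-dimensional vectors stand in for 768-dimensional embeddings, and the package itself delegates this work to Redis rather than computing it in Python.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    candidates: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k candidate labels most similar to the query, best first."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in candidates]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors standing in for 768-dimensional embeddings.
memories = [
    ("weekend plans", [0.9, 0.1, 0.0]),
    ("work deadline", [0.0, 0.2, 0.9]),
    ("hiking trip", [0.7, 0.6, 0.1]),
]
ranked = top_k([1.0, 0.2, 0.0], memories, k=2)
```

A brute-force scan like this is linear in the number of memories; an HNSW index trades exactness for much faster approximate lookups on large collections.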
Features
- Semantic Memory Retrieval: Find relevant past conversations using vector similarity search
- Per-User/Contact Isolation: Filter memories by user and contact pairs
- Conversation Summaries: Store concise summaries with full message context
- GPU Acceleration: GGUF embeddings via llama.cpp with optional GPU layers
- Model Loader Integration: Works with lilith-model-loader for model management
Installation
# Basic installation
pip install lilith-ml-memory-store
# With GGUF embedding support (quotes keep shells such as zsh from expanding the brackets)
pip install "lilith-ml-memory-store[gguf]"
# With model-loader integration
pip install "lilith-ml-memory-store[ml]"
# All dependencies
pip install "lilith-ml-memory-store[all]"
Requirements
- Python 3.10+
- Redis 7+ with RediSearch module (for VSS)
- nomic-embed-text GGUF model (768 dimensions)
Quick Start
import asyncio

from ml_memory_store import MemoryStore, Message

async def main():
    # Initialize store
    store = MemoryStore(
        redis_url="redis://localhost:6379",
        embedding_model="nomic-embed-text-v1.5",
    )
    await store.initialize()
    # Store a conversation memory
    await store.store_memory(
        user_id="user-123",
        contact_id="contact-456",
        summary="Discussed weekend plans, user mentioned being busy Saturday",
        messages=[
            Message(role="user", content="What are you doing this weekend?"),
            Message(role="assistant", content="I'm free on Sunday! What did you have in mind?"),
            Message(role="user", content="Maybe we could go hiking?"),
        ],
    )
    # Retrieve relevant memories
    result = await store.recall(
        user_id="user-123",
        contact_id="contact-456",
        query="weekend availability",
        top_k=3,
    )
    for memory in result.memories:
        print(f"[{memory.similarity_score:.2f}] {memory.summary}")
    await store.close()

    # Or use context manager
    async with MemoryStore() as store:
        result = await store.recall(
            user_id="user-123",
            contact_id="contact-456",
            query="hiking plans",
        )

# Run the example
asyncio.run(main())
API Reference
MemoryStore
Main class for storing and retrieving conversation memories.
Constructor
MemoryStore(
    redis_url: str = "redis://localhost:6379",
    embedding_model: str = "nomic-embed-text-v1.5",
    config: Optional[MemoryStoreConfig] = None,
    embedder: Optional[BaseEmbedder] = None,
)
Methods
- await store.initialize() - Connect to Redis and load embedding model
- await store.close() - Close connections and free resources
- await store.store_memory(...) - Store a new memory with embedding
- await store.recall(...) - Retrieve relevant memories via semantic search
- await store.get_memory(memory_id) - Get a specific memory by ID
- await store.delete_memory(memory_id) - Delete a specific memory
- await store.delete_contact_memories(user_id, contact_id) - Delete all memories for a pair
- await store.get_stats() - Get memory statistics
- await store.search_text(...) - Full-text search (non-semantic)
Memory Model
@dataclass
class Memory:
    id: str
    user_id: str
    contact_id: str
    summary: str
    full_text: str
    timestamp: datetime
    messages: List[Message]
    metadata: Dict[str, Any]
    similarity_score: Optional[float]  # Only on recall results
Message Model
@dataclass
class Message:
    role: str
    content: str
    timestamp: Optional[datetime]
    metadata: Dict[str, Any]
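The Quick Start constructs Message with only role and content, so timestamp and metadata presumably have defaults that the listing above omits. The sketch below shows one way such defaults could be declared. `MessageSketch` is a hypothetical stand-in, not the package's class, and the default values are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class MessageSketch:
    """Hypothetical stand-in for Message; the defaults are assumptions."""

    role: str
    content: str
    timestamp: Optional[datetime] = None
    # default_factory gives each instance its own dict instead of a shared one
    metadata: Dict[str, Any] = field(default_factory=dict)


first = MessageSketch(role="user", content="Maybe we could go hiking?")
second = MessageSketch(role="assistant", content="Sounds good!")
first.metadata["channel"] = "sms"  # does not leak into `second`
```

Using `field(default_factory=dict)` rather than a literal `{}` matters because dataclasses reject mutable defaults, and a shared dict would let one message's metadata leak into another.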
Redis VSS Schema
The package creates a RediSearch index with the following schema:
FT.CREATE conv_memory_idx ON HASH PREFIX 1 memory:
  SCHEMA
    user_id TAG
    contact_id TAG
    summary TEXT
    full_text TEXT
    timestamp NUMERIC SORTABLE
    embedding VECTOR HNSW 10
      TYPE FLOAT32
      DIM 768
      DISTANCE_METRIC COSINE
      M 16
      EF_CONSTRUCTION 200
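A query against an index shaped like this typically combines a TAG filter on user_id and contact_id with a KNN clause on the embedding field. The sketch below builds such a query string and encodes a vector as FLOAT32 bytes. It uses standard RediSearch query syntax, but whether the package builds its queries this way is an assumption, and the helper names (`escape_tag`, `build_knn_query`, `pack_vector`, `distance_to_similarity`) are hypothetical.

```python
import re
import struct
from typing import Sequence


def escape_tag(value: str) -> str:
    """Backslash-escape every character other than letters, digits, and underscore."""
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", value)


def build_knn_query(user_id: str, contact_id: str, top_k: int) -> str:
    """Filter by both TAG fields, then take the top_k nearest embeddings."""
    tag_filter = (
        f"@user_id:{{{escape_tag(user_id)}}} "
        f"@contact_id:{{{escape_tag(contact_id)}}}"
    )
    return f"({tag_filter})=>[KNN {top_k} @embedding $vec AS distance]"


def pack_vector(vector: Sequence[float]) -> bytes:
    """Encode a vector as little-endian 32-bit floats (4 bytes per dimension)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def distance_to_similarity(distance: float) -> float:
    """The COSINE metric reports a distance, i.e. 1 - cosine similarity."""
    return 1.0 - distance


query = build_knn_query("user-123", "contact-456", top_k=3)
blob = pack_vector([0.0] * 768)  # 768 dimensions -> 3072 bytes
```

The string would be sent with something like `FT.SEARCH conv_memory_idx <query> PARAMS 2 vec <blob> DIALECT 2`; vector queries require query dialect 2 or later. Escaping matters because an unescaped hyphen in a TAG value such as `user-123` would otherwise be parsed as query syntax.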
Configuration
MemoryStoreConfig
from ml_memory_store import MemoryStoreConfig
config = MemoryStoreConfig(
    redis_url="redis://localhost:6379",
    redis_db=0,
    embedding_model="nomic-embed-text-v1.5",
    embedding_model_path=None,  # Direct path to a GGUF file (overrides embedding_model)
    index_name="conv_memory_idx",
    key_prefix="memory:",
    n_gpu_layers=-1,  # -1 = all layers on GPU
    n_threads=None,  # None = auto
    default_top_k=5,
)
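For orientation, the GPU and thread settings correspond to constructor arguments of the same names in llama-cpp-python. The sketch below shows how they might be forwarded when loading a GGUF model in embedding mode. The `llama_kwargs` helper and the model path are hypothetical, and how the package actually passes these values through is an assumption.

```python
from typing import Any, Dict, Optional


def llama_kwargs(
    model_path: str,
    n_gpu_layers: int = -1,
    n_threads: Optional[int] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for llama_cpp.Llama in embedding mode."""
    return {
        "model_path": model_path,
        "embedding": True,  # produce embeddings instead of generated text
        "n_gpu_layers": n_gpu_layers,  # -1 offloads every layer to the GPU
        "n_threads": n_threads,  # None lets llama.cpp choose a thread count
    }


# Hypothetical path; point this at wherever the GGUF file actually lives.
kwargs = llama_kwargs("/models/nomic-embed-text-v1.5.gguf")

# With llama-cpp-python installed, the model could then be loaded with:
#   from llama_cpp import Llama
#   llm = Llama(**kwargs)
#   vector = llm.embed("weekend availability")
```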
Custom Embedder
For testing or alternative embedding models:
from ml_memory_store import MemoryStore, MockEmbedder
# Use mock embedder for testing
embedder = MockEmbedder(dimensions=768)
await embedder.load()
store = MemoryStore(embedder=embedder)
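When writing a custom embedder for tests, a deterministic one keeps assertions stable across runs. The sketch below derives a unit-length vector from a SHA-256 hash of the text. `HashEmbedder` is a hypothetical standalone class: it mirrors the async load() call shown above, but the `embed` method name and the exact interface that BaseEmbedder requires are assumptions, so adapt it to the real base class before passing it to MemoryStore.

```python
import asyncio
import hashlib
import math
from typing import List


class HashEmbedder:
    """Hypothetical test embedder: the same text always maps to the same unit vector."""

    def __init__(self, dimensions: int = 768) -> None:
        self.dimensions = dimensions
        self.loaded = False

    async def load(self) -> None:
        # Nothing to load; present only to mirror an async load() step.
        self.loaded = True

    async def embed(self, text: str) -> List[float]:
        values: List[float] = []
        counter = 0
        # Hash repeatedly with a counter until every dimension has a value.
        while len(values) < self.dimensions:
            digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
            values.extend(byte / 255.0 - 0.5 for byte in digest)
            counter += 1
        values = values[: self.dimensions]
        # Normalize to unit length so cosine comparisons behave sensibly.
        norm = math.sqrt(sum(v * v for v in values)) or 1.0
        return [v / norm for v in values]


async def demo() -> List[float]:
    embedder = HashEmbedder(dimensions=768)
    await embedder.load()
    return await embedder.embed("weekend availability")


vector = asyncio.run(demo())
```

Hash-derived vectors carry no semantic meaning, so this is only useful for exercising storage and retrieval plumbing, not for checking recall quality.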
Redis Setup
Docker (with RediSearch)
docker run -d \
  --name redis-stack \
  -p 6379:6379 \
  redis/redis-stack:latest
Verify RediSearch
redis-cli INFO modules | grep search
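The same check can be scripted, for example in a startup health check. The sketch below parses the text of the modules section and looks for a module named search. It assumes the usual `module:name=...,ver=...` line format; the sample text and its version number are illustrative, not real output, and the `parse_info_modules` helper is hypothetical.

```python
from typing import Dict, List


def parse_info_modules(text: str) -> List[Dict[str, str]]:
    """Parse `module:name=...,ver=...` lines into one dictionary per module."""
    modules: List[Dict[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("module:"):
            continue  # skip the section header and blank lines
        fields: Dict[str, str] = {}
        for part in line[len("module:"):].split(","):
            key, sep, value = part.partition("=")
            if sep:  # ignore fragments without a key=value shape
                fields[key] = value
        modules.append(fields)
    return modules


# Illustrative sample only; real output depends on the Redis build.
sample = "# Modules\nmodule:name=search,ver=20810,api=1,filters=0\n"
has_search = any(m.get("name") == "search" for m in parse_info_modules(sample))
```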
Development
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run linter
ruff check src tests
# Type checking
mypy src
License
MIT