Lilith Auto-Commit Pipeline

Pipeline-based auto-commit service with RAG (Retrieval-Augmented Generation) and CoT (Chain-of-Thought) capabilities.

Overview

This package provides a clean, maintainable pipeline architecture for automated git commits with intelligent message generation:

  • RAG Integration: Retrieves project conventions and codebase context for context-aware commit messages
  • CoT Reasoning: Uses chain-of-thought reasoning to generate high-quality, convention-following messages
  • Stage-Based Architecture: 7 independent, testable stages following @imajin pipeline methodology
  • SOLID Principles: Single responsibility, dependency inversion, open/closed design

Architecture

DiscoverChanges → RetrieveContext → GroupFiles → ReasonMessage → CreateCommit → Push → [Recover]
      ↓                 ↓              ↓             ↓               ↓           ↓
  (git status)     (RAG query)   (semantic)    (CoT over       (git commit) (git push)
                                                RAG results)
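
The flow above can be sketched as a list of stages that share one context object, with the optional recovery stage invoked only when a stage raises. This is a minimal illustration, not the package's implementation: `Context`, `Stage`, `RecordingStage`, and `run_pipeline` are hypothetical names, and the real stages come from lilith-pipeline-framework.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field
from typing import Optional, Protocol


@dataclass
class Context:
    """Hypothetical shared state handed from stage to stage."""
    repo_path: str
    trace: list[str] = field(default_factory=list)
    failed: bool = False


class Stage(Protocol):
    """Assumed stage interface: a name plus an async execute()."""
    name: str

    async def execute(self, ctx: Context) -> Context: ...


@dataclass
class RecordingStage:
    """Toy stage that only records its name; a real stage would do work here."""
    name: str
    fail: bool = False

    async def execute(self, ctx: Context) -> Context:
        if self.fail:
            raise RuntimeError(f"{self.name} failed")
        ctx.trace.append(self.name)
        return ctx


async def run_pipeline(stages: list[Stage], recover: Optional[Stage], ctx: Context) -> Context:
    """Run stages in order; on the first error, call the optional recovery stage and stop."""
    for stage in stages:
        try:
            ctx = await stage.execute(ctx)
        except Exception:
            ctx.failed = True
            if recover is not None:
                ctx = await recover.execute(ctx)
            break
    return ctx


STAGE_NAMES = ["DiscoverChanges", "RetrieveContext", "GroupFiles",
               "ReasonMessage", "CreateCommit", "Push"]


def demo(fail_at: Optional[str] = None) -> Context:
    """Build the six main stages plus Recover and run them once."""
    stages = [RecordingStage(name, fail=(name == fail_at)) for name in STAGE_NAMES]
    return asyncio.run(run_pipeline(stages, RecordingStage("Recover"), Context("/path/to/repo")))
```

Calling `demo()` records all six stage names in order; `demo(fail_at="GroupFiles")` stops after the second stage and records `Recover` instead of the remaining stages.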

Pipeline Stages

  1. DiscoverChangesStage: Detect changes via git status and git diff
  2. RetrieveContextStage: RAG retrieval of conventions + codebase context
  3. GroupFilesStage: Semantic file grouping using ML
  4. ReasonCommitMessageStage: CoT reasoning for commit messages
  5. CreateCommitStage: Create git commits
  6. PushCommitStage: Push to remote with retry logic
  7. RecoverErrorStage: Error recovery (optional)
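
The retry logic mentioned for PushCommitStage could look roughly like the helper below. It is a sketch under assumptions: the attempt count, the exponential backoff, and the names `retry` and `git_push` are illustrative and not taken from the package.

```python
import subprocess
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], attempts: int = 3, base_delay: float = 1.0,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation until it succeeds or attempts run out, doubling the delay each time."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


def git_push(repo_path: str, remote: str = "origin", branch: str = "main") -> None:
    """Hypothetical push step: raises CalledProcessError if git exits non-zero."""
    subprocess.run(["git", "push", remote, branch], cwd=repo_path, check=True)
```

A push with retries would then be `retry(lambda: git_push("/path/to/repo"))`. The `sleep` parameter is injectable so the backoff can be tested without waiting.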

Installation

cd /var/home/lilith/Code/@packages/@ml/auto-commit-pipeline-py
pip install -e .

# Or with dev dependencies
pip install -e ".[dev]"

Quick Start

Basic Usage

import asyncio

from lilith_auto_commit_pipeline import (
    create_auto_commit_orchestrator,
    AutoCommitRequest,
    AutoCommitPipelineContext,
)


async def main() -> None:
    # Assuming ML provider and RAG backends are configured
    # (ml_provider, semantic_search, knowledge_graph; see "With Integration")
    orchestrator = create_auto_commit_orchestrator(
        ml_provider=ml_provider,
        semantic_search=semantic_search,
        knowledge_graph=knowledge_graph,
    )

    # Create request
    request = AutoCommitRequest(
        repo_path="/path/to/repo",
        repo_name="my-repo",
        enable_rag=True,
        enable_cot=True,
    )

    # Execute pipeline (execute() is a coroutine, so it must be awaited)
    context = AutoCommitPipelineContext(request=request)
    result = await orchestrator.execute(context)

    # Check results
    print(f"Commits: {result.commit_hashes}")
    print(f"Push success: {result.push_success}")


asyncio.run(main())

With Integration

import asyncio

from lilith_agent_ml import LlamacppMLProvider
from lilith_agent_ml_knowledge import SemanticSearch, KnowledgeGraph
from lilith_auto_commit_pipeline import (
    create_auto_commit_orchestrator,
    AutoCommitRequest,
    AutoCommitPipelineContext,
)

# Initialize ML provider (Llamacpp with Ministral-14B)
ml_provider = LlamacppMLProvider(
    model_path="path/to/ministral-14b.gguf",
    context_size=4096,
)

# Initialize RAG backends
# redis_client is assumed to be an already-connected Redis client (not shown)
semantic_search = SemanticSearch(redis_client=redis_client)
knowledge_graph = KnowledgeGraph(redis_client=redis_client)

# Create orchestrator
orchestrator = create_auto_commit_orchestrator(
    ml_provider=ml_provider,
    semantic_search=semantic_search,
    knowledge_graph=knowledge_graph,
)


async def main():
    # Execute (execute() is a coroutine, so it runs inside an event loop)
    context = AutoCommitPipelineContext(
        request=AutoCommitRequest(
            repo_path="/var/home/lilith/Code/@packages",
            repo_name="@packages",
        )
    )
    return await orchestrator.execute(context)


result = asyncio.run(main())

RAG Integration

How RAG Works

The RetrieveContextStage retrieves two types of context:

  1. Project Conventions (from semantic search):

    • Searches for COMMIT_CONVENTIONS.md, CONTRIBUTING.md
    • Uses semantic similarity to find relevant conventions
    • Returns top 5 convention documents with relevance scores

  2. Codebase Context (from knowledge graph):

    • Queries knowledge graph for related files/components
    • Provides understanding of code relationships
    • Helps determine scope and affected modules
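
The "top 5 by semantic similarity" step can be illustrated with a small ranking function. This is a sketch only: the two-dimensional vectors are made-up stand-ins for real embeddings, `cosine` and `top_k` are hypothetical helpers, and the package's SemanticSearch backend may rank documents differently.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], docs: dict[str, list[float]], k: int = 5) -> list[tuple[str, float]]:
    """Score every document against the query and keep the k highest-scoring ones."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in docs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Toy embeddings, invented for illustration only
DOCS = {
    "COMMIT_CONVENTIONS.md": [1.0, 0.0],
    "CONTRIBUTING.md": [1.0, 1.0],
    "CHANGELOG.md": [0.0, 1.0],
}

for name, score in top_k([1.0, 0.0], DOCS, k=2):
    print(f"{name} (score: {score:.2f})")
```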

Example RAG Query

Query: "commit message conventions for @packages"
Results:
  1. COMMIT_CONVENTIONS.md (score: 0.95)
  2. packages/README.md - Commit section (score: 0.82)
  3. CONTRIBUTING.md (score: 0.78)

CoT Integration

How CoT Works

The ReasonCommitMessageStage uses extended thinking to reason about commit messages:

  1. Analyze change type: feat, fix, chore, refactor, docs, test
  2. Determine scope: Component/module affected
  3. Follow conventions: Match project-specific style
  4. Choose emoji: Select appropriate emoji
  5. Write description: Concise but descriptive

Example CoT Reasoning

Thinking:
1. Changed files are in @ml/agent-ml/knowledge/src/semantic/
2. This is adding new functionality (vector search)
3. Project conventions use format: type(scope): emoji description
4. Scope is "agent-ml-knowledge"
5. This is a feat, use ✨ emoji

Final Message:
feat(agent-ml-knowledge): ✨ Add vector similarity search
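
The `type(scope): emoji description` format from the reasoning above can be built and checked with a small helper. This is a sketch, not the stage's actual code: the regular expression and the function names are assumptions, and the type list is the one given under "Analyze change type". A check like this is one way to reject model output that does not follow the format.

```python
import re
from typing import Optional

# Assumed format: type(scope): <emoji> description
# The emoji group accepts any run of non-ASCII characters.
COMMIT_PATTERN = re.compile(
    r"^(?P<type>feat|fix|chore|refactor|docs|test)"
    r"\((?P<scope>[^)]+)\): "
    r"(?P<emoji>[^\x00-\x7f]+) "
    r"(?P<description>.+)$"
)


def format_commit_message(change_type: str, scope: str, emoji: str, description: str) -> str:
    """Assemble a message in the type(scope): emoji description format."""
    return f"{change_type}({scope}): {emoji} {description}"


def parse_commit_message(message: str) -> Optional[dict]:
    """Return the message parts as a dict, or None if the format does not match."""
    match = COMMIT_PATTERN.match(message.strip())
    return match.groupdict() if match else None


message = format_commit_message("feat", "agent-ml-knowledge", "✨", "Add vector similarity search")
print(message)
print(parse_commit_message(message))
```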

Configuration

AutoCommitRequest Options

AutoCommitRequest(
    repo_path="/path/to/repo",      # Required
    repo_name="repo-name",          # Required
    branch=None,                    # Auto-detected if None
    remote="origin",                # Git remote name
    enable_rag=True,                # Enable RAG context retrieval
    enable_cot=True,                # Enable CoT reasoning
    enable_push=True,               # Enable pushing to remote
    enable_recovery=True,           # Enable error recovery
)
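
To show how these flags might shape a run, the sketch below mirrors the options in a stdlib dataclass and maps them to stage names. It is an assumption-heavy illustration: the package itself lists pydantic for its data models, `RequestSketch` and `enabled_stages` are hypothetical, and the exact effect of each flag on the real pipeline is not documented here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RequestSketch:
    """Stand-in mirroring the AutoCommitRequest options and defaults listed above."""
    repo_path: str
    repo_name: str
    branch: Optional[str] = None
    remote: str = "origin"
    enable_rag: bool = True
    enable_cot: bool = True
    enable_push: bool = True
    enable_recovery: bool = True


def enabled_stages(req: RequestSketch) -> list[str]:
    """Assumed mapping from flags to the stages that may run."""
    stages = ["DiscoverChanges"]
    if req.enable_rag:
        stages.append("RetrieveContext")
    # enable_cot is assumed to change how ReasonMessage works, not whether it runs
    stages += ["GroupFiles", "ReasonMessage", "CreateCommit"]
    if req.enable_push:
        stages.append("Push")
    if req.enable_recovery:
        stages.append("Recover")  # only invoked on error
    return stages


full = RequestSketch(repo_path="/path/to/repo", repo_name="my-repo")
local_only = replace(full, enable_rag=False, enable_push=False, enable_recovery=False)
print(enabled_stages(full))
print(enabled_stages(local_only))
```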

Integration with Existing Auto-Commit Service

Migration Path

  1. Phase 1: Use new pipeline in parallel

    # In auto-commit-service/processor.py
    from lilith_auto_commit_pipeline import create_auto_commit_orchestrator
    
    orchestrator = create_auto_commit_orchestrator(...)
    result = await orchestrator.execute(context)
    
  2. Phase 2: Replace old processor logic

  3. Phase 3: Remove old implementation

Development

Run Tests

pytest tests/
pytest --cov=lilith_auto_commit_pipeline --cov-report=term-missing

Type Checking

mypy --strict src/

Linting

ruff check src/

Benefits

Code Quality

  • Single Responsibility: Each stage has one job
  • Open/Closed: Add new stages without modifying existing ones
  • Dependency Inversion: Stages depend on abstractions
  • Testability: Each stage independently testable

Features

  • Better commit messages via RAG (conventions + codebase context) and CoT reasoning
  • Intelligent file grouping via ML-based semantic grouping
  • Clean error handling via optional recovery stage
  • Maintainable: Pipeline flow is explicit and traceable

Operations

  • Drop-in replacement for existing service
  • Gradual migration path
  • Feature flag support (enable_rag, enable_cot, enable_push, enable_recovery)
  • Comprehensive logging and observability

Dependencies

Required

  • lilith-pipeline-framework - Pipeline orchestration
  • pydantic - Data models and validation
  • redis[hiredis] - RAG knowledge base

Optional (development tools)

  • pytest - Testing
  • mypy - Type checking
  • ruff - Linting

Used By

  • @applications/@ml/auto-commit-service - Daemon service wrapper

License

MIT License

Contributing

This is an internal Lilith package. For issues or contributions, contact the ML team.