autocommit 8db4afe869 feat(cot-commit): ✨ Implement sophisticated COT reasoning logic for auto-commit decision triggers

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>

2026-04-17 21:20:13 -07:00


Auto-Commit Service Architecture

Overview

The auto-commit service monitors git repositories for uncommitted changes, generates commit messages with a local LLM (llama-service), and then commits and pushes the results.

Monitoring Scope

What Gets Monitored

The service monitors git repositories, not individual packages.

Metric	Count	Notes
Git repos in @packages	58	Excludes node_modules
Git repos in @applications	10	@audio, @image, @lilith, @ml
Total monitored	68

Package vs Repo Distinction

@packages/                    # Workspace root
├── @nestjs/                  # 1 git repo
│   ├── .git/
│   ├── auth/                 # package: @lilith/nestjs-auth
│   ├── bootstrap/            # package: @lilith/nestjs-bootstrap
│   └── health/               # package: @lilith/nestjs-health
└── @eslint/
    ├── config-base/          # 1 git repo, 1 package
    │   └── .git/
    └── config-react/         # 1 git repo, 1 package
        └── .git/

114 npm packages (package.json files)
26 Python packages (pyproject.toml files)
59 git repos (.git directories) - this is what gets monitored

Git commits happen at the repo level, so monitoring repos (not packages) is correct.

Configured Base Paths

repos_base_paths = [
    "/var/home/lilith/Code/@packages",
    "/var/home/lilith/Code/@applications/@audio",
    "/var/home/lilith/Code/@applications/@image",
    "/var/home/lilith/Code/@applications/@lilith",
    "/var/home/lilith/Code/@applications/@ml",
]

Discovery Process

For each base path, recursively find .git directories
Filter out excluded patterns: node_modules, .venv, dist, build, __pycache__
Respect recursive_depth limit (default: 4)
Deduplicate repos found in multiple paths
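
The discovery steps above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: the function name `discover_repos` is hypothetical, and it assumes that the base path itself counts as depth 0 and that the walk keeps descending into a repo after finding it (so nested repos are also found).

```python
import os
from pathlib import Path

# Excluded directory names, taken from the list above.
EXCLUDED = {"node_modules", ".venv", "dist", "build", "__pycache__"}


def discover_repos(base_paths, recursive_depth=4, excluded=EXCLUDED):
    """Return a sorted, deduplicated list of directories containing a .git entry."""
    found = set()  # a set deduplicates repos reachable from several base paths
    for base in base_paths:
        base = Path(base).resolve()
        if not base.is_dir():
            continue  # assumption: missing base paths are skipped silently
        for root, dirs, _files in os.walk(base):
            root_path = Path(root)
            depth = len(root_path.relative_to(base).parts)
            # .git is usually a directory, but is a file in worktrees/submodules.
            if ".git" in dirs or (root_path / ".git").is_file():
                found.add(root_path)
            if depth >= recursive_depth:
                dirs[:] = []  # prune in place: os.walk will not go deeper
            else:
                dirs[:] = [d for d in dirs if d not in excluded and d != ".git"]
    return sorted(found)
```

Pruning `dirs` in place is what keeps the walk out of `node_modules` and similar trees, which matters more for runtime than the filter on the results would.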

Service Dependencies

┌─────────────────────┐
│  auto-commit-service│ Port 8200
│  (scheduler/daemon) │
└─────────┬───────────┘
          │ HTTP
          ▼
┌─────────────────────┐
│   llama-http        │ Port 10010
│   (LLM inference)   │
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│ ministral-14b       │ reasoning model (analyze)
│ ministral-3b        │ instruct model (format)
└─────────────────────┘

The service uses a multi-model approach:

Reasoning model (ministral-14b): Deep analysis of code changes
Instruct model (ministral-3b): Fast commit message formatting
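
As a rough sketch, the two-model split is a two-step chain: the reasoning model's analysis becomes the instruct model's input. The model IDs come from the diagram above. The `complete` callable and the prompt wording are hypothetical stand-ins, since the llama-http request format is not described in this document.

```python
from typing import Callable

REASONING_MODEL = "ministral-14b"  # analyze stage, per the diagram
INSTRUCT_MODEL = "ministral-3b"    # format stage, per the diagram

# Hypothetical signature: complete(model_id, prompt) -> generated text.
# In the real service this would be an HTTP call to llama-http.
Complete = Callable[[str, str], str]


def generate_commit_message(diff: str, complete: Complete) -> str:
    """Run the slow analysis first, then hand its output to the fast formatter."""
    analysis = complete(REASONING_MODEL, f"Explain what this diff changes:\n{diff}")
    return complete(INSTRUCT_MODEL, f"Write a one-line commit message for:\n{analysis}")
```

Passing the LLM call in as a parameter keeps the chain testable without a running inference server.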

Cycle Flow

The service uses a per-repo atomic workflow:

┌─────────────────────────────────────────┐
│              CYCLE LOOP                 │
├─────────────────────────────────────────┤
│  repo-a: pipeline → push → done         │
│  repo-b: pipeline → push → done         │
│  repo-c: no changes → skip              │
│  repo-d: pipeline → push → done         │
│                 ↓                       │
│         All repos processed             │
│                 ↓                       │
│         Persist commit history          │
│                 ↓                       │
│           Sleep X seconds               │
│                 ↓                       │
│            Next cycle                   │
└─────────────────────────────────────────┘

Pipeline Stages

For each repo with uncommitted changes, a 6-stage pipeline processes the working directory changes:

┌─────────────────────────────────────────────────────────────────────┐
│                         COMMIT PIPELINE                             │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  1. DETECT     Find changed files in working directory              │
│       ↓        (uncommitted changes, not yet git-staged)            │
│                                                                     │
│  2. GROUP      Cluster related files into logical commit batches    │
│       ↓        (LLM groups by feature/purpose)                      │
│                                                                     │
│  3. ANALYZE    LLM reads each batch's diff to understand changes    │
│       ↓        (what does this code change do?)                     │
│                                                                     │
│  4. FORMAT     Generate commit message from analysis                │
│       ↓        (conventional commit format with emoji)              │
│                                                                     │
│  5. COMMIT     git add + git commit for each batch                  │
│       ↓        (files are staged and committed here)                │
│                                                                     │
│  6. PUSH       Push commits to remote                               │
│                (with conflict resolution if needed)                 │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Terminology note: "Analyzing commit 189/283" in logs means the LLM is analyzing the 189th batch of uncommitted changes. These are not yet git-staged or committed - that happens in stage 5.
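
To illustrate the output of stage 4, here is a minimal sketch of building a conventional commit subject with an emoji. The `Analysis` type, the `format_commit_message` function, and the emoji table are hypothetical; only the `type(scope): emoji summary` shape is taken from a commit subject the service produced (`feat(cot-commit): ✨ ...`).

```python
from dataclasses import dataclass

# Hypothetical type-to-emoji table. Only the feat/✨ pairing is seen in a
# real commit subject from the service; the rest are illustrative guesses.
EMOJI = {"feat": "✨", "fix": "🐛", "docs": "📝", "chore": "🔧"}


@dataclass
class Analysis:
    """Hypothetical result of the ANALYZE stage for one batch."""
    commit_type: str
    scope: str
    summary: str


def format_commit_message(analysis: Analysis) -> str:
    """Build a subject line of the form 'type(scope): emoji summary'."""
    emoji = EMOJI.get(analysis.commit_type, "")
    subject = f"{emoji} {analysis.summary}".strip()  # no leading space if no emoji
    return f"{analysis.commit_type}({analysis.scope}): {subject}"
```

For example, `format_commit_message(Analysis("feat", "cot-commit", "Add decision triggers"))` yields `feat(cot-commit): ✨ Add decision triggers`.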

Per-Repo Processing

For each repo:

Check git status --porcelain for uncommitted working directory changes
Skip if no changes
Run pipeline: detect → group → analyze → format → commit → push
Move to next repo
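
The per-repo check can be sketched as below. `git status --porcelain` is the command named above; the function names, the `run_pipeline` callable, and the returned status strings are assumptions made for illustration. The path parser handles the plain two-character status prefix and renames, but not git's quoting of unusual file names.

```python
import subprocess


def has_uncommitted_changes(repo) -> bool:
    """True if `git status --porcelain` prints anything for this repo."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return bool(result.stdout.strip())


def changed_paths(porcelain_output: str) -> list[str]:
    """Extract paths from porcelain v1 lines, which look like 'XY path'."""
    paths = []
    for line in porcelain_output.splitlines():
        if len(line) <= 3:
            continue
        path = line[3:]  # skip the two status characters and the space
        if " -> " in path:  # renames are reported as 'old -> new'
            path = path.split(" -> ", 1)[1]
        paths.append(path)
    return paths


def process_repo(repo, run_pipeline, has_changes=has_uncommitted_changes) -> str:
    """Skip clean repos; otherwise run the (hypothetical) pipeline callable."""
    if not has_changes(repo):
        return "unchanged"
    run_pipeline(repo)  # detect → group → analyze → format → commit → push
    return "committed"
```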

Cycle Completion

When all repos are processed:

Log summary (committed, failed, unchanged)
Persist commit history
Sleep for cycle_interval_seconds (default: 60)
Start next cycle

Why Per-Repo Atomic?

Sloppy-atomic: Each repo is self-contained (commit+push)
Progress visible: Changes appear on the remote as each repo is processed
Fail-isolated: One repo failing doesn't block others
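
A minimal sketch of the cycle loop shows how fail isolation falls out of the per-repo structure: each repo is processed inside its own `try` block, so an exception in one is counted and the loop moves on. All names here are hypothetical, and `max_cycles` and the injectable `sleep` exist only to make the sketch testable.

```python
import time
from collections import Counter


def run_cycle(repos, process_repo) -> dict:
    """Process every repo once and return the committed/failed/unchanged counts."""
    summary = Counter(committed=0, failed=0, unchanged=0)
    for repo in repos:
        try:
            outcome = process_repo(repo)  # expected: "committed" or "unchanged"
        except Exception:
            outcome = "failed"  # one failing repo must not block the others
        summary[outcome] += 1
    return dict(summary)


def run_forever(repos, process_repo, persist_history,
                cycle_interval_seconds=60, sleep=time.sleep, max_cycles=None):
    """Cycle, log the summary, persist history, sleep, repeat."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        summary = run_cycle(repos, process_repo)
        print(f"cycle done: {summary}")
        persist_history()  # only reached when the whole cycle completes
        sleep(cycle_interval_seconds)
        cycles += 1
```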

Data Persistence

Commit history is persisted to survive daemon restarts:

File	Location	Purpose
History	`~/.cache/commits/history.json`	Last 100 commits (hash, repo, timestamp)
Activity	`~/.cache/commits/activity.jsonl`	Detailed activity log
Database	`~/.cache/commits/auto_commit.db`	SQLite for structured queries

Important: History is only persisted when a cycle completes. If the daemon is interrupted mid-cycle (stuck hook, crash, etc.), commits made during that cycle won't appear in history.
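
The history file's "last 100 commits" behaviour could look something like the sketch below. The function name and the entry layout are assumptions based on the table above. Writing to a temporary file and renaming it is a choice made for this sketch, not something the document says the service does.

```python
import json
import os
import tempfile
from pathlib import Path

MAX_ENTRIES = 100  # the table above says the last 100 commits are kept


def persist_history(entries, path):
    """Write the newest MAX_ENTRIES entries to `path` as JSON and return them."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    trimmed = list(entries)[-MAX_ENTRIES:]
    # Write to a temp file in the same directory, then rename over the target,
    # so a crash mid-write cannot leave a truncated history.json behind.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(trimmed, handle, indent=2)
    os.replace(tmp_name, path)
    return trimmed
```

Because this runs once per completed cycle, entries collected during an interrupted cycle exist only in memory and are lost, which matches the caveat above.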

API Endpoints

Endpoint	Method	Purpose
`/health`	GET	Service health check
`/status`	GET	Current daemon status, last cycle results
`/repos`	GET	List all monitored repositories
`/trigger`	POST	Manually trigger a commit cycle
`/enable`	POST	Enable the daemon
`/disable`	POST	Disable the daemon
`/report/commits`	GET	View commit history
`/report/summary`	GET	Comprehensive daemon report
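
A small client for these endpoints might look like this, using only the standard library. The port comes from the dependency diagram; the `localhost` host, the empty POST body, and JSON response bodies are assumptions, since the document does not specify them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8200"  # port from the diagram; host is assumed


def build_request(endpoint: str, method: str = "GET") -> urllib.request.Request:
    """Build a request for one of the endpoints in the table above."""
    data = b"" if method == "POST" else None  # assumption: POSTs take no body
    return urllib.request.Request(f"{BASE_URL}{endpoint}", data=data, method=method)


def call(endpoint: str, method: str = "GET", timeout: float = 10):
    """Send the request and decode the body, assuming the service returns JSON."""
    with urllib.request.urlopen(build_request(endpoint, method), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the daemon running, `call("/status")` would fetch the current status and `call("/trigger", method="POST")` would start a cycle manually.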

Configuration

Key settings in AutoCommitSettings:

Setting	Default	Description
`cycle_interval_seconds`	60	Time between commit cycles
`llama_model_id`	qwen2.5-1.5b-instruct	Model for commit messages
`recursive_depth`	4	Max depth for repo discovery
`git_remote`	origin	Remote to push to
`git_branch`	master	Branch to push
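
For orientation, the defaults in the table can be written out as a plain dataclass. This is a sketch only: the real `AutoCommitSettings` class may be built differently and likely has more fields, and `push_command` is a hypothetical helper showing how the two git settings would combine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoCommitSettingsSketch:
    """Illustrative stand-in mirroring the defaults listed in the table above."""
    cycle_interval_seconds: int = 60
    llama_model_id: str = "qwen2.5-1.5b-instruct"
    recursive_depth: int = 4
    git_remote: str = "origin"
    git_branch: str = "master"

    def push_command(self) -> list[str]:
        """Hypothetical helper: the argv for pushing with these settings."""
        return ["git", "push", self.git_remote, self.git_branch]
```

Overriding a single value, as in `AutoCommitSettingsSketch(cycle_interval_seconds=300)`, leaves the other defaults untouched.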

Existing scripts in @packages/scripts/ provide similar functionality:

Script	Purpose
`git/git-repo-status.sh`	Check status across all repos
`git/commit-all-dirty.sh`	Simple bulk commit (no LLM)
`git/git-push-all.sh`	Push all repos

The auto-commit service is the LLM-backed counterpart to these scripts: it generates descriptive commit messages automatically, while the scripts remain simpler manual alternatives that do not use an LLM.
