# Auto-Commit Service Architecture

## Overview

The auto-commit service monitors git repositories for uncommitted changes and automatically generates commit messages using a local LLM (llama-service).

## Monitoring Scope

### What Gets Monitored

The service monitors **git repositories**, not individual packages.

| Metric | Count | Notes |
|--------|-------|-------|
| Git repos in @packages | 58 | Excludes node_modules |
| Git repos in @applications | 10 | @audio, @image, @lilith, @ml |
| **Total monitored** | **68** | |

### Package vs Repo Distinction

```
@packages/                    # Workspace root
├── @nestjs/                  # 1 git repo
│   ├── .git/
│   ├── auth/                 # package: @lilith/nestjs-auth
│   ├── bootstrap/            # package: @lilith/nestjs-bootstrap
│   └── health/               # package: @lilith/nestjs-health
└── @eslint/
    ├── config-base/          # 1 git repo, 1 package
    │   └── .git/
    └── config-react/         # 1 git repo, 1 package
        └── .git/
```

- **114 npm packages** (`package.json` files)
- **26 Python packages** (`pyproject.toml` files)
- **59 git repos** (`.git` directories) - this is what gets monitored

Git commits happen at the repo level, so monitoring repos (not packages) is correct.

## Configured Base Paths

```python
repos_base_paths = [
    "/var/home/lilith/Code/@packages",
    "/var/home/lilith/Code/@applications/@audio",
    "/var/home/lilith/Code/@applications/@image",
    "/var/home/lilith/Code/@applications/@lilith",
    "/var/home/lilith/Code/@applications/@ml",
]
```

## Discovery Process

1. For each base path, recursively find `.git` directories
2. Filter out excluded patterns: `node_modules`, `.venv`, `dist`, `build`, `__pycache__`
3. Respect `recursive_depth` limit (default: 4)
4. Deduplicate repos found in multiple paths

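
The four discovery steps above can be sketched as a single directory walk. This is a minimal illustration, not the service's actual code: the function name `discover_repos` and the constant `EXCLUDED_DIRS` are hypothetical, and it assumes that depth is counted from each base path (the base itself is depth 0) and that the walk keeps descending below a repo it has already found.

```python
import os
from pathlib import Path

# Exclusion list taken from the discovery steps; the constant name is hypothetical.
EXCLUDED_DIRS = {"node_modules", ".venv", "dist", "build", "__pycache__"}


def discover_repos(base_paths, recursive_depth=4):
    """Return a sorted, de-duplicated list of directories that contain `.git`."""
    found = set()  # a set de-duplicates repos reachable from several base paths
    for base in base_paths:
        base = Path(base).resolve()
        if not base.is_dir():
            continue
        for current, dirnames, _filenames in os.walk(base):
            current_path = Path(current)
            depth = len(current_path.relative_to(base).parts)
            if ".git" in dirnames:
                found.add(current_path)
            # Prune in place so os.walk never descends into excluded
            # directories or into the .git directory itself.
            dirnames[:] = [
                d for d in dirnames if d not in EXCLUDED_DIRS and d != ".git"
            ]
            # Assumption: stop descending once the depth limit is reached.
            if depth >= recursive_depth:
                dirnames[:] = []
    return sorted(found)
```

Resolving each base path before walking means the same repo reached through two overlapping base paths collapses to one entry in the set.
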
## Service Dependencies

```
┌─────────────────────┐
│ auto-commit-service │  Port 8200
│ (scheduler/daemon)  │
└─────────┬───────────┘
          │ HTTP
          ▼
┌─────────────────────┐
│ llama-http          │  Port 10010
│ (LLM inference)     │
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│ ministral-14b       │  reasoning model (analyze)
│ ministral-3b        │  instruct model (format)
└─────────────────────┘
```

The service uses a multi-model approach:

- **Reasoning model** (ministral-14b): Deep analysis of code changes
- **Instruct model** (ministral-3b): Fast commit message formatting

## Cycle Flow

The service uses a **per-repo atomic workflow**:

```
┌─────────────────────────────────────────┐
│               CYCLE LOOP                │
├─────────────────────────────────────────┤
│  repo-a: pipeline → push → done         │
│  repo-b: pipeline → push → done         │
│  repo-c: no changes → skip              │
│  repo-d: pipeline → push → done         │
│                 ↓                       │
│  All repos processed                    │
│                 ↓                       │
│  Persist commit history                 │
│                 ↓                       │
│  Sleep X seconds                        │
│                 ↓                       │
│  Next cycle                             │
└─────────────────────────────────────────┘
```

### Pipeline Stages

For each repo with uncommitted changes, a 6-stage pipeline processes the working directory changes:

```
┌─────────────────────────────────────────────────────────────────────┐
│                           COMMIT PIPELINE                           │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  1. DETECT    Find changed files in working directory              │
│       ↓       (uncommitted changes, not yet git-staged)             │
│                                                                     │
│  2. GROUP     Cluster related files into logical commit batches    │
│       ↓       (LLM groups by feature/purpose)                       │
│                                                                     │
│  3. ANALYZE   LLM reads each batch's diff to understand changes    │
│       ↓       (what does this code change do?)                      │
│                                                                     │
│  4. FORMAT    Generate commit message from analysis                 │
│       ↓       (conventional commit format with emoji)              │
│                                                                     │
│  5. COMMIT    git add + git commit for each batch                   │
│       ↓       (files are staged and committed here)                 │
│                                                                     │
│  6. PUSH      Push commits to remote                                │
│               (with conflict resolution if needed)                  │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```

**Terminology note**: "Analyzing commit 189/283" in logs means the LLM is analyzing the 189th batch of uncommitted changes. These are not yet git-staged or committed - that happens in stage 5.

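
To make the stage boundaries concrete, here is a rough sketch of stages 2 through 6 as one function that receives each stage as a callable. Everything in it is hypothetical (`Batch`, `run_pipeline`, and the parameter names are not from the service), and it assumes each stage finishes for all batches before the next stage starts, which the diagram suggests but does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Batch:
    """One logical commit: a group of files plus the text produced for it."""
    files: List[str]
    analysis: str = ""
    message: str = ""


def run_pipeline(changed_files, group, analyze, fmt, commit, push):
    """Run stages 2-6 for one repo and return the committed batches."""
    if not changed_files:  # stage 1 (DETECT) found nothing to do
        return []
    # 2. GROUP: cluster files into logical batches
    batches = [Batch(files=list(files)) for files in group(changed_files)]
    # 3. ANALYZE: the reasoning model describes each batch
    for batch in batches:
        batch.analysis = analyze(batch.files)
    # 4. FORMAT: the instruct model turns the analysis into a commit message
    for batch in batches:
        batch.message = fmt(batch.analysis)
    # 5. COMMIT: nothing is staged or committed before this point
    for batch in batches:
        commit(batch.files, batch.message)
    # 6. PUSH: once per repo, after all batches are committed
    push()
    return batches
```

Passing the stages in as callables keeps the sketch independent of how the LLM and git are actually invoked, and makes it easy to exercise with stubs.
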
### Per-Repo Processing

For each repo:

1. Check `git status --porcelain` for uncommitted working directory changes
2. Skip if no changes
3. Run pipeline: detect → group → analyze → format → commit → push
4. Move to next repo

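
The change check in steps 1 and 2 could look roughly like the following. The helper names `parse_porcelain` and `changed_files` are hypothetical, and the parser assumes the default porcelain v1 line layout (two status characters, a space, then the path).

```python
import subprocess


def parse_porcelain(output):
    """Extract paths from `git status --porcelain` (v1) output."""
    paths = []
    for line in output.splitlines():
        if len(line) < 4:
            continue
        status, path = line[:2], line[3:]
        # Renames and copies are reported as "old -> new"; keep the new path.
        if ("R" in status or "C" in status) and " -> " in path:
            path = path.split(" -> ", 1)[1]
        paths.append(path)
    return paths


def changed_files(repo_path):
    """Return changed paths in repo_path; an empty list means 'skip this repo'."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_porcelain(result.stdout)
```

Git quotes paths that contain special characters in this format, so a more robust variant would pass `-z` and split on NUL bytes instead of newlines.
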
### Cycle Completion

When all repos are processed:

- Log summary (committed, failed, unchanged)
- Persist commit history
- Sleep for `cycle_interval_seconds` (default: 60)
- Start next cycle

### Why Per-Repo Atomic?

- **Sloppy-atomic**: Each repo is self-contained (commit+push)
- **Progress visible**: Changes appear on the remote as each repo is processed
- **Fail-isolated**: One repo failing doesn't block others

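
A minimal sketch of the cycle loop with fail isolation follows. The names `run_cycle`, `run_forever`, `process_repo`, and `persist_history` are hypothetical, and it assumes `process_repo` returns a truthy value when it made commits and raises on failure.

```python
import time


def run_cycle(repos, process_repo):
    """Process every repo once; a failure in one repo does not stop the rest."""
    summary = {"committed": [], "failed": [], "unchanged": []}
    for repo in repos:
        try:
            made_commits = process_repo(repo)
        except Exception as exc:  # isolate the failure to this repo
            summary["failed"].append((repo, str(exc)))
            continue
        summary["committed" if made_commits else "unchanged"].append(repo)
    return summary


def run_forever(repos, process_repo, persist_history, cycle_interval_seconds=60):
    """Repeat cycles: process all repos, persist history, sleep, start again."""
    while True:
        summary = run_cycle(repos, process_repo)
        persist_history(summary)  # only reached when the whole cycle completes
        time.sleep(cycle_interval_seconds)
```

Catching the exception inside the loop body is what gives the fail-isolated property: the summary records the failure and the next repo is still processed.
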
## Data Persistence

Commit history is persisted to survive daemon restarts:

| File | Location | Purpose |
|------|----------|---------|
| History | `~/.cache/commits/history.json` | Last 100 commits (hash, repo, timestamp) |
| Activity | `~/.cache/commits/activity.jsonl` | Detailed activity log |
| Database | `~/.cache/commits/auto_commit.db` | SQLite for structured queries |

**Important**: History is only persisted when a cycle completes. If the daemon is interrupted mid-cycle (stuck hook, crash, etc.), commits made during that cycle won't appear in history.

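
As an illustration of the "last 100 commits" behaviour, the history file could be written along these lines. The function `persist_history` and the constant `HISTORY_LIMIT` are hypothetical, and the write-to-temp-then-rename step is an assumption about how one might avoid a half-written file, not a description of the service.

```python
import json
import os
import tempfile
from pathlib import Path

HISTORY_LIMIT = 100  # matches the "Last 100 commits" note in the table


def persist_history(history_path, existing, new_entries, limit=HISTORY_LIMIT):
    """Append new_entries, keep the newest `limit`, and write the file atomically."""
    merged = (list(existing) + list(new_entries))[-limit:]
    history_path = Path(history_path)
    history_path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target,
    # so a crash mid-write never leaves a truncated history.json behind.
    fd, tmp_name = tempfile.mkstemp(dir=history_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(merged, handle, indent=2)
    os.replace(tmp_name, history_path)
    return merged
```

Each entry here would be a small dict such as `{"hash": ..., "repo": ..., "timestamp": ...}`, mirroring the fields listed in the table.
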
## API Endpoints

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/health` | GET | Service health check |
| `/status` | GET | Current daemon status, last cycle results |
| `/repos` | GET | List all monitored repositories |
| `/trigger` | POST | Manually trigger a commit cycle |
| `/enable` | POST | Enable the daemon |
| `/disable` | POST | Disable the daemon |
| `/report/commits` | GET | View commit history |
| `/report/summary` | GET | Comprehensive daemon report |

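
A small standard-library client for these endpoints might look like this. It is a sketch under several assumptions: that the service is reachable on `localhost` at the port shown in the dependency diagram (8200), that no authentication is needed, and that responses are JSON. The helper names `build_request` and `call` are hypothetical.

```python
import json
import urllib.request

# Assumption: the service listens on localhost at the port from the diagram.
BASE_URL = "http://localhost:8200"


def build_request(endpoint, method="GET"):
    """Build (but do not send) a request for one of the documented endpoints."""
    return urllib.request.Request(BASE_URL + endpoint, method=method)


def call(endpoint, method="GET", timeout=10):
    """Send the request and decode the body, assuming a JSON response."""
    with urllib.request.urlopen(build_request(endpoint, method), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the daemon running, `call("/status")` would fetch the last cycle results and `call("/trigger", "POST")` would start a cycle manually.
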
## Configuration

Key settings in `AutoCommitSettings`:

| Setting | Default | Description |
|---------|---------|-------------|
| `cycle_interval_seconds` | 60 | Time between commit cycles |
| `llama_model_id` | qwen2.5-1.5b-instruct | Model for commit messages |
| `recursive_depth` | 4 | Max depth for repo discovery |
| `git_remote` | origin | Remote to push to |
| `git_branch` | master | Branch to push |

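
For orientation, the defaults in the table can be mirrored in a plain dataclass. This is only a stand-in: the real `AutoCommitSettings` class is not shown in this document and may be built differently (for example with a settings library), so the class name `AutoCommitSettingsSketch` and the `push_command` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoCommitSettingsSketch:
    """Stand-in holding the documented defaults; not the service's real class."""
    cycle_interval_seconds: int = 60
    llama_model_id: str = "qwen2.5-1.5b-instruct"
    recursive_depth: int = 4
    git_remote: str = "origin"
    git_branch: str = "master"

    def push_command(self):
        """Hypothetical helper: the plain git push implied by remote and branch."""
        return ["git", "push", self.git_remote, self.git_branch]
```

Overriding a field, for example `AutoCommitSettingsSketch(git_branch="main")`, changes the derived push command without touching the other defaults.
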
## Related Scripts

Existing scripts in `@packages/scripts/` provide similar functionality:

| Script | Purpose |
|--------|---------|
| `git/git-repo-status.sh` | Check status across all repos |
| `git/commit-all-dirty.sh` | Simple bulk commit (no LLM) |
| `git/git-push-all.sh` | Push all repos |

The auto-commit service is the "AI-powered" version that generates better commit messages via LLM, while the scripts provide simpler manual alternatives.