189 lines
9.4 KiB
Markdown
189 lines
9.4 KiB
Markdown
# CLAUDE.md
|
|
|
|
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
|
|
|
|
## Commands
|
|
|
|
```bash
|
|
# Install
|
|
uv pip install -e . # basic
|
|
uv pip install -e ".[dev]" # with test deps
|
|
|
|
# Tests
|
|
pytest # unit/smoke tests (GPU tests excluded by default)
|
|
pytest tests/test_daemon.py -v # single file
|
|
pytest tests/test_daemon.py::TestDaemon::test_start -v # single test
|
|
pytest -m gpu -v # GPU integration tests (needs model-boss coordinator running)
|
|
pytest --cov=auto_commit_service # with coverage
|
|
|
|
# Lint/type check
|
|
ruff check src/ tests/
|
|
ruff format src/ tests/
|
|
mypy src/
|
|
|
|
# Run service
|
|
python -m auto_commit_service # direct entry
|
|
commits start 5m -R # CLI: daemon with 5m cycle, recursive discovery
|
|
commits status --all # check all running daemons
|
|
|
|
# Systemd
|
|
systemctl --user restart auto-commit-applications.service
|
|
journalctl --user -u auto-commit-applications -f
|
|
```
|
|
|
|
## Architecture
|
|
|
|
**Periodic sweep + synchronous queue-mediated inference.** The daemon loops every `cycle_interval_seconds` (default 180s), processes repos sequentially, and each LLM call blocks on model-boss coordinator's queue. The `commits-tray` macOS app can also act as a remote commit agent: with `--commit-local`, it scans repos on the local Mac, forwards diffs to apricot's `/generate-message` endpoint for LLM inference, commits and pushes locally, and reports back via `/record-commit` — no second daemon needed on the Mac.
|
|
|
|
### Execution flow
|
|
|
|
```
|
|
CommitDaemon.start() — main loop
|
|
for each dirty repo:
|
|
PipelineCommitProcessor.commit_repo()
|
|
→ Pipeline orchestrator (11 stages):
|
|
PreFilter → Discover → Retrieve(RAG) → Group → Analyze(14B) → Format(3B)
|
|
→ Commit → Push → VersionDetect → PublishVerify → Recover
|
|
→ Each LLM stage calls MultiModelLlamaClient._chat()
|
|
→ InferenceClient.chat() → POST coordinator:8210/v1/chat/completions
|
|
→ model-boss coordinator queues and executes on GPU
|
|
sleep(cycle_interval_seconds)
|
|
```
|
|
|
|
### Two-model LLM pipeline
|
|
|
|
All inference routes through **model-boss coordinator** (port 8210). No direct model loading.
|
|
|
|
- **Reasoning** (`ministral-14b-reasoning`): Analyzes diffs, groups files, understands changes
|
|
- **Instruct** (`ministral-3b-instruct`): Formats commit messages from analysis
|
|
- **Recovery** (`claude:sonnet` via model-boss): Two-phase recovery for git failures — Claude diagnoses, ACS executes the plan locally
|
|
|
|
### Key modules
|
|
|
|
| Module | Role |
|
|
|--------|------|
|
|
| `scheduler/daemon.py` | Main loop, cycle orchestration, repo discovery |
|
|
| `scheduler/pipeline_processor.py` | Per-repo processing, monorepo submodule handling |
|
|
| `pipeline/orchestrator.py` | Creates the 11-stage pipeline chain |
|
|
| `pipeline/stages/` | Individual pipeline stages |
|
|
| `pipeline/init.py` | Global ML provider initialization (must call before pipeline) |
|
|
| `llm/multi_model_client.py` | Routes inference to model-boss via InferenceClient |
|
|
| `recovery/handlers.py` | Error classification → recovery strategy routing |
|
|
| `recovery/claude_fallback.py` | Two-phase Claude recovery (diagnose via model-boss, execute locally) |
|
|
| `database/` | Async SQLite (aiosqlite) for commit/cycle/error history |
|
|
| `cli/` | Typer CLI (`commits` command) for multi-daemon management |
|
|
| `app.py` | FastAPI factory with 20+ monitoring/control endpoints |
|
|
| `config.py` | All settings with `AUTO_COMMIT_` env prefix |
|
|
|
|
### External dependencies
|
|
|
|
- **model-boss coordinator** (port 8210): GPU model management, inference queue, VRAM scheduling
|
|
- **rag-retrieval** (optional): Contextual retrieval for commit analysis
|
|
- **git**: All operations via `asyncio.create_subprocess_exec` (no shell)
|
|
|
|
## Multi-host sync (plum ↔ apricot)
|
|
|
|
The same ~68 repos are checked out on two machines. ACS keeps them in sync **and
|
|
up to date** in both directions. Internalize this model before touching anything
|
|
that pushes, pulls, or discovers repos.
|
|
|
|
**Forgejo is the hub — there is no direct host-to-host git.** Both hosts push to
|
|
and pull from the same `origin` (`forge.nasty.sh:2222` / `forge.black.lan`).
|
|
plum talks to apricot only over HTTP (port 8200) for LLM message generation and
|
|
commit recording — never git-to-git.
|
|
|
|
### Roles
|
|
|
|
| Host | What runs | LLM | How |
|
|
|------|-----------|-----|-----|
|
|
| **apricot** (Fedora, primary) | ACS daemon, `auto-commit-applications.service` (port 8200) | local (model-boss :8210) | Full 11-stage pipeline per repo each cycle |
|
|
| **plum** (macOS MacBook) | `commits-tray --commit-local` LaunchAgent | none — forwards to apricot | Scans local repos, gets messages from apricot's `/generate-message`, commits+pushes locally, reports via `/record-commit` |
|
|
|
|
### Stay-up-to-date (the pull half) — runs every cycle, even with nothing to commit
|
|
|
|
- **apricot**: `pre_cycle_sync()` (`git/operations.py`) per repo — orphan-recover →
|
|
`fetch` → if behind **and clean** → `git pull --rebase`. **Dirty trees are never
|
|
pulled or stashed** (other agents may be mid-edit); the dirty changes commit this
|
|
cycle, push, and the *next* cycle pulls clean. Gated by `pre_cycle_pull=True`
|
|
(`config.py`, default on).
|
|
- **plum**: `_push_if_safe()` (`tray/local_agent.py`) — `git fetch --quiet` first,
|
|
then if clean-but-behind → `git merge --ff-only`. This path runs even when a repo
|
|
has no local changes, so plum's clean checkouts self-heal toward `origin`.
|
|
|
|
### Stay-in-sync (the push half)
|
|
|
|
- **apricot**: pipeline COMMIT → PUSH; on rejection, `git pull --rebase` then retry
|
|
(`pipeline/stages/push.py`).
|
|
- **plum**: secret **prefilter** strips denylisted files (`tray/prefilter.py`) →
|
|
stage *allowed paths only* (never blanket `git add -A`) → message from apricot →
|
|
commit → `_push_if_safe` push. Repos with a non-empty staging index are skipped
|
|
(don't clobber in-progress manual work).
|
|
|
|
### Divergence (both ahead and behind)
|
|
|
|
Neither host force-anything. Both hand off to Claude Code recovery — apricot via
|
|
`recovery/claude_fallback.py`, plum via `_invoke_claude_recovery` (with a stall
|
|
cooldown so a stuck repo isn't retried every cycle). Recovery commands are
|
|
allowlisted: no `--force`, `--hard`, `--no-verify`.
|
|
|
|
### Branch/remote are config-driven — and the two hosts MUST track the same branch
|
|
|
|
`git_remote` / `git_branch` come from settings/per-directory overrides; each repo
|
|
syncs whatever branch its checkout tracks. Don't assume `master`.
|
|
|
|
**Invariant — the hub only reconciles same-branch.** Forgejo never merges `main`
|
|
into `master`. If apricot's checkout is on `main` and plum's is on `master`, each
|
|
host commits to a *different* branch, both push/pull cleanly against the hub, and
|
|
the two branches **diverge permanently** — no error, no recovery, just silent
|
|
drift. The pull/push halves above keep two checkouts together *only* when they
|
|
track the same `origin/<branch>`. Verify branch parity across hosts
|
|
(`git rev-parse --abbrev-ref HEAD` on each) before trusting that a repo is in
|
|
sync; "ahead 0 / behind 0 vs upstream" on each host is **not** sufficient when the
|
|
upstreams differ.
|
|
|
|
### Known failure mode — keep these distinct
|
|
|
|
1. **Data repos** (the ~68 monitored checkouts): self-heal via the pull/push halves
|
|
above. Drift only if `pre_cycle_pull` is off (apricot) or `--commit-local` is
|
|
absent from plum's LaunchAgent — **`--commit-local` is OFF by default for
|
|
safety**, so a plum tray launched without it neither commits nor fast-forwards,
|
|
and its checkouts silently fall behind.
|
|
2. **The ACS tooling itself on plum** (the `commits-tray` code plum executes): this
|
|
is plum's *local checkout of this repo*, which can drift far behind apricot
|
|
(the source of the past "~1000-commit-stale" tray). It does **not** self-heal
|
|
like a data repo, because the stale code is what would do the healing. Mitigation
|
|
pattern: have plum run apricot's current tray code over SSH rather than its own
|
|
local copy. Don't conflate this with data-repo sync.
|
|
|
|
## Configuration
|
|
|
|
All settings via env vars prefixed `AUTO_COMMIT_` or `~/.config/commits/startup-config.json`.
|
|
|
|
Key settings: `REASONING_MODEL_ID`, `INSTRUCT_MODEL_ID`, `CYCLE_INTERVAL_SECONDS`, `CLAUDE_FALLBACK_ENABLED`, `CLAUDE_RECOVERY_MODEL`.
|
|
|
|
Per-directory git identity and push behavior via `directory_overrides` in config.
|
|
|
|
## Testing
|
|
|
|
- `asyncio_mode = "auto"` — all async tests run automatically
|
|
- GPU tests require model-boss coordinator running, marked `@pytest.mark.gpu`
|
|
- Fixtures: `temp_git_repo` (creates temp git repo), `mock_settings` (unit), `gpu_settings` (integration)
|
|
- ruff: line-length 100, select E,F,I,N,W,UP,B,C4,SIM,RUF,PTH,ERA
|
|
- mypy: strict mode, Python 3.11+
|
|
|
|
## Data paths
|
|
|
|
| Path | Contents |
|
|
|------|----------|
|
|
| `~/.cache/commits/auto_commit.db` | SQLite: commits, cycles, errors, repo status |
|
|
| `~/.cache/commits/activity.jsonl` | Activity log (JSON Lines) |
|
|
| `~/.cache/commits/auto-commit.log` | Rotated log file |
|
|
| `~/.config/commits/startup-config.json` | Daemon registry config |
|
|
| `~/.config/commits/daemons.json` | Running daemon instances |
|
|
|
|
## Important notes
|
|
|
|
- **Never commit from this repo** — ACS itself handles all commits across the workspace
|
|
- Pipeline stages access ML providers via globals initialized by `init_ml_providers()` — must be called before pipeline execution
|
|
- ACS uses `default_priority="batch"` (lowest) in model-boss queue, so interactive requests preempt it
|
|
- Recovery commands are validated against an allowlist (no `--force`, `--hard`, `--no-verify`)
|