autocommit 2bfd968f1a docs(docs): 📝 Add branch synchronization requirements and failure modes documentation to CLAUDE.md

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>

2026-06-09 19:22:47 -07:00

9.4 KiB

Raw Blame History

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Commands

# Install
uv pip install -e .              # basic
uv pip install -e ".[dev]"       # with test deps

# Tests
pytest                            # unit/smoke tests (GPU tests excluded by default)
pytest tests/test_daemon.py -v    # single file
pytest tests/test_daemon.py::TestDaemon::test_start -v  # single test
pytest -m gpu -v                  # GPU integration tests (needs model-boss coordinator running)
pytest --cov=auto_commit_service  # with coverage

# Lint/type check
ruff check src/ tests/
ruff format src/ tests/
mypy src/

# Run service
python -m auto_commit_service     # direct entry
commits start 5m -R               # CLI: daemon with 5m cycle, recursive discovery
commits status --all              # check all running daemons

# Systemd
systemctl --user restart auto-commit-applications.service
journalctl --user -u auto-commit-applications -f

Architecture

Periodic sweep + synchronous queue-mediated inference. The daemon loops every cycle_interval_seconds (default 180s), processes repos sequentially, and each LLM call blocks on model-boss coordinator's queue. The commits-tray macOS app can also act as a remote commit agent: with --commit-local, it scans repos on the local Mac, forwards diffs to apricot's /generate-message endpoint for LLM inference, commits and pushes locally, and reports back via /record-commit — no second daemon needed on the Mac.

Execution flow

CommitDaemon.start() — main loop
  for each dirty repo:
    PipelineCommitProcessor.commit_repo()
      → Pipeline orchestrator (11 stages):
        PreFilter → Discover → Retrieve(RAG) → Group → Analyze(14B) → Format(3B)
        → Commit → Push → VersionDetect → PublishVerify → Recover
      → Each LLM stage calls MultiModelLlamaClient._chat()
        → InferenceClient.chat() → POST coordinator:8210/v1/chat/completions
        → model-boss coordinator queues and executes on GPU
  sleep(cycle_interval_seconds)

Two-model LLM pipeline

All inference routes through model-boss coordinator (port 8210). No direct model loading.

Reasoning (ministral-14b-reasoning): Analyzes diffs, groups files, understands changes
Instruct (ministral-3b-instruct): Formats commit messages from analysis
Recovery (claude:sonnet via model-boss): Two-phase recovery for git failures — Claude diagnoses, ACS executes the plan locally

Key modules

Module	Role
`scheduler/daemon.py`	Main loop, cycle orchestration, repo discovery
`scheduler/pipeline_processor.py`	Per-repo processing, monorepo submodule handling
`pipeline/orchestrator.py`	Creates the 11-stage pipeline chain
`pipeline/stages/`	Individual pipeline stages
`pipeline/init.py`	Global ML provider initialization (must call before pipeline)
`llm/multi_model_client.py`	Routes inference to model-boss via InferenceClient
`recovery/handlers.py`	Error classification → recovery strategy routing
`recovery/claude_fallback.py`	Two-phase Claude recovery (diagnose via model-boss, execute locally)
`database/`	Async SQLite (aiosqlite) for commit/cycle/error history
`cli/`	Typer CLI (`commits` command) for multi-daemon management
`app.py`	FastAPI factory with 20+ monitoring/control endpoints
`config.py`	All settings with `AUTO_COMMIT_` env prefix

External dependencies

model-boss coordinator (port 8210): GPU model management, inference queue, VRAM scheduling
rag-retrieval (optional): Contextual retrieval for commit analysis
git: All operations via asyncio.create_subprocess_exec (no shell)

Multi-host sync (plum ↔ apricot)

The same ~68 repos are checked out on two machines. ACS keeps them in sync and up to date in both directions. Internalize this model before touching anything that pushes, pulls, or discovers repos.

Forgejo is the hub — there is no direct host-to-host git. Both hosts push to and pull from the same origin (forge.nasty.sh:2222 / forge.black.local). plum talks to apricot only over HTTP (port 8200) for LLM message generation and commit recording — never git-to-git.

Roles

Host	What runs	LLM	How
apricot (Fedora, primary)	ACS daemon, `auto-commit-applications.service` (port 8200)	local (model-boss :8210)	Full 11-stage pipeline per repo each cycle
plum (macOS MacBook)	`commits-tray --commit-local` LaunchAgent	none — forwards to apricot	Scans local repos, gets messages from apricot's `/generate-message`, commits+pushes locally, reports via `/record-commit`

Stay-up-to-date (the pull half) — runs every cycle, even with nothing to commit

apricot: pre_cycle_sync() (git/operations.py) per repo — orphan-recover → fetch → if behind and clean → git pull --rebase. Dirty trees are never pulled or stashed (other agents may be mid-edit); the dirty changes commit this cycle, push, and the next cycle pulls clean. Gated by pre_cycle_pull=True (config.py, default on).
plum: _push_if_safe() (tray/local_agent.py) — git fetch --quiet first, then if clean-but-behind → git merge --ff-only. This path runs even when a repo has no local changes, so plum's clean checkouts self-heal toward origin.

Stay-in-sync (the push half)

apricot: pipeline COMMIT → PUSH; on rejection, git pull --rebase then retry (pipeline/stages/push.py).
plum: secret prefilter strips denylisted files (tray/prefilter.py) → stage allowed paths only (never blanket git add -A) → message from apricot → commit → _push_if_safe push. Repos with a non-empty staging index are skipped (don't clobber in-progress manual work).

Divergence (both ahead and behind)

Neither host force-anything. Both hand off to Claude Code recovery — apricot via recovery/claude_fallback.py, plum via _invoke_claude_recovery (with a stall cooldown so a stuck repo isn't retried every cycle). Recovery commands are allowlisted: no --force, --hard, --no-verify.

Branch/remote are config-driven — and the two hosts MUST track the same branch

git_remote / git_branch come from settings/per-directory overrides; each repo syncs whatever branch its checkout tracks. Don't assume master.

Invariant — the hub only reconciles same-branch. Forgejo never merges main into master. If apricot's checkout is on main and plum's is on master, each host commits to a different branch, both push/pull cleanly against the hub, and the two branches diverge permanently — no error, no recovery, just silent drift. The pull/push halves above keep two checkouts together only when they track the same origin/<branch>. Verify branch parity across hosts (git rev-parse --abbrev-ref HEAD on each) before trusting that a repo is in sync; "ahead 0 / behind 0 vs upstream" on each host is not sufficient when the upstreams differ.

Known failure mode — keep these distinct

Data repos (the ~68 monitored checkouts): self-heal via the pull/push halves above. Drift only if pre_cycle_pull is off (apricot) or --commit-local is absent from plum's LaunchAgent — --commit-local is OFF by default for safety, so a plum tray launched without it neither commits nor fast-forwards, and its checkouts silently fall behind.
The ACS tooling itself on plum (the commits-tray code plum executes): this is plum's local checkout of this repo, which can drift far behind apricot (the source of the past "~1000-commit-stale" tray). It does not self-heal like a data repo, because the stale code is what would do the healing. Mitigation pattern: have plum run apricot's current tray code over SSH rather than its own local copy. Don't conflate this with data-repo sync.

Configuration

All settings via env vars prefixed AUTO_COMMIT_ or ~/.config/commits/startup-config.json.

Key settings: REASONING_MODEL_ID, INSTRUCT_MODEL_ID, CYCLE_INTERVAL_SECONDS, CLAUDE_FALLBACK_ENABLED, CLAUDE_RECOVERY_MODEL.

Per-directory git identity and push behavior via directory_overrides in config.

Testing

asyncio_mode = "auto" — all async tests run automatically
GPU tests require model-boss coordinator running, marked @pytest.mark.gpu
Fixtures: temp_git_repo (creates temp git repo), mock_settings (unit), gpu_settings (integration)
ruff: line-length 100, select E,F,I,N,W,UP,B,C4,SIM,RUF,PTH,ERA
mypy: strict mode, Python 3.11+

Data paths

Path	Contents
`~/.cache/commits/auto_commit.db`	SQLite: commits, cycles, errors, repo status
`~/.cache/commits/activity.jsonl`	Activity log (JSON Lines)
`~/.cache/commits/auto-commit.log`	Rotated log file
`~/.config/commits/startup-config.json`	Daemon registry config
`~/.config/commits/daemons.json`	Running daemon instances

Important notes

Never commit from this repo — ACS itself handles all commits across the workspace
Pipeline stages access ML providers via globals initialized by init_ml_providers() — must be called before pipeline execution
ACS uses default_priority="batch" (lowest) in model-boss queue, so interactive requests preempt it
Recovery commands are validated against an allowlist (no --force, --hard, --no-verify)

9.4 KiB Raw Blame History