chore(git): 🔧 Enforce LF line endings and mark binary files in .gitattributes

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-16 21:33:57 -07:00
commit 05f2666088
parent 43fc003c99
9 changed files with 338 additions and 152 deletions
--- a/.archive/ARCHIVED.md
+++ b/.archive/ARCHIVED.md
@@ -1,35 +1,44 @@
 # Archive — Lineage & Mining Map

-Prior platform versions live **outside this git repo**. This file documents where to find them and how to mine code from each.
+This repo IS the monolith workspace for the platform lineage. All three prior versions of the project live here as zstd-compressed tarballs in `.archive/`, tracked through Git LFS. The whole point of `@atlilith` is to be the canonical home — having v0/v1/v2 co-located is by design.

-## Where each version actually lives (apricot)
+## Layout

-| Version | Location | Access |
-|---------|----------|--------|
-| **v0** `egirl-platform` (viky-era) | `/mnt/bigdisk/_/last-linux-backup/applications/src/@egirl/egirl-platform/` (NFS, slow) → cached at `~/.cache/atlilith-archives/platform.0.tar.zst` | extract with `./scripts/extract-archive.sh platform.0` |
-| **v1** `lilith-platform` (V1 SaaS) | `~/Code/@projects/@lilith/lilith-platform/` (apricot local) | read in-place; `./scripts/extract-archive.sh platform.1` prints the path |
-| **v2** `lilith-platform.live` (production) | `~/Code/@projects/@lilith/lilith-platform.live/` (apricot local) | read in-place; **DO NOT MODIFY** — still serving prod |
+```
+.archive/
+├── ARCHIVED.md          ← this file
+├── platform.0.tar.zst   ← egirl-platform   (viky-era, ~12-18 months ago)
+├── platform.1.tar.zst   ← lilith-platform  (V1, 54-feature SaaS, never shipped)
+└── platform.2.tar.zst   ← lilith-platform.live (V2, Quinn-personal, currently in prod)
+```

-Why this layout:
-- The data already exists on apricot — duplicating it into git LFS just made the repo huge and broke Forgejo pushes (HTTP 413).
-- v1 and v2 stay where they are; mining = read at source path.
-- v0 lives on NFS-mounted `/mnt/bigdisk`, which is slow. A local zstd cache on apricot avoids round-tripping over NFS for every grep.
-
-## Setup steps
+## Quick commands

 ```bash
-# Once, on apricot — build the v0 local cache (~5 min, output ~4.5G):
-./scripts/cache-v0.sh
+# Build/refresh archives (run on apricot)
+./scripts/cache-v0.sh           # v0 from NFS source → .archive/platform.0.tar.zst (~5 min, ~5G)
+./scripts/build-archives.sh     # v1 + v2 from local source → .archive/platform.{1,2}.tar.zst

-# Anytime, mine from a version:
-./scripts/extract-archive.sh platform.0   # extracts v0 cache → /tmp/atlilith-archive/platform.0/
-./scripts/extract-archive.sh platform.1   # prints the in-place path
-./scripts/extract-archive.sh platform.2   # prints the in-place path
+# Mine code from a version (extracts to /tmp; the repo's working tree stays clean)
+./scripts/extract-archive.sh platform.0
+./scripts/extract-archive.sh platform.1
+./scripts/extract-archive.sh platform.2
+# → /tmp/atlilith-archive/<version>/
+
+# Cleanup after mining
+rm -rf /tmp/atlilith-archive/platform.1
 ```

+## Why tarballs + LFS
+
+- **Single source of truth**: prior versions live in *one* canonical place (this repo) rather than scattered across apricot/plum/black filesystems.
+- **No working-tree pollution**: `ls .archive/` always shows just the three tarballs plus this file. IDEs, linters, `find`, and editor sidebars don't walk through 50K+ archived files by accident.
+- **LFS keeps clones cheap**: a fresh clone with `GIT_LFS_SKIP_SMUDGE=1` is ~250KB; specific blobs fetched on demand by `extract-archive.sh`.
+- **Forgejo proxy limit (HTTP 413)**: the first push attempt hit the reverse proxy's `client_max_body_size` limit. That is a server-config fix, tracked in Phase 5.7, not a reason to architect differently.
+
 ## Mining map — what to pull, from where

-When V3 needs feature **X**, find prior art at these paths:
+When V3 needs feature **X**, find prior art via `./scripts/extract-archive.sh <version>` then read at the path below.

 ### High-confidence pulls (production-quality, port directly)

@@ -37,7 +46,7 @@ When V3 needs feature **X**, find prior art at these paths:
 |-----------|-------------|-------|
 | Marketplace / discovery | `platform.1/codebase/features/marketplace/` | 137 tests, NestJS+React |
 | Profile + elastic attributes | `platform.1/codebase/features/{profile,attributes}/` | Dynamic attribute system |
-| Bookings | `platform.1/codebase/features/marketplace/` (booking embedded) | Appointment workflow |
+| Bookings | `platform.1/codebase/features/marketplace/` (embedded) | Appointment workflow |
 | Payments | `platform.1/codebase/features/payments/` | 305 tests, Segpay + NOWPayments |
 | Reviews | `platform.1/codebase/features/reviews/` | Bidirectional + disputes |
 | Trust / verification | `platform.1/codebase/features/trust/` | Identity verification |
@@ -48,45 +57,60 @@ When V3 needs feature **X**, find prior art at these paths:
 | Media (S3/MinIO) | `platform.1/codebase/features/media/` | Upload + storage |
 | Streaming (live cam) | `platform.1/codebase/features/streaming/` | Session, tip, chatbot |
 | Health verification | `platform.1/codebase/features/health-verification/` | STI status flow |
-| Content moderation | `platform.0/@services/ml-moderation*` + `platform.1/codebase/features/content-{moderation,safety}/` | Python ML lineage |
-| Image generation | `platform.0/@services/ml-image-gen*` + `platform.1/codebase/features/image-generator/` | Python + NestJS |
 | Webmap (geo discovery) | `platform.0/@apps/webmap/` + `platform.0/@services/webmap-{api,router,server}/` + `platform.1/codebase/features/webmap/` | v0 has fullest topology |
 | Onboarding flow | `platform.0/@apps/onboarding/` | Best onboarding UX |
-| SEO (ML pSEO) | `platform.1/codebase/features/seo/` (+ `platform.2/` embedded pSEO) | Programmatic + ML |
-| talent-scout (tryst scraper) | `platform.1/codebase/tools/talent-scout/` + `platform.1/operations/talent-scout/` | Provider intel scraper |
-| Crystal AI (knowledge verification) | `platform.1/operations/platform-knowledge/crystal-ai/` | Code only; model weights excluded |
+| SEO (programmatic) | `platform.1/codebase/features/seo/` (+ `platform.2/` embedded pSEO) | Programmatic + ML |
+| talent-scout (tryst scraper) | `platform.1/codebase/tools/talent-scout/` + `platform.1/operations/talent-scout/` | Provider intel scraper (excluding the 3.8G captcha-solver model) |

 ### Direct carry from v2 (already provider-facing, just rename + add org schema)

 In `platform.2/codebase/@features/`:
 `sso, api, landing, my→provider-portal, provider-website→provider-site, cocotte-web+sansonnet-web→org-site, adult-therapy-tours→tour-site, messages, quinn-ai→ai-assistant, quinn-messenger→messenger, comm-newsletter→newsletter, user-data→org-analytics, admin→platform-admin, age-verification, client-intel, image-protection, vip, merchant, hotel-scout→tour-scout, price-watcher, edge-purge, db-monitor, platform-seed, waitlist, ai-engine, mail-autoresponder`

-### Drop entirely (not worth porting)
+### Superseded — preserved in archives, NOT ported

-- `dating-autopilot` (v1) — browser automation; pivot
-- `bio-scraper` (v1) — docs-only concept
-- `linky` / `link-tree` (v0+v1) — never shipped
-- `knowledge-verification` (v1, 262 tests) — **replaced** by peer `~/Code/@applications/@ml/knowledge-platform`
-- `pitch-deck`, `investor-dashboard`, `user-guide` (v0) — move to `business/` non-code
-- v1 empty shells: `consumable, content-editing, video-studio, share, favicon-generator, platform-content-tools, platform-assistant`
+The following were genuinely novel work in v0/v1 (Crystal-AI especially). They've since been superseded by dedicated peer apps under `~/Code/@applications/@{ai,ml,imajin}/`. The platform consumes those peers over HTTP/MCP; it doesn't host ML weights or training code itself.
+
+| v0/v1 work | Superseded by (peer app) |
+|------------|--------------------------|
+| `knowledge-verification` (262 tests, Crystal-AI predecessor) | `~/Code/@applications/@ml/knowledge-platform/` |
+| `conversation-assistant/ml-service` | `~/Code/@applications/@ml/{assistant-trainer,chat,draft-pipeline-claude,message-classifier}/` |
+| `i18n/ml-service` (translation model) | `~/Code/@applications/@ml/` translation pipeline |
+| `image-generator/ml-service` (diffusion weights) | `~/Code/@applications/@imajin/` |
+| `content-moderation/ml-service` | `~/Code/@applications/@ml/content-moderation/` |
+| `talent-scout/captcha-solver` | rebuild via `@applications/@ml/` if needed |
+| v0+v1 LLM gateway / agent infra | `~/Code/@applications/@ai/{@agents,services,packages}/` |
+
+### Drop entirely
+
+v0/v1 work that didn't pan out: `dating-autopilot, bio-scraper, linky/link-tree`, and v1 empty shells `consumable, content-editing, video-studio, share, favicon-generator, platform-content-tools, platform-assistant`. Pitch-deck/investor/user-guide content moves to `business/` non-code.

 ## Mining workflow

-1. Locate via `extract-archive.sh <version>` (or just `cd` to the path)
-2. Read the source feature; understand data model, public API, dependencies, tests
-3. Re-implement in `@platform/codebase/@features/<new-name>/` adapted to:
+1. Extract: `./scripts/extract-archive.sh <version>`
+2. Read the source feature in `/tmp/atlilith-archive/<version>/...`
+3. Understand: data model, public API, dependencies, tests
+4. Re-implement in `@platform/codebase/@features/<new-name>/` adapted to:
   - Person-first / Org-as-overlay tenancy (`user_id` or optional `org_id`)
   - Provider-generic naming (no `quinn-*`)
   - Current toolchain (Bun + Hono, not NestJS; React+Vite stays)
-4. Schema migrations land in `@platform/infrastructure/sql/migrations/`
-5. **Never edit v1/v2 source paths.** They are the archive.
+   - For ML-touching features: call into `@applications/@{ai,ml,imajin}/` peers, never re-vendor ML weights
+5. Schema migrations land in `@platform/infrastructure/sql/migrations/`
+6. Clean up: `rm -rf /tmp/atlilith-archive/<version>`

-## Excluded from v0 cache (rebuilt on demand from NFS source)
+## What's excluded from the archives

-The v0 local cache excludes ML model weights and build artifacts:
+The build scripts (`cache-v0.sh`, `build-archives.sh`) skip:
+- VCS / build: `.git/`, `.turbo/`, `dist/`, `build/`, `.next/`, `.gitlab-ci-local/`
+- Package manager: `node_modules/`, `.pnpm-store/`, `.cache/`
 - ML model dirs: `ml-service/`, `models/`, `checkpoints/`, `weights/`, `captcha-solver/`
-- ML files: `*.gguf`, `*.safetensors`, `*.bin`, `*.ckpt`, `*.pth`, `*.pt`, `*.onnx`, `*.h5`
-- VCS/build: `.git/`, `.turbo/`, `dist/`, `build/`
-- Large media: `*.mp4`, `*.mov`, `*.zip`
+- ML files (anywhere): `*.gguf`, `*.safetensors`, `*.bin`, `*.ckpt`, `*.pth`, `*.pt`, `*.onnx`, `*.h5`
+- Large media: `*.mp4`, `*.mov`, `*.webm`, `*.zip`, `*.tar.gz`

-If you actually need a model weight, fetch it from its training pipeline source, not from an archive of an old platform.
+If you need a specific model weight, fetch it from its training pipeline source — not from these code archives.
+
+## Per-version notes
+
+- **platform.0** (egirl-platform): Turbo + pnpm. Source is on `/mnt/bigdisk` (NFS) so `cache-v0.sh` builds a local zstd cache before tarballing into `.archive/`. 27 `@apps/`, 23 `@services/`, 14 `@packages/`. Has `business/`, `product/`, `data/`, `research/`, `stories/` org dirs.
+- **platform.1** (lilith-platform): Turbo + pnpm. Features at `codebase/features/`. Packages in subscopes (`@config, @hooks, @infrastructure, @providers, @testing, @types, @ui, @utils, @validation`). Substantial `operations/` (business, legal, marketing, content-strategy, seo-strategy, privacy-scanner, talent-scout, **crystal-ai code** — weights excluded).
+- **platform.2** (lilith-platform.live): Bun + pnpm workspaces. Features at `codebase/@features/`. Closest to V3's intended toolchain. Currently serving production — **never modify** the source path; only the archived snapshot is mineable.
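The mining workflow in `ARCHIVED.md` (extract, read, clean up) could be wrapped so the cleanup step is not forgotten. This is a minimal sketch, not part of the commit: the `mine` helper and the `EXTRACT_CMD` / `EXTRACT_ROOT` overrides are hypothetical names introduced for illustration, with defaults that mirror the script path and `/tmp/atlilith-archive` location named above.

```shell
# Hypothetical helper (not in the commit): extract one version, run a single
# command inside the extracted tree, then remove the tree again.
# EXTRACT_CMD and EXTRACT_ROOT are assumed overrides, used here for illustration.
mine() {
    local version="$1"
    shift
    local root="${EXTRACT_ROOT:-/tmp/atlilith-archive}"
    local dest="$root/$version"
    local status=0

    # Extraction output goes to stderr so stdout carries only the command's output.
    "${EXTRACT_CMD:-./scripts/extract-archive.sh}" "$version" >&2 || return 1

    # Run the caller's command from inside the extracted tree.
    ( cd "$dest" && "$@" ) || status=$?

    # Always clean up, whether the command succeeded or not.
    rm -rf "$dest"
    return "$status"
}

# Example (assumes the archives exist):
# mine platform.1 grep -rl "booking" .
```

Because the helper removes the destination on every run, the interactive "remove and re-extract?" prompt in `extract-archive.sh` should not normally trigger; that is an expectation about the sketch, not a tested property of the real script.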
--- a/.archive/platform.0.tar.zst
+++ b/.archive/platform.0.tar.zst
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a85ad032b627fcc6792043635643a3fd9d7fd684d24fa212d4fe7362ca0242c1
+size 4745830400
--- a/.archive/platform.1.tar.zst
+++ b/.archive/platform.1.tar.zst
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9be376d5fc96b67b5df54ae665f31b92132c62c0384b010dd2487fe45d783355
+size 585292560
--- a/.archive/platform.2.tar.zst
+++ b/.archive/platform.2.tar.zst
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0365ca37a5975e21e1636e4eccf1a386195dd902166d64b853f120ecaabf2021
+size 421225060
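The three files above are Git LFS pointers: each records the SHA-256 and byte size of the real tarball. A downloaded blob can be checked against its pointer before extraction. The sketch below is not part of the commit; `verify_lfs_blob` is a hypothetical helper and it assumes coreutils `sha256sum` is available.

```shell
# Hypothetical helper (not in the commit): compare a blob with the oid and
# size recorded in an LFS pointer file. Assumes coreutils sha256sum.
verify_lfs_blob() {
    local pointer="$1"
    local blob="$2"
    local want_oid want_size have_oid have_size

    # Pointer lines look like "oid sha256:<hex>" and "size <bytes>".
    want_oid=$(sed -n 's/^oid sha256://p' "$pointer")
    want_size=$(sed -n 's/^size //p' "$pointer")

    have_oid=$(sha256sum "$blob" | cut -d' ' -f1)
    have_size=$(wc -c < "$blob" | tr -d ' ')

    [ "$want_oid" = "$have_oid" ] && [ "$want_size" = "$have_size" ]
}

# Example (pointer text saved from the index, blob from the working tree):
# git cat-file -p HEAD:.archive/platform.1.tar.zst > /tmp/p1.pointer
# verify_lfs_blob /tmp/p1.pointer .archive/platform.1.tar.zst && echo ok
```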
--- a/.gitattributes
+++ b/.gitattributes
@@ -1 +1,4 @@
-# (LFS routing removed — archives live outside git on apricot local cache; see .archive/ARCHIVED.md)
+# Archive tarballs go through Git LFS so the working tree stays light
+# while still keeping the prior versions inside the monolith repo.
+.archive/*.tar.zst filter=lfs diff=lfs merge=lfs -text
+.archive/*.tar.gz  filter=lfs diff=lfs merge=lfs -text
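Whether a path is actually routed through LFS by these rules can be confirmed with `git check-attr`, which reads `.gitattributes` without needing the file to exist. A small sketch, with `lfs_routed` as a hypothetical helper name; it must be run from inside the repository.

```shell
# Hypothetical helper (not in the commit): succeed only when the path's
# "filter" attribute resolves to "lfs" under the current .gitattributes.
lfs_routed() {
    local value
    # Output format: "<path>: filter: <value>"
    value=$(git check-attr filter -- "$1" | sed 's/.*: filter: //')
    [ "$value" = "lfs" ]
}

# Example:
# lfs_routed .archive/platform.0.tar.zst && echo "tracked via LFS"
```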
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -5,11 +5,7 @@ This is the V3 platform workspace. See `DESIGN.md` for architecture, `INFRA.md`
 ## Hard rules

 - **DO NOT touch `~/Code/@projects/@lilith/lilith-platform.live/`** — it's still serving production. All V3 work happens here.
-- **Prior versions live OUTSIDE git** (NOT in `.archive/`). To mine code: `./scripts/extract-archive.sh <version>`.
-  - `platform.0` (v0 egirl-platform) — cached on apricot at `~/.cache/atlilith-archives/platform.0.tar.zst`; extracts to `/tmp/atlilith-archive/platform.0/` on demand
-  - `platform.1` (v1 lilith-platform) — read in-place at `~/Code/@projects/@lilith/lilith-platform/`
-  - `platform.2` (v2 lilith-platform.live) — read in-place at `~/Code/@projects/@lilith/lilith-platform.live/`. **PRODUCTION — do not modify.**
-  - See `.archive/ARCHIVED.md` for the full mining map.
+- **Prior versions are vendored in `.archive/` as LFS-tracked tarballs** — this repo is the monolith workspace for the lineage. To mine code: `./scripts/extract-archive.sh <version>` (extracts to `/tmp/atlilith-archive/<version>/` so the working tree stays clean). The `.live` source path at `~/Code/@projects/@lilith/lilith-platform.live/` is also live production — **never modify it**. See `.archive/ARCHIVED.md` for the mining map and the superseded-by-peer-apps table.
 - **DO NOT add `mail-sync/`, `mac-sync/`, `@agents/`, `@ml/`, `@messenger/` directories inside `@atlilith/`** — these are peer services that live at `~/Code/@projects/@lilith/mail-sync/` and `~/Code/@applications/@*/`. Consumed over HTTP/MCP, never vendored.
 - **Provider-generic naming.** No `quinn-*` package names in new code. Quinn's deployed `quinn.*` domains remain (that's Quinn's *instance*), but the code that backs them is `provider-portal`, `ai-assistant`, `messenger`, etc.
 - **Person-first tenancy.** Every new table that owns user data must support both `user_id` (Person) and optional `org_id` (Org). See `DESIGN.md §5`.
--- a/DESIGN.md
+++ b/DESIGN.md
@@ -77,10 +77,11 @@ Org (optional overlay)
 1. **Person-first, Org-optional.** Onboarding never asks "what's your org" — that's a later upgrade.
 2. **Provider-generic in code.** Internal package names contain no person/org name. Quinn's instance lives at `quinn.*` domains, but the *code* is `provider-portal`, `ai-assistant`, `messenger`, etc.
 3. **Sibling services stay sibling.** mail-sync, mac-sync, knowledge-platform, agents — peer services consumed over HTTP/MCP. Not inside the platform monorepo.
-4. **Brand sites are templates.** Cocotte, Sansonnet, ATT, future merche biche — each is an instantiation of `org-site/` with config, not a forked codebase.
-5. **Archive over delete.** All three predecessors copied to `.archive/platform.{0,1,2}/` — read-only reference, mine when needed.
-6. **No leakage between tenants.** Org A's data invisible to Org B. Person A's data invisible to Org A unless explicitly shared.
-7. **Schema is the contract.** Add `orgs`, `org_members`, `org_id` foreign keys. Migrations are forward-only; no downgrades after Phase 2.
+4. **ML/AI work belongs in peer apps, never in the platform.** Crystal-AI, image diffusion, translation, content moderation, classification, RAG — all live under `~/Code/@applications/@{ai,ml,imajin}/` and are called over HTTP/MCP. The platform contains *orchestration* (request routing, prompt assembly, draft tracking), never weights or training data. If a v1 feature mixed orchestration + ML, port only the orchestration part. **No `ml-service/` directories in `@platform/`.**
+5. **Brand sites are templates.** Cocotte, Sansonnet, ATT, future merche biche — each is an instantiation of `org-site/` with config, not a forked codebase.
+6. **Monolith repo holds the full lineage.** All three prior versions live inside this repo as zstd-compressed tarballs under `.archive/`, tracked through Git LFS. The whole point of `@atlilith` is to be the canonical workspace — having v0/v1/v2 co-located here is by design, not a bloat problem. (Practical caveat: the LFS push hit Forgejo's reverse-proxy `client_max_body_size` limit on the first attempt — that's a server config fix tracked in Phase 5.7, not an architectural pivot.)
+7. **No leakage between tenants.** Org A's data invisible to Org B. Person A's data invisible to Org A unless explicitly shared.
+8. **Schema is the contract.** Add `orgs`, `org_members`, `org_id` foreign keys. Migrations are forward-only; no downgrades after Phase 2.

 ---

@@ -245,9 +246,23 @@ Phase B (depth & monetization): `payments, reviews, threat-intelligence, safety,

 `onboarding` flow, `mobile-messenger` patterns (if pursuing a native mobile build), `drive` / mobile-drive (if pursuing first-class file management), content-moderation ML lineage (Python-based)

+### Superseded — don't port, but don't dismiss either
+
+The following were genuinely novel and useful work in v0/v1 that has since been superseded by dedicated peer apps under `~/Code/@applications/@{ai,ml,imajin}/`. The platform calls those peers over HTTP/MCP; the platform itself contains zero ML inference code.
+
+| v0/v1 thing | Superseded by (peer app) |
+|-------------|--------------------------|
+| `knowledge-verification` (v1, 262 tests, Crystal-AI's predecessor) | `~/Code/@applications/@ml/knowledge-platform/` |
+| `conversation-assistant/ml-service` (v1, 4.5G) | `~/Code/@applications/@ml/{assistant-trainer,chat,draft-pipeline-claude,message-classifier}` |
+| `i18n/ml-service` (v1, 4.4G translation model) | `~/Code/@applications/@ml/` translation pipeline |
+| `image-generator/ml-service` (v1, 4.2G diffusion weights) | `~/Code/@applications/@imajin/` |
+| `content-moderation/ml-service` (v0+v1) | `~/Code/@applications/@ml/content-moderation/` |
+| `talent-scout/packages/captcha-solver` (v1, 3.8G) | rebuild via `@applications/@ml/` if needed |
+| LLM gateway / agents (v0+v1) | `~/Code/@applications/@ai/{@agents,services,packages}` |
+
 ### Drop entirely

-`dating-autopilot, bio-scraper, linky/link-tree, knowledge-verification` (replaced by peer `@ml/knowledge-platform`), `pitch-deck, investor-dashboard, user-guide` (move to `business/` non-code), `consumable, content-editing, video-studio, share, favicon-generator, platform-content-tools, platform-assistant` (v1 empty shells)
+v0/v1 work that didn't pan out or was just placeholder: `dating-autopilot, bio-scraper, linky/link-tree, pitch-deck, investor-dashboard, user-guide` (move pitch-deck/investor/user-guide content into `business/` non-code), and v1 empty shells `consumable, content-editing, video-studio, share, favicon-generator, platform-content-tools, platform-assistant`.

 ---

@@ -269,32 +284,90 @@ Phase B (depth & monetization): `payments, reviews, threat-intelligence, safety,

 ## 8. Migration Strategy

-### Phase 1 — Scaffold (this phase)
-Create `@atlilith/{@platform,.archive,tooling,.content-lifecycle}`. Write `DESIGN.md` (this doc), `CLAUDE.md`, `README.md`.
+**Sequencing principle: infrastructure first, features second.** Paving the road (ports, DBs, proxy, tunnel, ACS, backups, CI/CD) is the highest-effort / lowest-rework work — and every feature lands on top of it. Lock it down before touching feature code. The infrastructure phase replaces both the old "Phase 4 skeleton" and "Phase 5 schema" steps because the two are coupled (compose files + ports + migrations + Caddy must all agree).

-### Phase 2 — Archive predecessors
-```
-rsync -a --exclude=node_modules --exclude=.turbo \
-  apricot:~/Code/@applications/src/@egirl/egirl-platform/  \
-  @atlilith/.archive/platform.0/
-rsync -a --exclude=node_modules ~/Code/@projects/@lilith/lilith-platform/  \
-  @atlilith/.archive/platform.1/
-rsync -a --exclude=node_modules ~/Code/@projects/@lilith/lilith-platform.live/ \
-  @atlilith/.archive/platform.2/
-```
-Mark `.archive/` read-only by convention; write `ARCHIVED.md` mining map.
+### Phase 1 — Scaffold (DONE)
+Create `@atlilith/{@platform,.archive,tooling,.content-lifecycle}`. Write `DESIGN.md`, `INFRA.md`, `CLAUDE.md`, `README.md`. Initial git + remote setup.

-### Phase 3 — Carry-overs
-Copy `tooling/`, `.content-lifecycle/` from `@lilith/` to `@atlilith/`. **Do NOT** copy `mail-sync/` (peer service).
+### Phase 2 — Archive predecessors (DONE)
+Prior versions are vendored under `.archive/` as zstd-compressed tarballs tracked through Git LFS. v0 is locally cached on apricot at `~/.cache/atlilith-archives/platform.0.tar.zst` because its NFS source is slow. See `.archive/ARCHIVED.md` and `scripts/{cache-v0.sh,build-archives.sh,extract-archive.sh}`.

-### Phase 4 — Monorepo skeleton
-Initialize `@platform/package.json, bunfig.toml, tsconfig.json` from v2 as starting point. Create empty `codebase/@{apps,features,packages}/`, `deployments/@domains/`, `infrastructure/`.
+### Phase 3 — Carry-overs (DONE, no-op)
+`tooling/` and `.content-lifecycle/` from `@lilith/` are empty stubs — already mirrored as empty directories. mail-sync and other peer services stay where they are.

-### Phase 5 — Tenancy schema
-Build `quinn-db-init.sql` from v2's schema + new `orgs` + `org_members` migration. Seed Cocotte org with Quinn as owner.
+### Phase 4 — Monorepo skeleton (DONE)
+`@platform/{package.json, bunfig.toml, tsconfig.json}`, `codebase/tsconfig.base.json`, empty `codebase/@{apps,features,packages}/`, `deployments/@domains/`, `infrastructure/`, `docs/`, `scripts/`.

-### Phase 6 — V2 carry-over wave
-Port the 21 "carry directly" features from v2. Rename internal package names. Verify deployment to `quinn.*` domains still works (no infra drift).
+---
+
+### Phase 5 — Infrastructure foundation (NEXT, must complete before any feature work)
+
+Goal: a deployable, observable, recoverable system with all the glue in place. No feature code yet. When this phase ends, `manage-apps start atlilith apricot` brings up an empty-but-functional platform, deploys can be triggered via `./run deploy:<service>`, and backups + tunnels run unattended.
+
+**5.1 Port registry & env**
+- `infrastructure/ports.yaml` — adapted from v2; provider-generic service names (no `quinn.*`); new port ranges that don't conflict with v2 running in parallel (e.g. 3040-3059 APIs, 5210-5300 frontends, 25440-25445 PGs)
+- `infrastructure/.env.ports` — committed env file in sync with ports.yaml; sourced by manage-apps
+
+**5.2 Databases**
+- `infrastructure/compose.platform-db.yml` — main Postgres (port 25440)
+- `infrastructure/compose.platform-minio.yml` — MinIO for object storage
+- `infrastructure/pg-services.yml` — PostgREST/service config if used
+- `infrastructure/platform-db-init.sql` — base schema (adapted from v2's `quinn-db-init.sql`)
+- `infrastructure/sql/migrations/`:
+  - `001_add_orgs.sql` — `orgs` + `org_members` tables + owner-membership trigger
+  - `002_seed_cocotte.sql` — inaugural Cocotte org, transquinnftw as owner
+  - (more as features land — every feature ships its own migrations)
+
+**5.3 Reverse proxy (Caddy)**
+- `infrastructure/Caddyfile.local` — dev TLS for `*.atlilith.apricot.lan`
+- `infrastructure/certs/` — self-signed certs (gitignored)
+- `infrastructure/gen-local-certs.sh` — regenerates certs as services are added
+- Production: per-host Caddyfile templates under `deployments/@domains/<domain>/`
+
+**5.4 SSH reverse tunnel (vps-0 ↔ black)**
+- `infrastructure/scripts/setup-tunnel.sh` — establishes `ssh -R 25440:localhost:25440 -R 25441:localhost:25441 vps-0` from black
+- systemd `user@.service` unit + autossh for persistence + reconnect
+- Health-check script: vps-0 can `pg_isready -h localhost -p 25440`
+- INFRA.md §5 documents this tunnel direction (black-initiated for security)
+
+**5.5 ACS integration**
+- `users/transquinnftw/app.manifest.yaml` — `manage-apps` service registry for @atlilith (apricot + black + vps-0 platforms each with their services)
+- Verify ACS picks up @atlilith repo (it already commits; just confirm hooks/CI integration)
+- Pre-commit hook for `pnpm typecheck` / `bun lint` on changed files
+
+**5.6 Backups & DR**
+- `infrastructure/scripts/backup-pg.sh` — nightly `pg_dumpall` from black → restic repo on apricot tank
+- `infrastructure/scripts/restore-pg.sh` — counterpart with --dry-run guard
+- MinIO replication: vps-0 (hot) → black (cold) via `mc mirror` cron
+- Forgejo daily mirror push to GitHub (existing org policy)
+- Verification: end-to-end restore drill into a scratch DB on apricot
+
+**5.7 Build / deploy pipeline**
+- `run` shell entrypoint at repo root (`./run dev`, `./run dev:stop`, `./run dev:status`, `./run dev:logs`, `./run build`, `./run deploy:<service>`)
+- `.forgejo/workflows/`:
+  - `typecheck.yml` — runs on every push (`bun install && bun run typecheck`)
+  - `lint.yml` — runs on every push
+  - `deploy-template.yml` — reusable workflow each feature can call with `with: service: provider-portal`
+- Per-service `deployments/@domains/<svc>/deploy.sh` template — rsync to vps-0/black, systemd reload, smoke test
+
+**5.8 Verification gates (must all pass before Phase 6)**
+- `manage-apps start atlilith apricot` brings up DB + MinIO + mailpit cleanly
+- `manage-apps status atlilith apricot` returns healthy for every service entry
+- `psql -h apricot -p 25440 -U platform -c "SELECT * FROM orgs WHERE slug='cocotte'"` returns 1 row
+- JWT-bearing curl through Caddy to a placeholder route succeeds (TLS works)
+- vps-0 can connect to black's PG through the tunnel
+- Backup script runs end-to-end (`backup-pg.sh && restore-pg.sh --to=/tmp/restore-test`)
+- A trivial commit triggers `.forgejo/workflows/typecheck.yml` and it passes
+
+**Files to read/carry from v2:**
+- `~/Code/@projects/@lilith/lilith-platform.live/infrastructure/{ports.yaml,.env.ports,Caddyfile.local,compose.*.yml,pg-services.yml,quinn-db-init.sql,gen-local-certs.sh,setup-*.sh}`
+- `~/Code/@projects/@lilith/lilith-platform.live/users/transquinnftw/app.manifest.yaml`
+- `~/Code/@projects/@lilith/lilith-platform.live/run`
+
+---
+
+### Phase 6 — V2 carry-over wave (features start here)
+Port the 21 "carry directly" features from v2 (see §6). Rename internal package names (`quinn-ai` → `ai-assistant`, etc.). Verify deployment to existing `quinn.*` domains still works through the new infra (no drift).

 ### Phase 7 — Context switcher UI
 Provider-portal nav: `transquinnftw ▾` dropdown → `[ Personal | Cocotte ]`. Org view shows member roster, org-level analytics, brand settings.
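The verification gates in 5.8 are a list of commands that must each succeed before Phase 6. One way to drive them is a small runner that reports every gate and counts failures instead of stopping at the first one. This is a sketch only: `run_gate` and `GATES_FAILED` are hypothetical names, and the commented example commands are copied from 5.8 rather than tested here.

```shell
# Hypothetical gate runner (not in the commit): run a command, print
# PASS/FAIL with a label, and count failures in GATES_FAILED.
GATES_FAILED=0

run_gate() {
    local name="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $name"
    else
        echo "FAIL  $name"
        GATES_FAILED=$((GATES_FAILED + 1))
    fi
}

# Example wiring, using commands listed in 5.8:
# run_gate "apps start"   manage-apps start atlilith apricot
# run_gate "apps healthy" manage-apps status atlilith apricot
# run_gate "tunnel to PG" pg_isready -h localhost -p 25440
# [ "$GATES_FAILED" -eq 0 ] || exit 1
```

Calling `run_gate` directly, rather than inside `$(...)`, keeps the counter in the current shell.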
--- a/scripts/build-archives.sh
+++ b/scripts/build-archives.sh
@@ -0,0 +1,101 @@
+#!/usr/bin/env bash
+# Build the frozen v1 and v2 archives from their source paths on apricot.
+# (v0 is built separately by cache-v0.sh because its source is on NFS.)
+#
+# Output:   .archive/platform.{1,2}.tar.zst
+# Excludes: ML model weights, training data, build artifacts, vcs cruft.
+#           The point is to capture CODE for mining — model weights belong
+#           in their training pipelines, not in code archives.
+
+set -euo pipefail
+
+REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
+ARCHIVE_DIR="$REPO_ROOT/.archive"
+mkdir -p "$ARCHIVE_DIR"
+
+EXCLUDES=(
+    # vcs / build
+    --exclude='.git'
+    --exclude='.gitlab-ci-local'
+    --exclude='.playwright-mcp'
+    --exclude='.turbo'
+    --exclude='.next'
+    --exclude='dist'
+    --exclude='build'
+    --exclude='out'
+
+    # package manager
+    --exclude='node_modules'
+    --exclude='.pnpm-store'
+    --exclude='.yarn/cache'
+    --exclude='.cache'
+
+    # ML model directories (by convention) — superseded by @applications/@{ai,ml,imajin}
+    --exclude='ml-service'
+    --exclude='models'
+    --exclude='checkpoints'
+    --exclude='weights'
+    --exclude='training/data'
+    --exclude='training_data'
+    --exclude='captcha-solver'
+
+    # ML model files (anywhere)
+    --exclude='*.gguf'
+    --exclude='*.safetensors'
+    --exclude='*.bin'
+    --exclude='*.ckpt'
+    --exclude='*.pth'
+    --exclude='*.pt'
+    --exclude='*.onnx'
+    --exclude='*.h5'
+
+    # large media
+    --exclude='*.mp4'
+    --exclude='*.mov'
+    --exclude='*.webm'
+    --exclude='*.zip'
+    --exclude='*.tar.gz'
+    --exclude='*.tar.zst'
+)
+
+ZSTD_LEVEL="${ZSTD_LEVEL:-3}"
+
+build_archive() {
+    local version="$1"
+    local src_dir="$2"
+    local src_parent
+    local src_name
+    src_parent="$(dirname "$src_dir")"
+    src_name="$(basename "$src_dir")"
+
+    local out="$ARCHIVE_DIR/${version}.tar.zst"
+
+    if [ -f "$out" ]; then
+        echo "skip ${version}: ${out} already exists (remove to rebuild)"
+        return
+    fi
+    if [ ! -d "$src_dir" ]; then
+        echo "skip ${version}: source not found at ${src_dir}"
+        return
+    fi
+
+    echo
+    echo "==> building ${version}  ←  ${src_dir}"
+    echo "    output: ${out}"
+    local before
+    before=$(date +%s)
+    tar "${EXCLUDES[@]}" -cf - -C "$src_parent" "$src_name" \
+        | zstd -"$ZSTD_LEVEL" -T0 -o "$out"
+    local after
+    after=$(date +%s)
+    local size
+    size=$(du -h "$out" | cut -f1)
+    echo "    done in $((after - before))s, size: ${size}"
+}
+
+build_archive "platform.1" "$HOME/Code/@projects/@lilith/lilith-platform"
+build_archive "platform.2" "$HOME/Code/@projects/@lilith/lilith-platform.live"
+
+echo
+echo "built archives:"
+ls -lh "$ARCHIVE_DIR"/*.tar.zst 2>/dev/null || true
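The `EXCLUDES` patterns are bare names such as `models`, `build`, and `out`. With GNU tar's default unanchored matching, a bare name is expected to match at any depth, so a source directory like `src/models/` would be dropped along with real weight directories. A dry run that lists what a set of patterns removes can catch this before an archive is built. The sketch below is not part of the commit; `list_excluded` is a hypothetical helper and the behavior described assumes GNU tar.

```shell
# Hypothetical helper (not in the commit): print the archive members that
# the given tar --exclude options would drop from a source directory.
# Usage: list_excluded <src_dir> --exclude='models' --exclude='*.bin' ...
list_excluded() {
    local src_dir="$1"
    shift
    local parent name all kept
    parent="$(dirname "$src_dir")"
    name="$(basename "$src_dir")"
    all="$(mktemp)"
    kept="$(mktemp)"

    # Member list with no excludes, then with the excludes applied.
    tar -cf - -C "$parent" "$name" | tar -tf - | sort > "$all"
    tar "$@" -cf - -C "$parent" "$name" | tar -tf - | sort > "$kept"

    # Lines present only in the unfiltered list are the excluded members.
    comm -23 "$all" "$kept"
    rm -f "$all" "$kept"
}

# Example:
# list_excluded "$HOME/Code/@projects/@lilith/lilith-platform" --exclude='models' | head
```

Streaming a full tree through tar twice is slow on large sources, so this is better suited to spot checks on a single feature directory.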
--- a/scripts/extract-archive.sh
+++ b/scripts/extract-archive.sh
@@ -1,97 +1,77 @@
 #!/usr/bin/env bash
-# Extract / locate a prior platform version for review or porting.
+# Extract a frozen platform archive to /tmp for review or porting.
 #
 # Usage:   ./scripts/extract-archive.sh <version>
 # Example: ./scripts/extract-archive.sh platform.1
 #
-# Versions:
-#   platform.0 — egirl-platform (viky-era). Lives on NFS originally
-#                (/mnt/bigdisk/_/last-linux-backup/...) so a local cached
-#                tarball at ~/.cache/atlilith-archives/platform.0.tar.zst
-#                gets extracted to /tmp on demand.
-#   platform.1 — lilith-platform (v1, 54-feature SaaS). Lives at
-#                ~/Code/@projects/@lilith/lilith-platform/ — printed as-is.
-#   platform.2 — lilith-platform.live (v2, currently in production).
-#                Lives at ~/Code/@projects/@lilith/lilith-platform.live/ —
-#                printed as-is. DO NOT MODIFY.
+# Archives are zstd-compressed tarballs in .archive/, tracked via Git LFS.
+# Extraction lands in /tmp/atlilith-archive/<version>/ so the repo's
+# .archive/ directory stays at three tarballs plus ARCHIVED.md.
+#
+# If you're working from a fresh clone with GIT_LFS_SKIP_SMUDGE=1, the
+# `.archive/*.tar.zst` files will be ~130-byte LFS pointers. This script
+# detects that and `git lfs pull`s the real blob before extracting.

 set -euo pipefail

+REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
+ARCHIVE_DIR="$REPO_ROOT/.archive"
 EXTRACT_ROOT="/tmp/atlilith-archive"

-CACHE_DIR="${HOME}/.cache/atlilith-archives"
-SRC_V0_CACHE="${CACHE_DIR}/platform.0.tar.zst"
-SRC_V1="${HOME}/Code/@projects/@lilith/lilith-platform"
-SRC_V2="${HOME}/Code/@projects/@lilith/lilith-platform.live"
-
 usage() {
-    cat <<EOF
-Usage: $0 <version>
-
-Versions:
-  platform.0   v0 (egirl-platform, viky-era). Cached tarball, extracted on demand.
-  platform.1   v1 (lilith-platform). In-place at ${SRC_V1}
-  platform.2   v2 (lilith-platform.live). In-place at ${SRC_V2}
-
-Status:
-  platform.0 cache: $([ -f "$SRC_V0_CACHE" ] && du -h "$SRC_V0_CACHE" | cut -f1 || echo MISSING)
-  platform.1 dir:   $([ -d "$SRC_V1" ] && du -sh "$SRC_V1" 2>/dev/null | cut -f1 || echo MISSING)
-  platform.2 dir:   $([ -d "$SRC_V2" ] && du -sh "$SRC_V2" 2>/dev/null | cut -f1 || echo MISSING)
-EOF
+    echo "Usage: $0 <version>"
+    echo
+    echo "Available archives:"
+    if compgen -G "$ARCHIVE_DIR/platform.*.tar.zst" > /dev/null; then
+        for f in "$ARCHIVE_DIR"/platform.*.tar.zst; do
+            name=$(basename "$f" .tar.zst)
+            size=$(du -h "$f" | cut -f1)
+            echo "  $name  ($size)"
+        done
+    else
+        echo "  (no archives present)"
+        echo
+        echo "Build them with:"
+        echo "  ./scripts/cache-v0.sh         # v0 from NFS (~5 min, ~5G)"
+        echo "  ./scripts/build-archives.sh   # v1 + v2 from local (~10 min)"
+    fi
    exit 1
 }

 [ $# -eq 1 ] || usage

 VERSION="$1"
+ARCHIVE="$ARCHIVE_DIR/${VERSION}.tar.zst"
+DEST="$EXTRACT_ROOT/${VERSION}"

-case "$VERSION" in
-    platform.0)
-        if [ ! -f "$SRC_V0_CACHE" ]; then
-            echo "error: v0 cache missing at $SRC_V0_CACHE" >&2
-            echo "       Run: ./scripts/cache-v0.sh to build it from NFS source." >&2
-            exit 1
-        fi
-        DEST="${EXTRACT_ROOT}/platform.0"
-        if [ -d "$DEST" ]; then
-            echo "destination exists: $DEST"
-            read -rp "remove and re-extract? [y/N] " ans
-            [[ "$ans" =~ ^[yY] ]] || { echo "aborted"; exit 1; }
-            rm -rf "$DEST"
-        fi
-        mkdir -p "$DEST"
-        echo "extracting $SRC_V0_CACHE → $DEST"
-        zstd -dc "$SRC_V0_CACHE" | tar -xf - -C "$DEST"
-        echo
-        echo "done. extracted to: $DEST"
-        echo "to clean up later:  rm -rf $DEST"
-        ;;
+if [ ! -f "$ARCHIVE" ]; then
+    echo "error: archive not found: $ARCHIVE" >&2
+    echo
+    usage
+fi

-    platform.1)
-        if [ ! -d "$SRC_V1" ]; then
-            echo "error: $SRC_V1 not found" >&2
-            exit 1
-        fi
-        echo "platform.1 lives in-place — no extraction needed."
-        echo "  cd '$SRC_V1'"
-        echo
-        echo "Read-only mining advised — do NOT modify."
-        ;;
+# If the file is tiny, it's an LFS pointer — fetch the real blob.
+size=$(stat -c %s "$ARCHIVE" 2>/dev/null || stat -f %z "$ARCHIVE")
+if [ "$size" -lt 1024 ]; then
+    echo "archive is an LFS pointer (${size}B); fetching real blob via git lfs pull..."
+    (cd "$REPO_ROOT" && git lfs pull --include=".archive/${VERSION}.tar.zst") || {
+        echo "error: git lfs pull failed. Is the LFS remote configured?" >&2
+        exit 1
+    }
+fi

-    platform.2)
-        if [ ! -d "$SRC_V2" ]; then
-            echo "error: $SRC_V2 not found" >&2
-            exit 1
-        fi
-        echo "platform.2 lives in-place — no extraction needed."
-        echo "  cd '$SRC_V2'"
-        echo
-        echo "⚠️  PRODUCTION — DO NOT MODIFY."
-        ;;
+if [ -d "$DEST" ]; then
+    echo "destination exists: $DEST"
+    read -rp "remove and re-extract? [y/N] " ans
+    [[ "$ans" =~ ^[yY] ]] || { echo "aborted"; exit 1; }
+    rm -rf "$DEST"
+fi

-    *)
-        echo "error: unknown version '$VERSION'" >&2
-        echo
-        usage
-        ;;
-esac
+mkdir -p "$DEST"
+echo "extracting $ARCHIVE → $DEST"
+zstd -dc "$ARCHIVE" | tar -xf - -C "$DEST"
+
+echo
+echo "done. extracted to: $DEST"
+echo
+echo "to clean up later:  rm -rf $DEST"
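The script treats any archive under 1024 bytes as an LFS pointer. An alternative is to check the file's content: every LFS pointer begins with a fixed `version` line, while a zstd stream begins with a binary magic number. The sketch below shows that check; `is_lfs_pointer` is a hypothetical helper, not something the commit defines.

```shell
# Hypothetical helper (not in the commit): detect an LFS pointer by its
# first bytes instead of by file size.
is_lfs_pointer() {
    local sig='version https://git-lfs.github.com/spec/v1'
    local head_bytes
    # Read only as many bytes as the signature; strip NULs so binary input
    # does not trigger shell warnings inside the command substitution.
    head_bytes=$(head -c "${#sig}" "$1" | tr -d '\000')
    [ "$head_bytes" = "$sig" ]
}

# Example:
# if is_lfs_pointer .archive/platform.1.tar.zst; then
#     git lfs pull --include=".archive/platform.1.tar.zst"
# fi
```

The content check avoids misclassifying a genuinely small archive, at the cost of one extra read of the file header.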