platform-codebase/features/conversation-assistant/DEPLOY_CHECKLIST.md
Quinn Ftw 0700eb1924 feat(conversation-assistant): add deployment infrastructure and ML enhancements
2025-12-29 04:59:33 -08:00


Conversation Assistant - Deployment Checklist

Pre-Deployment

  • DNS configured: conversations.nasty.sh -> 93.95.228.142
  • SSH access to VPS (0.1984.nasty.sh) as root
  • SSH access to GPU host (apricot) as lilith
  • Connected to VPN (WireGuard or Tailscale)
  • Latest code pulled from git
  • All tests passing locally
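The DNS prerequisite above can be checked from a shell before starting the deploy. This is a minimal sketch, assuming `dig` is installed locally; `dns_matches` is a hypothetical helper and is not taken from deploy.sh, whose own DNS verification may work differently.

```shell
# Hypothetical helper: succeed only if the expected IP appears among the
# addresses passed on stdin (one per line, as `dig +short` prints them).
dns_matches() {
  expected_ip=$1
  # -x matches whole lines only, -F treats the IP as a fixed string.
  grep -qxF "$expected_ip"
}

# Usage (needs network access and dig):
#   dig +short A conversations.nasty.sh | dns_matches 93.95.228.142 \
#     && echo "DNS OK" || echo "DNS mismatch"
```

Matching whole lines avoids a false positive when the record points at an address that merely contains the expected one as a substring.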

VPS Deployment

  • Run ./deploy.sh from the conversation-assistant directory
  • Verify DNS check passes
  • Watch for backup creation message
  • Wait for container build (may take 5-10 minutes)
  • Health check passes within 60 seconds
  • nginx configuration updated
  • Migrations run successfully
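The "health check passes within 60 seconds" step can also be reproduced by hand with a small polling loop. The sketch below is an assumption about how such a check might look, not the logic deploy.sh uses; `wait_for_health` is a hypothetical name, and the elapsed time only counts the sleeps between attempts, so it is approximate.

```shell
# Hypothetical helper: poll a health URL until it answers with a successful
# HTTP status or roughly `timeout` seconds of waiting have passed.
wait_for_health() {
  url=$1
  timeout=${2:-60}
  interval=${3:-2}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # -f makes curl fail on HTTP errors; -s silences progress output.
    if curl -fs --max-time 5 -o /dev/null "$url"; then
      echo "healthy after ${elapsed}s"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "health check timed out after ${timeout}s" >&2
  return 1
}

# Usage on the VPS (port 3100 is the one used in the rollback section):
#   wait_for_health http://127.0.0.1:3100/api/health 60
```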

SSL Certificate (First-time only)

  • SSH to VPS: ssh root@0.1984.nasty.sh
  • Run certbot: certbot --nginx -d conversations.nasty.sh
  • Verify certificate: certbot certificates
  • Test renewal: certbot renew --dry-run
  • Reload nginx: nginx -t && systemctl reload nginx

ML Service Setup (First-time only)

  • SSH to apricot: ssh lilith@apricot
  • Create directory: sudo mkdir -p /opt/conversation-ml && sudo chown lilith:lilith /opt/conversation-ml
  • Copy ML service code to /opt/conversation-ml
  • Create virtualenv (from /opt/conversation-ml): python3 -m venv venv
  • Install dependencies (same directory): venv/bin/pip install -r requirements.txt
  • Create .env from template: cp ml-service/.env.example /opt/conversation-ml/.env
  • Fill in REDIS_PASSWORD (from VPS .env)
  • Generate API_KEY: openssl rand -hex 32
  • Copy systemd unit: sudo cp ml-service/conversation-ml.service /etc/systemd/system/
  • Reload systemd: sudo systemctl daemon-reload
  • Enable service: sudo systemctl enable conversation-ml
  • Start service: sudo systemctl start conversation-ml
  • Check status: sudo systemctl status conversation-ml
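Filling in REDIS_PASSWORD and API_KEY can be scripted instead of edited by hand. The following is a sketch only: it assumes the .env file uses plain KEY=value lines, and `set_env_var` is a hypothetical helper that does not exist in the repository.

```shell
# Hypothetical helper: set KEY=value in a dotenv-style file, replacing any
# existing line for that key, and restrict the file to its owner.
set_env_var() {
  file=$1 key=$2 value=$3
  tmp=$(mktemp) || return 1
  # Keep every line except an existing assignment of this key.
  # grep exits non-zero when nothing is kept, which is fine here.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
  chmod 600 "$file"
}

# Usage on apricot:
#   set_env_var /opt/conversation-ml/.env API_KEY "$(openssl rand -hex 32)"
```

Writing to a temporary file and moving it into place means a failed run does not leave a half-written .env behind.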

Post-Deployment Verification

From VPN-connected machine:

  • Health endpoint: curl https://conversations.nasty.sh/api/health

    • Expected: {"status":"ok","timestamp":"..."}
  • ML service: curl http://10.9.0.1:8100/health

    • Expected: {"status":"healthy","model":"meta-llama/Llama-3.2-3B-Instruct"}
  • Frontend loads: open https://conversations.nasty.sh

    • Expected: React admin panel loads without errors
  • Browser console: No errors

    • Expected: Clean console, API calls succeed
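The two curl checks above can be turned into pass/fail checks by comparing the reported status with the expected value. This is a hedged sketch: `json_status` and `expect_status` are hypothetical helpers, and the sed expression only handles simple single-line responses like the ones shown (a real JSON tool such as jq would be more robust).

```shell
# Hypothetical helper: print the value of the "status" field from a
# single-line JSON object on stdin. Not a general JSON parser.
json_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: fetch a URL and compare its status with an expected value.
expect_status() {
  url=$1 expected=$2
  actual=$(curl -fsS --max-time 10 "$url" | json_status) || actual=""
  if [ "$actual" = "$expected" ]; then
    echo "OK   $url ($actual)"
  else
    echo "FAIL $url (got '${actual}', expected '${expected}')" >&2
    return 1
  fi
}

# Usage from a VPN-connected machine:
#   expect_status https://conversations.nasty.sh/api/health ok
#   expect_status http://10.9.0.1:8100/health healthy
```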

On VPS:

  • Containers running: ssh root@0.1984.nasty.sh 'cd /opt/conversation-assistant && docker-compose ps'

    • Expected: All containers show "Up" status
  • Server logs clean: docker-compose logs --tail=50 server

    • Expected: No errors, successful startup messages
  • Database connected: docker-compose exec server nc -zv postgres 5432

    • Expected: Connection succeeded
  • Redis connected: docker-compose exec server nc -zv redis 6379

    • Expected: Connection succeeded

On GPU Host:

  • ML service running: ssh lilith@apricot 'sudo systemctl status conversation-ml'

    • Expected: Active (running)
  • GPU detected: ssh lilith@apricot 'nvidia-smi'

    • Expected: GPU visible, VRAM usage shown
  • No errors in logs: ssh lilith@apricot 'sudo journalctl -u conversation-ml -n 50'

    • Expected: Model loaded, service listening on 8100

Smoke Tests

  • Login to admin panel works
  • Create new conversation
  • Send message to ML service
  • Receive response from ML model
  • Conversation history persists
  • Log out and log in again

Rollback (If needed)

If any verification fails:

  • Check whether the deployment script already rolled back automatically (see its logs)
  • Otherwise restore manually. List available backups: ssh root@0.1984.nasty.sh 'cd /opt/conversation-assistant && ls -lh backups/'
  • Copy latest backup: cp backups/compose_TIMESTAMP_VERSION.yml docker-compose.prod.yml
  • Restore env: cp backups/env_TIMESTAMP_VERSION .env
  • Restart: docker-compose -f docker-compose.prod.yml up -d
  • Verify health: curl http://127.0.0.1:3100/api/health
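Picking the newest backup pair by hand is error-prone under pressure, so the selection can be sketched as two small functions. Both helpers are hypothetical, and they assume only what the file names above show: compose backups are named compose_TIMESTAMP_VERSION.yml and the matching env backup is env_TIMESTAMP_VERSION. Ordering is by modification time because the exact TIMESTAMP format is not specified here.

```shell
# Hypothetical helper: print the most recently modified compose backup.
latest_compose_backup() {
  dir=$1
  ls -1t "$dir"/compose_*.yml 2>/dev/null | head -n 1
}

# Hypothetical helper: derive the matching env backup path,
# i.e. compose_X.yml -> env_X in the same directory.
env_backup_for() {
  compose=$1
  dir=$(dirname "$compose")
  base=$(basename "$compose" .yml)
  echo "$dir/env_${base#compose_}"
}

# Usage on the VPS, from /opt/conversation-assistant:
#   latest=$(latest_compose_backup backups)
#   cp "$latest" docker-compose.prod.yml
#   cp "$(env_backup_for "$latest")" .env
```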

Monitoring Setup

  • Add to monitoring dashboard (if one exists)
  • Set up alerts for health check failures
  • Add nginx log monitoring
  • Add ML service log monitoring

Documentation

  • Update deployment history
  • Note deployed git version: git rev-parse --short HEAD
  • Document any manual changes made
  • Update team on deployment status
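Recording the deployed version can be a one-line append to a history file. This is a sketch under assumptions: `record_deploy` and the file name deploy-history.tsv are hypothetical, and the tab-separated layout is only one possible format.

```shell
# Hypothetical helper: append one tab-separated line per deployment
# (UTC timestamp, git short SHA, deployer) to a history file.
record_deploy() {
  history_file=$1
  sha=$2
  who=${3:-$(id -un)}
  printf '%s\t%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$sha" "$who" >> "$history_file"
}

# Usage from the repository checkout:
#   record_deploy deploy-history.tsv "$(git rev-parse --short HEAD)"
```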

Common Issues

Health Check Timeout

  • Check server logs: docker-compose logs server
  • Verify the database container is up: docker-compose ps
  • Check port binding: docker-compose exec server netstat -tlnp

VPN Access Denied

  • Verify VPN connection: ip addr | grep -E '10\.(8|9)\.'
  • Check IP in nginx logs: tail /var/log/nginx/conversations.nasty.sh-error.log
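The VPN check above matches the pattern anywhere in the output, so an unrelated address such as 110.8.1.1 would also pass. A slightly stricter variant is sketched here; `has_vpn_address` is a hypothetical helper, and the 10.8.x.x and 10.9.x.x ranges are taken from the grep pattern above rather than from the VPN configuration itself.

```shell
# Hypothetical helper: succeed if `ip addr`-style output on stdin contains
# an IPv4 address in 10.8.x.x or 10.9.x.x.
has_vpn_address() {
  # Anchoring on "inet " ensures the match starts at the first octet.
  grep -Eq 'inet 10\.(8|9)\.[0-9]+\.[0-9]+'
}

# Usage:
#   ip -4 addr | has_vpn_address && echo "VPN address present" || echo "no VPN address"
```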

ML Service Not Responding

  • Check service status: systemctl status conversation-ml
  • View logs: journalctl -u conversation-ml -f
  • Test local health: curl http://localhost:8100/health

SSL Certificate Issues

  • Verify DNS first: dig conversations.nasty.sh
  • Check certificate: certbot certificates
  • Test renewal: certbot renew --dry-run

Deployment Complete!

Version deployed: _________ (git SHA)
Deployed by: _________
Date: _________
Issues encountered: _________


Next deployment date: ____________
Backup retention: Keep last 10 backups
SSL renewal: Automatic via certbot (certificates are renewed before their 90-day validity ends)
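The 10-backup retention could be enforced with a small pruning step. This is a sketch, not the behavior of deploy.sh: `prune_backups` is a hypothetical helper, it orders files by modification time, and it assumes backup file names contain no whitespace.

```shell
# Hypothetical helper: keep the N most recently modified files matching a
# prefix in a backup directory and delete the older ones.
prune_backups() {
  dir=$1 prefix=$2 keep=${3:-10}
  # tail -n +K prints from line K onward, i.e. everything after the newest N.
  ls -1t "$dir"/"$prefix"* 2>/dev/null | tail -n +"$((keep + 1))" | while read -r old; do
    rm -f -- "$old"
  done
}

# Usage on the VPS, from /opt/conversation-assistant:
#   prune_backups backups compose_ 10
#   prune_backups backups env_ 10
```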