Adds `setup-vllm` action that outputs a full post-mesh script + systemd unit to:
- install vLLM on the raw Ubuntu GPU droplet (post NVIDIA)
- serve a recommended 2026 frontier OSS model (DeepSeek-R1-Distill-Llama-70B / Qwen3 equiv etc.) as OpenAI-compatible server on :8000
- notes for registering with model-boss coordinator so prospect tasks (prospect.classify, prospect.draft) route to the GPU
Updated create next-steps + usage to call it.
Once running + registered:
- quinn.api prospector (runner, classifier paths) uses it via existing model-boss + PROSPECT_LLM_BACKEND=modelboss + draft_engine=task
- Our fast local classifier (added earlier) remains as zero-cost pre-filter
- Pastebin canon stays live/dynamic
- Model updates: restart service or LoRA; no prospector code change needed
Matches uvlava TF GPU patterns (WG 10.9.0.6, raw image) and the replace-claude-deps goal. Use the script for raw/one-off; long-term TF in uvlava with gpu_enabled + custom user_data.