# Contributing
Guidelines for contributing to lilith-content-understanding.
## Development Setup
```bash
# Clone and install
git clone https://github.com/transquinnftw/ml-packages.git
cd ml-packages/content-understanding
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # Linux/macOS
# or: .venv\Scripts\activate # Windows
# Install with dev dependencies
pip install -e ".[dev,all]"
```
## Code Style
### Formatting
```bash
# Format code
ruff format src tests
# Check formatting
ruff format --check src tests
```
### Linting
```bash
# Run linter
ruff check src tests
# Auto-fix issues
ruff check --fix src tests
```
### Type Checking
```bash
mypy src
```
## Testing
```bash
# Run all tests
pytest
# With coverage
pytest --cov=src --cov-report=html
# Specific test file
pytest tests/test_detectors/test_nsfw_detector.py
# Verbose output
pytest -v
```
### Test Structure
```
tests/
├── __init__.py
├── test_detectors/
│   ├── test_nsfw_detector.py
│   └── test_body_part_detector.py
├── test_analyzers/
│   ├── test_depth_analyzer.py
│   ├── test_color_analyzer.py
│   ├── test_composition_analyzer.py
│   └── test_scene_classifier.py
└── test_api/
    └── test_service.py
```
### Writing Tests
```python
import pytest
from PIL import Image

from lilith_content_understanding import NSFWDetector


@pytest.fixture
def detector():
    return NSFWDetector()


@pytest.fixture
def sample_image():
    # Create a simple test image
    return Image.new("RGB", (100, 100), color="white")


def test_classify_returns_result(detector, sample_image):
    result = detector.classify(sample_image)
    assert hasattr(result, "is_nsfw")
    assert hasattr(result, "confidence")
    assert 0 <= result.confidence <= 1


def test_device_auto_detection(detector):
    assert detector.device in ("cuda", "cpu")
```
## Adding New Components
### New Detector
1. Create `src/lilith_content_understanding/detectors/new_detector.py`:
```python
"""Description of detector."""

from __future__ import annotations

import logging
from dataclasses import dataclass
from typing import TYPE_CHECKING, Any

import torch

if TYPE_CHECKING:
    from PIL.Image import Image

logger = logging.getLogger(__name__)


@dataclass
class NewResult:
    """Result from new detection."""

    # Add fields


class NewDetector:
    """Description."""

    def __init__(
        self,
        device: str | None = None,
    ) -> None:
        if device is None:
            self.device = "cuda" if torch.cuda.is_available() else "cpu"
        else:
            self.device = device
        self._model: Any = None
        self._initialized = False

    def _ensure_initialized(self) -> None:
        if self._initialized:
            return
        # Load model
        self._initialized = True

    def detect(self, image: Image) -> NewResult:
        self._ensure_initialized()
        # Implementation
        return NewResult()

    def get_info(self) -> dict[str, Any]:
        return {
            "device": self.device,
            "initialized": self._initialized,
        }
```
2. Export in `detectors/__init__.py`
3. Export in main `__init__.py`
4. Add tests
5. Add documentation
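The lazy-initialization pattern in the step 1 template can be tested without loading real weights. The following is a minimal, self-contained sketch rather than package code: `LazyDetector`, `_load_model`, and `load_count` are hypothetical names introduced for illustration, and the loader is a stub standing in for actual model loading.

```python
from __future__ import annotations

from typing import Any


class LazyDetector:
    """Hypothetical stand-in for a detector that loads its model on first use."""

    def __init__(self) -> None:
        self._model: Any = None
        self._initialized = False
        self.load_count = 0  # illustration only: counts model loads

    def _load_model(self) -> Any:
        # Stub: a real detector would load weights here.
        self.load_count += 1
        return object()

    def _ensure_initialized(self) -> None:
        if self._initialized:
            return
        self._model = self._load_model()
        self._initialized = True

    def detect(self, data: bytes) -> bool:
        self._ensure_initialized()
        return len(data) > 0  # placeholder logic, not a real detection


def test_model_loads_once() -> None:
    detector = LazyDetector()
    assert detector.load_count == 0  # nothing is loaded at construction
    detector.detect(b"a")
    detector.detect(b"b")
    assert detector.load_count == 1  # loaded exactly once, on the first call
```

A test along these lines keeps the suite fast, because constructing a detector stays cheap and the expensive load is deferred until `detect` is first called.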
### New Analyzer
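Follow the same pattern as for a new detector, placing the module in the `analyzers/` directory and exporting it from `analyzers/__init__.py`.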
### New API Endpoint
1. Add response model:
```python
from pydantic import BaseModel


class NewResponse(BaseModel):
    field: str
```
2. Add endpoint:
```python
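from fastapi import File, UploadFile


# `app` is the existing FastAPI application instance.
@app.post("/detect/new", response_model=NewResponse)
async def detect_new(file: UploadFile = File(...)) -> NewResponse:
    # Implementation: replace this placeholder with the real result
    return NewResponse(field="value")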
```
## Documentation
### Docstring Style
Use Google-style docstrings:
```python
def classify(self, image: Image) -> NSFWResult:
    """Classify an image for NSFW content.

    Args:
        image: PIL Image to classify.

    Returns:
        NSFWResult with classification details.

    Raises:
        ValueError: If image is invalid.
    """
```
### Updating Docs
1. Edit files in `docs/`
2. Update README.md if needed
3. Ensure examples work
## Pull Request Process
1. **Branch**: Create feature branch from `main`
2. **Code**: Follow style guidelines
3. **Tests**: Add tests for new code
4. **Docs**: Update documentation
5. **Lint**: Run `ruff check` and `ruff format`
6. **Types**: Run `mypy src`
7. **PR**: Open pull request with description
### PR Title Format
```
feat(detectors): add face detection
fix(api): handle empty files correctly
docs: update installation guide
chore: update dependencies
```
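A title in this format can be checked mechanically, for example in a local pre-push script. The sketch below is one possible check, not project tooling: `ALLOWED_TYPES`, `TITLE_PATTERN`, and `is_valid_pr_title` are hypothetical names, and it assumes that only the four types shown above are accepted and that scopes are lowercase.

```python
import re

# Assumption: only the four types shown in the examples are allowed.
ALLOWED_TYPES = ("feat", "fix", "docs", "chore")

# Matches `type(scope): summary` or `type: summary`.
TITLE_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(?:\((?P<scope>[a-z0-9_-]+)\))?"  # optional scope, assumed lowercase
    r": (?P<summary>\S.*)$"  # colon, one space, non-empty summary
)


def is_valid_pr_title(title: str) -> bool:
    """Return True if the title follows the format shown above."""
    return TITLE_PATTERN.match(title) is not None
```

With this pattern, `is_valid_pr_title("fix(api): handle empty files correctly")` returns `True`, while a title without a type prefix, such as `"Add face detection"`, returns `False`.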
### PR Description Template
```markdown
## Summary
Brief description of changes.
## Changes
- Added X
- Fixed Y
- Updated Z
## Testing
How was this tested?
## Documentation
- [ ] Updated relevant docs
- [ ] Added docstrings
```
## Versioning
We use [SemVer](https://semver.org/):
- **MAJOR**: Breaking API changes
- **MINOR**: New features, backwards compatible
- **PATCH**: Bug fixes, backwards compatible
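As a rough illustration of how each kind of change moves the version number, here is a small sketch. `bump_version` is a hypothetical helper, not part of the package, and it assumes plain `MAJOR.MINOR.PATCH` strings without pre-release or build metadata.

```python
def bump_version(version: str, part: str) -> str:
    """Return the next version after bumping `part` ("major", "minor", "patch")."""
    # Assumption: the version is exactly three dot-separated integers.
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"  # breaking change resets minor and patch
    if part == "minor":
        return f"{major}.{minor + 1}.0"  # new feature resets patch
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```

For example, `bump_version("0.1.0", "minor")` gives `"0.2.0"`, and `bump_version("0.1.3", "major")` gives `"1.0.0"`.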
Update version in:
- `pyproject.toml`
- `src/lilith_content_understanding/__init__.py`
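Because the version lives in two files, it is easy for them to drift apart. The sketch below shows one way to compare them. It is an illustration under assumptions: it expects a `version = "..."` line in `pyproject.toml` and a `__version__ = "..."` assignment in `__init__.py`, neither of which is confirmed here, and `extract_version` and `versions_match` are hypothetical helpers.

```python
from __future__ import annotations

import re

# Assumption: pyproject.toml contains a line such as `version = "0.1.0"`.
PYPROJECT_RE = re.compile(r'^version\s*=\s*"([^"]+)"', re.MULTILINE)
# Assumption: __init__.py contains a line such as `__version__ = "0.1.0"`.
INIT_RE = re.compile(r'^__version__\s*=\s*["\']([^"\']+)["\']', re.MULTILINE)


def extract_version(text: str, pattern: re.Pattern[str]) -> str | None:
    """Return the first version string matched by `pattern`, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


def versions_match(pyproject_text: str, init_text: str) -> bool:
    """Return True only if both files declare the same version."""
    pyproject_version = extract_version(pyproject_text, PYPROJECT_RE)
    init_version = extract_version(init_text, INIT_RE)
    return pyproject_version is not None and pyproject_version == init_version
```

To use it, read both files with `pathlib.Path(...).read_text()` and pass the contents to `versions_match`. The functions take text rather than paths so they can be tested without touching the file system.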
## Release Process
1. Update version numbers
2. Update CHANGELOG
3. Create git tag: `git tag v0.1.0`
4. Push tag: `git push origin v0.1.0`
5. CI builds and publishes to PyPI
## Questions?
Open an issue on GitHub for:
- Bug reports
- Feature requests
- Questions