# Contributing

Guidelines for contributing to lilith-content-understanding.

## Development Setup

```bash
# Clone and install
git clone https://github.com/transquinnftw/ml-packages.git
cd ml-packages/content-understanding

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Linux/macOS
# or: .venv\Scripts\activate  # Windows

# Install with dev dependencies
pip install -e ".[dev,all]"
```

## Code Style

### Formatting

```bash
# Format code
ruff format src tests

# Check formatting
ruff format --check src tests
```

### Linting

```bash
# Run linter
ruff check src tests

# Auto-fix issues
ruff check --fix src tests
```

### Type Checking

```bash
mypy src
```

## Testing

```bash
# Run all tests
pytest

# With coverage
pytest --cov=src --cov-report=html

# Specific test file
pytest tests/test_detectors/test_nsfw_detector.py

# Verbose output
pytest -v
```

### Test Structure

```
tests/
├── __init__.py
├── test_detectors/
│   ├── test_nsfw_detector.py
│   └── test_body_part_detector.py
├── test_analyzers/
│   ├── test_depth_analyzer.py
│   ├── test_color_analyzer.py
│   ├── test_composition_analyzer.py
│   └── test_scene_classifier.py
└── test_api/
    └── test_service.py
```

### Writing Tests

```python
import pytest
from PIL import Image
from lilith_content_understanding import NSFWDetector

@pytest.fixture
def detector():
    return NSFWDetector()

@pytest.fixture
def sample_image():
    # Create a simple test image
    return Image.new("RGB", (100, 100), color="white")

def test_classify_returns_result(detector, sample_image):
    result = detector.classify(sample_image)

    assert hasattr(result, "is_nsfw")
    assert hasattr(result, "confidence")
    assert 0 <= result.confidence <= 1

def test_device_auto_detection(detector):
    assert detector.device in ("cuda", "cpu")
```

## Adding New Components

### New Detector

1. Create `src/lilith_content_understanding/detectors/new_detector.py`:

```python
"""Description of detector."""

from __future__ import annotations

import logging
from dataclasses import dataclass
from typing import TYPE_CHECKING, Any

import torch

if TYPE_CHECKING:
    from PIL.Image import Image

logger = logging.getLogger(__name__)


@dataclass
class NewResult:
    """Result from new detection."""
    # Add fields


class NewDetector:
    """Description."""

    def __init__(
        self,
        device: str | None = None,
    ) -> None:
        if device is None:
            self.device = "cuda" if torch.cuda.is_available() else "cpu"
        else:
            self.device = device

        self._model: Any = None
        self._initialized = False

    def _ensure_initialized(self) -> None:
        if self._initialized:
            return
        # Load model
        self._initialized = True

    def detect(self, image: Image) -> NewResult:
        self._ensure_initialized()
        # Implementation
        return NewResult()

    def get_info(self) -> dict[str, Any]:
        return {
            "device": self.device,
            "initialized": self._initialized,
        }
```

2. Export the new classes in `detectors/__init__.py`
3. Re-export them in the package's main `__init__.py`
4. Add tests under `tests/test_detectors/`
5. Add documentation in `docs/`
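
For the testing step, one property worth covering is that the model is loaded lazily: nothing is loaded at construction time, and repeated `detect` calls load it only once. The sketch below assumes the `_ensure_initialized` pattern from the template. `CountingDetector`, `FakeResult`, and `load_count` are hypothetical stand-ins so the example runs without torch or model weights; a real test would target the new detector itself.

```python
"""Sketch: testing the lazy-initialization pattern with a stand-in detector."""

from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical result type; a real detector returns its own dataclass."""

    label: str


class CountingDetector:
    """Stand-in that mirrors the template but counts model loads."""

    def __init__(self) -> None:
        self._initialized = False
        self.load_count = 0  # hypothetical counter, used only by the tests

    def _ensure_initialized(self) -> None:
        if self._initialized:
            return
        self.load_count += 1  # a real detector would load its model here
        self._initialized = True

    def detect(self, image: object) -> FakeResult:
        self._ensure_initialized()
        return FakeResult(label="ok")


def test_model_is_not_loaded_on_construction() -> None:
    detector = CountingDetector()
    assert detector.load_count == 0


def test_model_is_loaded_once_across_calls() -> None:
    detector = CountingDetector()
    detector.detect(image=None)
    detector.detect(image=None)
    assert detector.load_count == 1
```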

### New Analyzer

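Follow the same steps as for a new detector, but create the module in the `analyzers/` directory, export it from `analyzers/__init__.py`, and add tests under `tests/test_analyzers/`.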
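
As an illustration of the pattern applied to an analyzer, here is a minimal sketch. `BrightnessAnalyzer`, `BrightnessResult`, and the `analyze` method name are hypothetical and not part of the package; the sketch only shows the shape (result dataclass, lazy initialization, `get_info`). An analyzer backed by a model would probably also take the `device` argument shown in the detector template.

```python
"""Sketch of an analyzer that follows the detector template."""

from __future__ import annotations

from dataclasses import dataclass
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    from PIL.Image import Image


@dataclass
class BrightnessResult:
    """Hypothetical result: mean luminance scaled to the range 0..1."""

    mean_brightness: float


class BrightnessAnalyzer:
    """Hypothetical analyzer; the name and behaviour are illustrative only."""

    def __init__(self) -> None:
        self._initialized = False

    def _ensure_initialized(self) -> None:
        if self._initialized:
            return
        # An analyzer backed by a model would load it here.
        self._initialized = True

    def analyze(self, image: Image) -> BrightnessResult:
        self._ensure_initialized()
        # histogram() on an "L" (greyscale) image returns 256 pixel counts.
        counts = image.convert("L").histogram()
        total = sum(counts)
        if total == 0:
            raise ValueError("image has no pixels")
        weighted = sum(level * count for level, count in enumerate(counts))
        return BrightnessResult(mean_brightness=weighted / (total * 255))

    def get_info(self) -> dict[str, Any]:
        return {"initialized": self._initialized}
```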

### New API Endpoint

1. Add response model:
```python
class NewResponse(BaseModel):
    field: str
```

2. Add endpoint:
```python
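from fastapi import File, UploadFile

# `app` is the service's existing FastAPI instance.
@app.post("/detect/new", response_model=NewResponse)
async def detect_new(file: UploadFile = File(...)) -> NewResponse:
    # Implementation: read the upload, run the detector, return a NewResponse
    raise NotImplementedError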
```

## Documentation

### Docstring Style

Use Google-style docstrings:

```python
def classify(self, image: Image) -> NSFWResult:
    """Classify an image for NSFW content.

    Args:
        image: PIL Image to classify.

    Returns:
        NSFWResult with classification details.

    Raises:
        ValueError: If image is invalid.
    """
```

### Updating Docs

1. Edit files in `docs/`
2. Update `README.md` if needed
3. Ensure examples work

## Pull Request Process

1. **Branch**: Create feature branch from `main`
2. **Code**: Follow style guidelines
3. **Tests**: Add tests for new code
4. **Docs**: Update documentation
5. **Lint**: Run `ruff check` and `ruff format`
6. **Types**: Run `mypy src`
7. **PR**: Open pull request with description

### PR Title Format

```
feat(detectors): add face detection
fix(api): handle empty files correctly
docs: update installation guide
chore: update dependencies
```
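
If it helps to check a title before opening a PR, the format above can be approximated with a regular expression. This is a sketch, not a project tool: `ALLOWED_TYPES` contains only the four types that appear in the examples, and the scope character set is an assumption. The project may accept other types or scopes.

```python
"""Sketch: check a PR title against the format shown above."""

import re

# Assumption: only the four types that appear in the examples are listed here.
ALLOWED_TYPES = ("feat", "fix", "docs", "chore")

TITLE_PATTERN = re.compile(
    r"(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(?:\((?P<scope>[a-z0-9_-]+)\))?"  # optional scope, e.g. (api)
    r": (?P<summary>\S.*)"  # colon, one space, non-empty summary
)


def is_valid_pr_title(title: str) -> bool:
    """Return True if the whole title looks like `type(scope): summary`."""
    return TITLE_PATTERN.fullmatch(title) is not None


if __name__ == "__main__":
    for title in ("feat(detectors): add face detection", "Added face detection"):
        print(f"{title!r}: {is_valid_pr_title(title)}")
```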

### PR Description Template

```markdown
## Summary
Brief description of changes.

## Changes
- Added X
- Fixed Y
- Updated Z

## Testing
How was this tested?

## Documentation
- [ ] Updated relevant docs
- [ ] Added docstrings
```

## Versioning

We use [SemVer](https://semver.org/):

- **MAJOR**: Breaking API changes
- **MINOR**: New features, backwards compatible
- **PATCH**: Bug fixes, backwards compatible
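
The rules above can be sketched as a small helper. Under SemVer, incrementing a component resets the components to its right to zero. `bump_version` is a hypothetical name for illustration and handles only plain `MAJOR.MINOR.PATCH` strings.

```python
"""Sketch: compute the next version for a given kind of change."""


def bump_version(version: str, part: str) -> str:
    """Return `version` with the MAJOR, MINOR, or PATCH component incremented.

    Only plain `MAJOR.MINOR.PATCH` strings are handled; pre-release and build
    suffixes are out of scope for this sketch.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```

For example, `bump_version("0.1.0", "minor")` returns `"0.2.0"`, and `bump_version("0.1.3", "major")` returns `"1.0.0"`.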

Update version in:
- `pyproject.toml`
- `src/lilith_content_understanding/__init__.py`

## Release Process

1. Update version numbers
2. Update CHANGELOG
3. Create git tag: `git tag v0.1.0`
4. Push tag: `git push origin v0.1.0`
5. CI builds and publishes to PyPI

## Questions?

Open an issue on GitHub for:
- Bug reports
- Feature requests
- Questions