Lilith d2a98b0345 chore: initial commit for lilith-content-understanding package

2026-01-05 17:48:43 -08:00

Contributing

Guidelines for contributing to lilith-content-understanding.

Development Setup

# Clone and install
git clone https://github.com/transquinnftw/ml-packages.git
cd ml-packages/content-understanding

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Linux/macOS
# or: .venv\Scripts\activate  # Windows

# Install with dev dependencies
pip install -e ".[dev,all]"

Code Style

Formatting

# Format code
ruff format src tests

# Check formatting
ruff format --check src tests

Linting

# Run linter
ruff check src tests

# Auto-fix issues
ruff check --fix src tests

Type Checking

mypy src

Testing

# Run all tests
pytest

# With coverage
pytest --cov=src --cov-report=html

# Specific test file
pytest tests/test_detectors/test_nsfw_detector.py

# Verbose output
pytest -v

Test Structure

tests/
├── __init__.py
├── test_detectors/
│   ├── test_nsfw_detector.py
│   └── test_body_part_detector.py
├── test_analyzers/
│   ├── test_depth_analyzer.py
│   ├── test_color_analyzer.py
│   ├── test_composition_analyzer.py
│   └── test_scene_classifier.py
└── test_api/
    └── test_service.py

Writing Tests

import pytest
from PIL import Image
from lilith_content_understanding import NSFWDetector

@pytest.fixture
def detector():
    return NSFWDetector()

@pytest.fixture
def sample_image():
    # Create a simple test image
    return Image.new("RGB", (100, 100), color="white")

def test_classify_returns_result(detector, sample_image):
    result = detector.classify(sample_image)

    assert hasattr(result, "is_nsfw")
    assert hasattr(result, "confidence")
    assert 0 <= result.confidence <= 1

def test_device_auto_detection(detector):
    assert detector.device in ("cuda", "cpu")

Adding New Components

New Detector

Create src/lilith_content_understanding/detectors/new_detector.py:

"""Description of detector."""

from __future__ import annotations

import logging
from dataclasses import dataclass
from typing import TYPE_CHECKING, Any

import torch

if TYPE_CHECKING:
    from PIL.Image import Image

logger = logging.getLogger(__name__)


@dataclass
class NewResult:
    """Result from new detection."""
    # Add fields


class NewDetector:
    """Description."""

    def __init__(
        self,
        device: str | None = None,
    ) -> None:
        if device is None:
            self.device = "cuda" if torch.cuda.is_available() else "cpu"
        else:
            self.device = device

        self._model: Any = None
        self._initialized = False

    def _ensure_initialized(self) -> None:
        if self._initialized:
            return
        # Load model
        self._initialized = True

    def detect(self, image: Image) -> NewResult:
        self._ensure_initialized()
        # Implementation
        return NewResult()

    def get_info(self) -> dict[str, Any]:
        return {
            "device": self.device,
            "initialized": self._initialized,
        }

Export in detectors/__init__.py
Export in the top-level src/lilith_content_understanding/__init__.py
Add tests
Add documentation
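
For the "Add tests" step, here is a minimal sketch of tests that exercise the lazy-initialization pattern from the template above. To stay runnable without torch or model weights, it uses a hypothetical stand-in class, `StubDetector`, and a test-only `load_count` counter; neither exists in the package. In a real test you would import your detector from lilith_content_understanding instead.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class StubResult:
    """Hypothetical result type standing in for NewResult."""

    label: str = "unknown"


class StubDetector:
    """Hypothetical stand-in mirroring the NewDetector template, without torch."""

    def __init__(self, device: Optional[str] = None) -> None:
        # The real template picks "cuda" when available; the stub falls back to "cpu".
        self.device = device if device is not None else "cpu"
        self._model: Any = None
        self._initialized = False
        self.load_count = 0  # test-only counter, not part of the template

    def _ensure_initialized(self) -> None:
        if self._initialized:
            return
        self._model = object()  # placeholder for an expensive model load
        self.load_count += 1
        self._initialized = True

    def detect(self, image: Any) -> StubResult:
        self._ensure_initialized()
        return StubResult()

    def get_info(self) -> Dict[str, Any]:
        return {"device": self.device, "initialized": self._initialized}


def test_model_is_not_loaded_at_construction() -> None:
    detector = StubDetector()
    assert detector.get_info()["initialized"] is False


def test_detect_loads_model_exactly_once() -> None:
    detector = StubDetector()
    detector.detect(image=None)
    detector.detect(image=None)
    assert detector.load_count == 1
    assert detector.get_info()["initialized"] is True


def test_explicit_device_is_respected() -> None:
    detector = StubDetector(device="cpu")
    assert detector.device == "cpu"
```

The point of these checks is that constructing a detector stays cheap and the model is loaded only on first use, which is what `_ensure_initialized` is meant to guarantee.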

New Analyzer

Follow the same pattern in the analyzers/ directory: create the module, export it, then add tests and documentation.

New API Endpoint

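Add response model:

from pydantic import BaseModel

class NewResponse(BaseModel):
    field: str

Add endpoint to the existing FastAPI app:

from fastapi import File, UploadFile

@app.post("/detect/new", response_model=NewResponse)
async def detect_new(file: UploadFile = File(...)) -> NewResponse:
    # Implementation: read the upload, run the detector, return a NewResponse
    raise NotImplementedError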

Documentation

Docstring Style

Use Google-style docstrings:

def classify(self, image: Image) -> NSFWResult:
    """Classify an image for NSFW content.

    Args:
        image: PIL Image to classify.

    Returns:
        NSFWResult with classification details.

    Raises:
        ValueError: If image is invalid.
    """

Updating Docs

Edit files in docs/
Update README.md if needed
Ensure examples work

Pull Request Process

Branch: Create feature branch from main
Code: Follow style guidelines
Tests: Add tests for new code
Docs: Update documentation
Lint: Run ruff check and ruff format
Types: Run mypy src
PR: Open pull request with description
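
The lint, type, and test steps above can be chained into one local check before opening a PR. The following is a sketch only: the function names `run_checks` and `_run` are hypothetical helpers, not part of the repository, and the commands are the ones already listed in this guide.

```python
import subprocess
from typing import Callable, List, Sequence

# Commands taken from the sections above, in checklist order.
CHECKS: List[List[str]] = [
    ["ruff", "check", "src", "tests"],
    ["ruff", "format", "--check", "src", "tests"],
    ["mypy", "src"],
    ["pytest"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_checks(runner: Callable[[Sequence[str]], int] = _run) -> List[List[str]]:
    """Run every check and return the list of commands that failed."""
    failed: List[List[str]] = []
    for cmd in CHECKS:
        print("$ " + " ".join(cmd))
        if runner(cmd) != 0:
            failed.append(list(cmd))
    return failed
```

Calling `run_checks()` from the package root runs all four commands even if an early one fails, so a single pass shows every problem. The `runner` parameter exists only so the loop can be exercised without the tools installed.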

PR Title Format

feat(detectors): add face detection
fix(api): handle empty files correctly
docs: update installation guide
chore: update dependencies
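
The title examples above follow a `type(scope): summary` shape with an optional scope. As an illustration, that shape could be checked with a small regular expression. This is a sketch under assumptions: the allowed types are only the four that appear in the examples, and `is_valid_pr_title` is a hypothetical helper, not an existing project tool.

```python
import re

# Types seen in the examples above; the project may accept others (assumption).
ALLOWED_TYPES = ("feat", "fix", "docs", "chore")

TITLE_RE = re.compile(
    r"^(?P<type>[a-z]+)"                # change type, e.g. "feat"
    r"(?:\((?P<scope>[a-z0-9_-]+)\))?"  # optional scope in parentheses
    r": (?P<summary>\S.*)$"             # colon, one space, non-empty summary
)


def is_valid_pr_title(title: str) -> bool:
    """Return True if the title matches the type(scope): summary shape."""
    match = TITLE_RE.match(title)
    return match is not None and match.group("type") in ALLOWED_TYPES


print(is_valid_pr_title("feat(detectors): add face detection"))
print(is_valid_pr_title("Add face detection"))
```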

PR Description Template

## Summary
Brief description of changes.

## Changes
- Added X
- Fixed Y
- Updated Z

## Testing
How was this tested?

## Documentation
- [ ] Updated relevant docs
- [ ] Added docstrings

Versioning

We use SemVer:

MAJOR: Breaking API changes
MINOR: New features, backwards compatible
PATCH: Bug fixes, backwards compatible
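
To make the three bump rules concrete, here is a small sketch of how each part changes a version string. `bump_version` is a hypothetical helper written for illustration; it handles only plain MAJOR.MINOR.PATCH strings, not pre-release or build suffixes.

```python
def bump_version(version: str, part: str) -> str:
    """Return the next version after bumping the given part."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Breaking change: reset minor and patch.
        return f"{major + 1}.0.0"
    if part == "minor":
        # New backwards-compatible feature: reset patch.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backwards-compatible bug fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump_version("0.1.0", "patch"))  # 0.1.1
print(bump_version("0.1.0", "minor"))  # 0.2.0
print(bump_version("0.1.0", "major"))  # 1.0.0
```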

Update version in:

pyproject.toml
src/lilith_content_understanding/__init__.py
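
Because the version lives in two files, it is easy to update one and forget the other. The sketch below compares the two declarations. It rests on assumptions this guide does not confirm: that pyproject.toml declares a literal `version = "..."` line, and that `__init__.py` defines a `__version__` string. It also uses a regular expression rather than a TOML parser, so it simply takes the first `version = ` line it finds.

```python
import re
from typing import Optional, Pattern

# First top-of-line `version = "..."` in pyproject.toml (assumed literal, not dynamic).
PYPROJECT_VERSION_RE = re.compile(r'^version\s*=\s*"([^"]+)"', re.MULTILINE)
# `__version__ = "..."` in __init__.py (assumed variable name).
INIT_VERSION_RE = re.compile(r'^__version__\s*=\s*"([^"]+)"', re.MULTILINE)


def extract_version(pattern: Pattern[str], text: str) -> Optional[str]:
    """Return the first captured version string, or None if absent."""
    match = pattern.search(text)
    return match.group(1) if match else None


def versions_match(pyproject_text: str, init_text: str) -> bool:
    """True only if both files declare a version and the two agree."""
    declared = extract_version(PYPROJECT_VERSION_RE, pyproject_text)
    exported = extract_version(INIT_VERSION_RE, init_text)
    return declared is not None and declared == exported
```

In practice the two arguments would come from reading pyproject.toml and src/lilith_content_understanding/__init__.py, for example with `pathlib.Path(...).read_text()`.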

Release Process

Update version numbers
Update CHANGELOG
Create git tag: git tag v0.1.0
Push tag: git push origin v0.1.0
CI builds and publishes to PyPI

Questions?

Open an issue on GitHub for:

Bug reports
Feature requests
Questions
