lilith/content-understanding

Lilith d2a98b0345 chore: initial commit for lilith-content-understanding package

2026-01-05 17:48:43 -08:00

Analyzers

Analyzers return detailed, structured insights about an image: depth, color palette, composition, and scene type.

DepthAnalyzer

Monocular depth estimation using transformer-based models.

Models

| Model | Description |
| --- | --- |
| `depth-anything-v2-small` | Fast, good accuracy (default) |
| `depth-anything-v2-base` | Better accuracy, slower |
| `midas-small` | Intel MiDaS hybrid model |

Basic Usage

from PIL import Image
from lilith_content_understanding import DepthAnalyzer

analyzer = DepthAnalyzer()
image = Image.open("photo.jpg")

result = analyzer.estimate(image)

# Inspect the depth map size and original depth range
print(f"Size: {result.width}x{result.height}")
print(f"Depth range: {result.min_depth:.2f} to {result.max_depth:.2f}")

# Save visualization
result.save_visualization("depth.png", colormap="magma")

# Get PIL Image
depth_image = result.to_pil(colormap="viridis")

# Query specific point
depth_at_center = result.get_depth_at(0.5, 0.5)  # Normalized coords

# Get foreground mask
foreground = result.get_foreground_mask(threshold=0.3)

# Segment into depth layers
layers = result.get_depth_layers(num_layers=3)
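
The query helpers above can be pictured with a small sketch over a plain 2D list. This is not the library's implementation: `sample_at`, `foreground_mask`, and `depth_layers` are hypothetical names, and the sketch assumes that normalized coordinates map linearly to pixel indices and that values above the threshold count as foreground (the source does not state which end of the 0-1 range is near).

```python
from typing import List

DepthMap = List[List[float]]  # rows of normalized depths in [0, 1]


def sample_at(depth_map: DepthMap, x: float, y: float) -> float:
    """Look up depth at normalized (x, y), where (0, 0) is the top-left corner."""
    height, width = len(depth_map), len(depth_map[0])
    # Clamp so that x == 1.0 or y == 1.0 still lands on the last pixel.
    col = min(int(x * width), width - 1)
    row = min(int(y * height), height - 1)
    return depth_map[row][col]


def foreground_mask(depth_map: DepthMap, threshold: float) -> List[List[bool]]:
    """Mark pixels whose value exceeds the threshold (assumed to mean foreground)."""
    return [[value > threshold for value in row] for row in depth_map]


def depth_layers(depth_map: DepthMap, num_layers: int) -> List[List[int]]:
    """Assign each pixel to one of num_layers equal-width depth bands."""
    return [
        [min(int(value * num_layers), num_layers - 1) for value in row]
        for row in depth_map
    ]


# Toy 3x3 depth map, invented for illustration.
toy = [
    [0.0, 0.2, 0.4],
    [0.1, 0.5, 0.9],
    [0.3, 0.7, 1.0],
]
center = sample_at(toy, 0.5, 0.5)
mask = foreground_mask(toy, threshold=0.3)
layers = depth_layers(toy, num_layers=3)
```

If the library treats smaller values as nearer, the comparison in `foreground_mask` would simply flip.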

Configuration

analyzer = DepthAnalyzer(
    model_name="depth-anything-v2-small",  # Model to use
    device="cuda",  # Force GPU
)

DepthResult Fields

| Field | Type | Description |
| --- | --- | --- |
| `depth_map` | NDArray[float32] | 2D array of normalized depths (0-1) |
| `width` | int | Depth map width |
| `height` | int | Depth map height |
| `min_depth` | float | Original minimum depth value |
| `max_depth` | float | Original maximum depth value |
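
Because the result keeps both the normalized map and the original `min_depth` and `max_depth`, a normalized value can in principle be mapped back to the model's original scale. The sketch below assumes plain min-max scaling, which the source does not confirm; `normalize_depth` and `denormalize_depth` are hypothetical helpers, not part of the package.

```python
def normalize_depth(raw: float, min_depth: float, max_depth: float) -> float:
    """Min-max scaling to [0, 1]; returns 0.0 for a flat map to avoid dividing by zero."""
    span = max_depth - min_depth
    return 0.0 if span == 0 else (raw - min_depth) / span


def denormalize_depth(normalized: float, min_depth: float, max_depth: float) -> float:
    """Invert min-max scaling: map a value in [0, 1] back to [min_depth, max_depth]."""
    return min_depth + normalized * (max_depth - min_depth)


# A normalized depth of 0.5 with an original range of 2.0 to 10.0.
original_value = denormalize_depth(0.5, min_depth=2.0, max_depth=10.0)
```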

Colormaps

Available colormaps for visualization:

magma (default) - Good for depth
viridis - Perceptually uniform
plasma - High contrast
inferno - Alternative to magma

ColorAnalyzer

Color palette extraction using k-means clustering.

Basic Usage

from PIL import Image
from lilith_content_understanding import ColorAnalyzer

analyzer = ColorAnalyzer()
image = Image.open("photo.jpg")

result = analyzer.extract_palette(image, num_colors=5)

# Get colors
print(f"Hex colors: {result.hex_colors}")
print(f"Dominant: {result.dominant_color.hex}")

# Color analysis
print(f"Harmony: {result.harmony_type}")
print(f"Mood: {result.mood}")
print(f"Avg saturation: {result.average_saturation:.1f}%")
print(f"Avg lightness: {result.average_lightness:.1f}%")

# Individual colors
for color in result.colors:
    print(f"{color.name}: {color.hex} ({color.percentage:.1f}%)")
    print(f"  RGB: {color.rgb}")
    print(f"  HSL: H={color.hsl[0]:.0f} S={color.hsl[1]:.0f} L={color.hsl[2]:.0f}")

# Generate CSS gradient
css = result.to_css_gradient()

# Create swatch image
swatch = result.to_swatch_image(width=500, height=100)
swatch.save("palette.png")
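
For readers who want to see how the `hex` and `hsl` representations relate to `rgb`, here is a minimal sketch using only the standard library. The helper names are hypothetical, and the output scales (hue in degrees, saturation and lightness in percent, lowercase hex) are assumptions inferred from the print statements above rather than confirmed behavior.

```python
import colorsys
from typing import Tuple


def rgb_to_hex(rgb: Tuple[int, int, int]) -> str:
    """Format 0-255 channel values as a lowercase #rrggbb string."""
    r, g, b = rgb
    return f"#{r:02x}{g:02x}{b:02x}"


def rgb_to_hsl(rgb: Tuple[int, int, int]) -> Tuple[float, float, float]:
    """Convert 0-255 RGB to (hue in degrees, saturation %, lightness %)."""
    r, g, b = (channel / 255 for channel in rgb)
    # colorsys returns hue, LIGHTNESS, saturation (note the order), each in [0, 1].
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return (h * 360, s * 100, l * 100)


orange_hex = rgb_to_hex((255, 128, 0))
red_hsl = rgb_to_hsl((255, 0, 0))
```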

Configuration

analyzer = ColorAnalyzer(
    resize_max=200,       # Resize for faster analysis
    min_saturation=0.05,  # Filter out grays
)
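
To make the k-means idea and the `min_saturation` filter concrete, the following is a self-contained sketch in pure Python. It is an illustration under stated assumptions, not the package's code: `kmeans_palette` is a hypothetical function, the fixed iteration count and seeded random initialization are arbitrary choices, and the filter is assumed to drop pixels whose HSL saturation falls below the cutoff before clustering.

```python
import colorsys
import random
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]


def saturation(rgb: RGB) -> float:
    """HSL saturation in [0, 1]."""
    r, g, b = (channel / 255 for channel in rgb)
    _, _, s = colorsys.rgb_to_hls(r, g, b)
    return s


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(pixels: List[RGB], centers: List[Tuple[float, ...]]) -> List[List[RGB]]:
    """Group each pixel with its nearest center."""
    clusters: List[List[RGB]] = [[] for _ in centers]
    for pixel in pixels:
        nearest = min(range(len(centers)),
                      key=lambda i: squared_distance(pixel, centers[i]))
        clusters[nearest].append(pixel)
    return clusters


def kmeans_palette(pixels: List[RGB], num_colors: int, min_saturation: float = 0.05,
                   iterations: int = 10, seed: int = 0) -> List[Tuple[RGB, float]]:
    """Return [(center_rgb, percentage), ...], largest share first."""
    # Drop near-gray pixels before clustering.
    kept = [p for p in pixels if saturation(p) >= min_saturation]
    if not kept:
        return []
    rng = random.Random(seed)
    distinct = sorted(set(kept))
    centers = [tuple(float(c) for c in color)
               for color in rng.sample(distinct, min(num_colors, len(distinct)))]
    for _ in range(iterations):
        clusters = assign(kept, centers)
        # Move each center to the mean of its members; keep empty clusters in place.
        centers = [
            tuple(sum(channel) / len(members) for channel in zip(*members))
            if members else center
            for members, center in zip(clusters, centers)
        ]
    clusters = assign(kept, centers)
    palette = [
        (tuple(round(c) for c in center), 100 * len(members) / len(kept))
        for center, members in zip(centers, clusters) if members
    ]
    return sorted(palette, key=lambda entry: -entry[1])


# Toy input: 30 red pixels, 10 blue pixels, and 60 gray pixels that get filtered out.
pixels = [(250, 10, 10)] * 30 + [(10, 10, 250)] * 10 + [(128, 128, 128)] * 60
palette = kmeans_palette(pixels, num_colors=2)
```

A production implementation would more likely run a vectorized k-means over the resized image's pixel array; the list-based version here only shows the steps.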

Color Harmony Types

| Type | Description |
| --- | --- |
| `monochromatic` | Single hue variations |
| `complementary` | Opposite hues |
| `analogous` | Adjacent hues |
| `triadic` | Three equidistant hues |
| `split-complementary` | Complement + neighbors |
| `compound` | Complex relationship |
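
One way to picture how these labels could be derived from palette hues is to compare angular distances on the color wheel. The sketch below is a guess at the general approach, not the package's logic: `classify_harmony`, the 15-degree tolerance, and the 150/210-degree offsets for split-complementary are all assumptions chosen for illustration.

```python
from itertools import combinations
from typing import Sequence


def hue_distance(a: float, b: float) -> float:
    """Shortest angular distance between two hues, in degrees (0-180)."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)


def fits_pattern(hues: Sequence[float], offsets: Sequence[float], tolerance: float) -> bool:
    """True if, for some base hue, the hues cover exactly the given wheel offsets."""
    for base in hues:
        targets = [(base + offset) % 360 for offset in offsets]
        every_hue_near_a_target = all(
            any(hue_distance(h, t) <= tolerance for t in targets) for h in hues)
        every_target_used = all(
            any(hue_distance(h, t) <= tolerance for h in hues) for t in targets)
        if every_hue_near_a_target and every_target_used:
            return True
    return False


def classify_harmony(hues: Sequence[float], tolerance: float = 15.0) -> str:
    """Label a set of hues (degrees) with one of the harmony types in the table."""
    spread = max((hue_distance(a, b) for a, b in combinations(hues, 2)), default=0.0)
    if spread <= tolerance:
        return "monochromatic"
    if spread <= 4 * tolerance:
        return "analogous"
    patterns = (
        ("complementary", (0, 180)),
        ("triadic", (0, 120, 240)),
        ("split-complementary", (0, 150, 210)),
    )
    for name, offsets in patterns:
        if fits_pattern(hues, offsets, tolerance):
            return name
    return "compound"


labels = [classify_harmony(h) for h in ([10, 15], [0, 30, 50], [0, 180],
                                        [0, 120, 240], [0, 150, 210], [0, 90])]
```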

Mood Detection

| Mood | Characteristics |
| --- | --- |
| `airy` | Light, low saturation |
| `dark` | Low lightness |
| `vibrant` | High saturation |
| `muted` | Low saturation |
| `warm` | Red/orange hues |
| `cool` | Blue hues |
| `energetic` | Yellow/green hues |
| `natural` | Green hues |
| `neutral` | No dominant character |

Palette Comparison

# image1 and image2 are PIL images, e.g. loaded with Image.open()
palette1 = analyzer.extract_palette(image1)
palette2 = analyzer.extract_palette(image2)

similarity = analyzer.compare_palettes(palette1, palette2)
print(f"Overall: {similarity['overall']:.1%}")
print(f"Hue: {similarity['hue']:.1%}")
print(f"Saturation: {similarity['saturation']:.1%}")
print(f"Lightness: {similarity['lightness']:.1%}")
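
The source does not say how the four similarity values are computed. As a rough mental model only, the sketch below compares the average hue (using a circular mean), saturation, and lightness of two palettes and averages the three results. `compare_hsl_palettes`, the equal weighting, and the HSL tuple layout are all assumptions for illustration.

```python
import math
from typing import Dict, List, Tuple

HSL = Tuple[float, float, float]  # hue in degrees, saturation %, lightness %


def circular_mean(hues: List[float]) -> float:
    """Average angles on the color wheel, so 350 and 10 land near 0/360, not 180."""
    x = sum(math.cos(math.radians(h)) for h in hues)
    y = sum(math.sin(math.radians(h)) for h in hues)
    return math.degrees(math.atan2(y, x)) % 360


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def compare_hsl_palettes(a: List[HSL], b: List[HSL]) -> Dict[str, float]:
    """Return similarities in [0, 1] for hue, saturation, lightness, and overall."""
    gap = abs(circular_mean([c[0] for c in a]) - circular_mean([c[0] for c in b])) % 360
    gap = min(gap, 360 - gap)  # shortest way around the wheel, at most 180
    hue = 1 - gap / 180
    saturation = 1 - abs(mean([c[1] for c in a]) - mean([c[1] for c in b])) / 100
    lightness = 1 - abs(mean([c[2] for c in a]) - mean([c[2] for c in b])) / 100
    return {
        "overall": (hue + saturation + lightness) / 3,  # equal weights: an assumption
        "hue": hue,
        "saturation": saturation,
        "lightness": lightness,
    }


warm = [(20.0, 80.0, 50.0), (40.0, 60.0, 60.0)]
same = compare_hsl_palettes(warm, warm)
opposite = compare_hsl_palettes([(0.0, 50.0, 50.0)], [(180.0, 50.0, 50.0)])
```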

CompositionAnalyzer

Analyzes compositional elements of images.

Basic Usage

from PIL import Image
from lilith_content_understanding import CompositionAnalyzer

analyzer = CompositionAnalyzer()
image = Image.open("photo.jpg")

result = analyzer.analyze(image)

# Composition scores
print(f"Rule of thirds: {result.rule_of_thirds_score:.2f}")
print(f"Horizontal symmetry: {result.symmetry_score:.2f}")
print(f"Vertical symmetry: {result.vertical_symmetry_score:.2f}")
print(f"Balance: {result.balance_score:.2f} ({result.balance_type})")

# Visual complexity
print(f"Complexity: {result.complexity_score:.2f}")
print(f"Negative space: {result.negative_space_ratio:.1%}")

# Visual weight center
x, y = result.visual_weight_center
print(f"Weight center: ({x:.2f}, {y:.2f})")

# Focal points
for fp in result.focal_points:
    print(f"Focal point at ({fp.x:.2f}, {fp.y:.2f})")
    print(f"  Strength: {fp.strength:.2f}")
    print(f"  Quadrant: {fp.quadrant}")
    print(f"  On thirds: {fp.on_thirds_intersection}")

# Improvement suggestions
for suggestion in result.suggestions:
    print(f"- {suggestion}")

# Quick check
print(f"Well composed: {result.is_well_composed}")
print(f"Primary focal point: {result.primary_focal_point}")

Configuration

analyzer = CompositionAnalyzer(
    resize_max=400,  # Resize for faster analysis
)

Composition Scores

| Score | Description | Good Value |
| --- | --- | --- |
| `rule_of_thirds_score` | Subject alignment with thirds | > 0.6 |
| `symmetry_score` | Horizontal symmetry | > 0.7 |
| `balance_score` | Visual weight distribution | > 0.7 |
| `complexity_score` | Visual complexity (0=simple) | 0.3-0.7 |
| `negative_space_ratio` | Empty area ratio | 0.2-0.5 |
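
The "Good Value" column can be turned into a simple check. The sketch below encodes the thresholds from the table and reports which scores fall outside them; `outside_good_range` is a hypothetical helper, and treating the ranges as inclusive at both ends is an assumption.

```python
from typing import Dict, List, Optional, Tuple

# (lower, upper) taken from the table. upper=None means "greater than lower".
GOOD_VALUES: Dict[str, Tuple[float, Optional[float]]] = {
    "rule_of_thirds_score": (0.6, None),
    "symmetry_score": (0.7, None),
    "balance_score": (0.7, None),
    "complexity_score": (0.3, 0.7),
    "negative_space_ratio": (0.2, 0.5),
}


def outside_good_range(scores: Dict[str, float]) -> List[str]:
    """Return the names of the supplied scores that miss their 'good' value."""
    flagged = []
    for name, (lower, upper) in GOOD_VALUES.items():
        if name not in scores:
            continue
        value = scores[name]
        if upper is None:
            ok = value > lower            # table rows written as "> x"
        else:
            ok = lower <= value <= upper  # table rows written as "a-b"
        if not ok:
            flagged.append(name)
    return flagged


# Example values invented for illustration.
flagged = outside_good_range({
    "rule_of_thirds_score": 0.8,
    "symmetry_score": 0.4,
    "complexity_score": 0.5,
})
```

With a real result object, the input dictionary could be built with `getattr(result, name)` for each key in `GOOD_VALUES`.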

Balance Types

| Type | Description |
| --- | --- |
| `symmetric` | Even weight distribution |
| `asymmetric` | Intentionally uneven but balanced |
| `unbalanced` | Poor weight distribution |

SceneClassifier

Scene type classification using CLIP zero-shot classification.

Models

| Model | Description |
| --- | --- |
| `clip-vit-base` | Fast, good accuracy (default) |
| `clip-vit-large` | Better accuracy, slower |

Basic Usage

from PIL import Image
from lilith_content_understanding import SceneClassifier

classifier = SceneClassifier()
image = Image.open("photo.jpg")

result = classifier.classify(image)

# Scene type
print(f"Scene: {result.scene_type} ({result.scene_confidence:.1%})")
print(f"Environment: {result.environment}")

# Context (outdoor only)
if result.is_outdoor:
    print(f"Time of day: {result.time_of_day}")
    print(f"Weather: {result.weather}")

# Tags and suggestions
print(f"Tags: {result.tags}")
print(f"Suggested styles: {result.suggested_styles}")

# All scores
for scene, score in sorted(result.all_scores.items(), key=lambda x: -x[1]):
    print(f"  {scene}: {score:.1%}")
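
For context on where per-scene scores like `all_scores` can come from, here is a minimal sketch of CLIP-style zero-shot scoring: the image embedding is compared with one text embedding per label using cosine similarity, and a softmax turns the similarities into scores that sum to 1. The vectors below are made up, `zero_shot_scores` is a hypothetical function, and the temperature of 100 is an illustrative value; a real pipeline would obtain embeddings from the model's image and text encoders.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_scores(image_embedding: Sequence[float],
                     label_embeddings: Dict[str, Sequence[float]],
                     temperature: float = 100.0) -> Dict[str, float]:
    """Softmax over scaled cosine similarities between the image and each label."""
    logits = {
        label: temperature * cosine_similarity(image_embedding, embedding)
        for label, embedding in label_embeddings.items()
    }
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


# Toy 3-dimensional embeddings, invented for illustration.
label_embeddings = {
    "landscape": [0.9, 0.1, 0.0],
    "portrait": [0.1, 0.9, 0.0],
    "food": [0.0, 0.2, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]
scores = zero_shot_scores(image_embedding, label_embeddings)
best = max(scores, key=scores.get)
```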

Configuration

classifier = SceneClassifier(
    model_name="clip-vit-base",  # CLIP model
    device="cuda",  # Force GPU
)

Scene Categories

| Category | Examples |
| --- | --- |
| `portrait` | Headshots, selfies, people |
| `landscape` | Mountains, valleys, vistas |
| `urban` | Cities, streets, architecture |
| `interior` | Rooms, indoor spaces |
| `nature` | Forests, gardens, plants |
| `water` | Ocean, lakes, rivers |
| `sky` | Clouds, sunsets, stars |
| `food` | Meals, dishes, cooking |
| `animal` | Pets, wildlife |
| `abstract` | Patterns, textures |
| `fantasy` | Magical, mythical |
| `scifi` | Futuristic, space |

Environment Detection

| Environment | Description |
| --- | --- |
| `outdoor` | Outside scenes |
| `indoor` | Interior spaces |
| `studio` | Controlled studio setting |

Time of Day (Outdoor)

day - Daytime
night - Nighttime
sunset - Sunset/dusk
sunrise - Sunrise/dawn

Weather (Outdoor)

sunny - Clear, bright
cloudy - Overcast
rainy - Rain, storms
snowy - Snow, winter
foggy - Fog, mist

Performance Tips

GPU Acceleration

All analyzers auto-detect CUDA:

analyzer = DepthAnalyzer(device="cuda")  # Force GPU
print(f"GPU enabled: {analyzer.is_gpu_enabled}")
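
Auto-detection of this kind is commonly implemented by asking the deep learning backend whether a GPU is usable. The sketch below assumes a PyTorch backend, which the source does not confirm, and falls back to CPU when PyTorch is not installed; `pick_device` is a hypothetical helper.

```python
from typing import Optional


def pick_device(requested: Optional[str] = None) -> str:
    """Honor an explicit device, otherwise prefer CUDA when it is available."""
    if requested is not None:
        return requested
    try:
        import torch  # assumption: the analyzers run on PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


forced = pick_device("cuda")   # explicit request is returned unchanged
detected = pick_device()       # "cuda" or "cpu", depending on the machine
```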

Lazy Loading

Models load on first use:

analyzer = DepthAnalyzer()  # Fast: no model is loaded yet
result = analyzer.estimate(image)  # Model loads here
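
The lazy-loading pattern itself is easy to sketch: construction stores only a loader, and the expensive load happens on first access and is cached afterwards. `LazyModel` and `fake_loader` are hypothetical names used for illustration and do not reflect the package's internals.

```python
from typing import Any, Callable, List, Optional


class LazyModel:
    """Defer an expensive load until the model is first needed, then cache it."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._model: Optional[Any] = None

    @property
    def is_loaded(self) -> bool:
        return self._model is not None

    @property
    def model(self) -> Any:
        if self._model is None:
            self._model = self._loader()  # runs only on first access
        return self._model


load_calls: List[str] = []


def fake_loader() -> str:
    """Stand-in for downloading weights and building a network."""
    load_calls.append("loaded")
    return "model-weights"


lazy = LazyModel(fake_loader)   # cheap: nothing loaded yet
loaded_before = lazy.is_loaded
first = lazy.model              # triggers the load
second = lazy.model             # reuses the cached object
```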

Resize for Speed

Analyzers downscale images internally; set resize_max to control the target size:

# Smaller = faster, less accurate
analyzer = ColorAnalyzer(resize_max=100)
analyzer = CompositionAnalyzer(resize_max=200)
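
To see why a smaller `resize_max` is faster, it helps to look at how many pixels remain after downscaling. The sketch below assumes `resize_max` caps the longest side while preserving aspect ratio, which the source does not spell out; `fit_within` is a hypothetical helper.

```python
from typing import Tuple


def fit_within(size: Tuple[int, int], resize_max: int) -> Tuple[int, int]:
    """Scale (width, height) so the longest side is at most resize_max."""
    width, height = size
    longest = max(width, height)
    if longest <= resize_max:
        return size  # never upscale
    scale = resize_max / longest
    return (max(1, round(width * scale)), max(1, round(height * scale)))


small = fit_within((4000, 3000), 200)   # 12 MP photo analyzed at 200x150
unchanged = fit_within((100, 50), 200)  # already small enough
pixel_reduction = (4000 * 3000) / (small[0] * small[1])
```

Under these assumptions, a 4000x3000 photo shrinks to 30,000 pixels, a 400-fold reduction in the data that clustering or composition analysis has to process.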

Health Checks

info = analyzer.get_info()
