Evaluation API
This page documents the evaluation APIs for testing watermark robustness and quality.
Datasets
Dataset classes for loading prompts and test data.
MSCOCODataset
- class evaluation.dataset.MSCOCODataset
Dataset for loading MS-COCO captions and images.
- Parameters:
parquet_file – Path to the parquet file containing COCO data
max_samples – Maximum number of samples to load (optional)
StableDiffusionPromptsDataset
- class evaluation.dataset.StableDiffusionPromptsDataset
Dataset for loading text prompts for Stable Diffusion.
- Parameters:
parquet_file – Path to the parquet file containing prompts
max_samples – Maximum number of samples to load (optional)
VBenchDataset
- class evaluation.dataset.VBenchDataset
Dataset for loading VBench video prompts.
- Parameters:
prompt_file – Path to the text file containing prompts
max_samples – Maximum number of samples to load (optional)
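As a rough illustration of how a prompt_file and max_samples pair can behave, the sketch below reads one prompt per line from a text file and stops after max_samples entries. The function name load_prompts is hypothetical, and this is not the library's implementation; in particular, skipping blank lines is an assumption.

```python
from itertools import islice
from pathlib import Path
from typing import List, Optional


def load_prompts(prompt_file: str, max_samples: Optional[int] = None) -> List[str]:
    """Hypothetical helper: read one prompt per line, keep at most max_samples.

    Blank lines are skipped (an assumption, not documented library behavior).
    max_samples=None keeps every prompt.
    """
    with Path(prompt_file).open(encoding="utf-8") as handle:
        prompts = (line.strip() for line in handle if line.strip())
        # islice with stop=None consumes the whole iterator
        return list(islice(prompts, max_samples))
```

Leaving max_samples unset loads the whole file, which matches the "optional" wording in the parameter lists above.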
Evaluation Pipelines
Detection Pipelines
- class evaluation.pipelines.detection.WatermarkedMediaDetectionPipeline
Pipeline for evaluating detection performance on watermarked media.
Key Methods:
run(watermark, dataset, **kwargs) - Run detection evaluation
get_results() - Get evaluation results
- class evaluation.pipelines.detection.UnWatermarkedMediaDetectionPipeline
Pipeline for evaluating false positive rate on unwatermarked media.
Key Methods:
run(watermark, dataset, **kwargs) - Run detection evaluation
get_results() - Get evaluation results
Quality Analysis Pipelines
- class evaluation.pipelines.image_quality_analysis.DirectImageQualityAnalysisPipeline
Pipeline for analyzing image quality directly, without a reference image.
Key Methods:
run(watermark, dataset, quality_analyzers, **kwargs) - Run quality analysis
get_results() - Get analysis results
- class evaluation.pipelines.video_quality_analysis.DirectVideoQualityAnalysisPipeline
Pipeline for analyzing video quality.
Key Methods:
run(watermark, dataset, quality_analyzers, **kwargs) - Run quality analysis
get_results() - Get analysis results
Evaluation Tools
Image Attacks/Editors
Common image attack methods for testing watermark robustness:
- class evaluation.tools.image_editor.JPEGCompression
JPEG compression attack.
- Parameters:
quality – JPEG quality (0-100)
- class evaluation.tools.image_editor.GaussianBlur
Gaussian blur attack.
- Parameters:
kernel_size – Size of the Gaussian kernel
- class evaluation.tools.image_editor.GaussianNoise
Gaussian noise attack.
- Parameters:
std – Standard deviation of the noise
- class evaluation.tools.image_editor.Rotation
Rotation attack.
- Parameters:
angle – Rotation angle in degrees
- class evaluation.tools.image_editor.CenterCrop
Center crop attack.
- Parameters:
crop_ratio – Ratio of image to keep (0-1)
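To make the crop_ratio parameter concrete, here is a minimal sketch of how a centered crop box could be computed. It assumes crop_ratio scales each side length (not the area); the source does not say which interpretation CenterCrop uses, and center_crop_box is a hypothetical helper, not part of the library.

```python
from typing import Tuple


def center_crop_box(width: int, height: int, crop_ratio: float) -> Tuple[int, int, int, int]:
    """Hypothetical helper: return (left, top, right, bottom) of a centered crop.

    Assumes crop_ratio is applied to each side length, so 0.5 keeps half the
    width and half the height (a quarter of the area).
    """
    if not 0.0 < crop_ratio <= 1.0:
        raise ValueError("crop_ratio must be in (0, 1]")
    new_w = max(1, round(width * crop_ratio))
    new_h = max(1, round(height * crop_ratio))
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return left, top, left + new_w, top + new_h
```

The resulting box uses the (left, top, right, bottom) convention, so it could be passed to an image library's crop call that follows the same convention.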
Quality Analyzers
Image quality metrics:
- class evaluation.tools.image_quality_analyzer.PSNRAnalyzer
Peak Signal-to-Noise Ratio analyzer.
- class evaluation.tools.image_quality_analyzer.SSIMAnalyzer
Structural Similarity Index analyzer.
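For readers unfamiliar with the metric, the following sketch computes PSNR from its standard definition, 10 * log10(MAX^2 / MSE), over two flat sequences of pixel values. It is a standalone illustration: the function name psnr and the default max_value of 255 are assumptions, and PSNRAnalyzer may handle inputs, channels, and value ranges differently.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], distorted: Sequence[float], max_value: float = 255.0) -> float:
    """Standard PSNR in dB between two equally sized pixel sequences.

    Illustrative only; not the PSNRAnalyzer implementation.
    """
    if len(reference) == 0 or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the distorted image is closer to the reference; identical inputs give infinity because the mean squared error is zero.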
Video quality metrics:
- class evaluation.tools.video_quality_analyzer.SubjectConsistencyAnalyzer
Video subject consistency analyzer.
Success Rate Calculator
- class evaluation.tools.success_rate_calculator.DynamicThresholdSuccessRateCalculator
Calculate detection success rates with dynamic thresholds.
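The source does not define "dynamic thresholds". One common reading, assumed here, is that the detection threshold is chosen from the scores of unwatermarked samples so that a target false positive rate is met, and the true positive rate is then measured on watermarked samples at that threshold. The sketch below illustrates that idea only; tpr_at_target_fpr and its parameters are hypothetical and do not describe the DynamicThresholdSuccessRateCalculator API.

```python
from typing import Sequence, Tuple


def tpr_at_target_fpr(
    watermarked_scores: Sequence[float],
    unwatermarked_scores: Sequence[float],
    target_fpr: float = 0.01,
) -> Tuple[float, float]:
    """Hypothetical helper: return (threshold, tpr) with detection rule score > threshold.

    The threshold is taken from the unwatermarked scores so that the number of
    false positives does not exceed floor(target_fpr * number_of_negatives).
    """
    if len(watermarked_scores) == 0 or len(unwatermarked_scores) == 0:
        raise ValueError("both score lists must be non-empty")
    negatives = sorted(unwatermarked_scores, reverse=True)
    allowed_fp = int(target_fpr * len(negatives))
    # At most `allowed_fp` negatives are strictly greater than this value.
    threshold = negatives[min(allowed_fp, len(negatives) - 1)]
    tpr = sum(s > threshold for s in watermarked_scores) / len(watermarked_scores)
    return threshold, tpr
```

Because the threshold depends on the observed unwatermarked scores rather than a fixed constant, results from different watermarking methods can be compared at the same false positive budget.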
Example Usage:
from markdiffusion.evaluation.dataset import MSCOCODataset
from markdiffusion.evaluation.pipelines.detection import WatermarkedMediaDetectionPipeline
from markdiffusion.evaluation.tools.image_editor import JPEGCompression
from markdiffusion.evaluation.tools.success_rate_calculator import DynamicThresholdSuccessRateCalculator
# Load dataset
dataset = MSCOCODataset('dataset/mscoco/mscoco.parquet', max_samples=100)
# Create pipeline
pipeline = WatermarkedMediaDetectionPipeline(
    attack=JPEGCompression(quality=50),
    success_rate_calculator=DynamicThresholdSuccessRateCalculator()
)
# Run evaluation (`watermark` is a watermark instance created beforehand; its setup is not shown here)
pipeline.run(watermark, dataset)
results = pipeline.get_results()
print(results)
from markdiffusion.evaluation.pipelines.image_quality_analysis import DirectImageQualityAnalysisPipeline
from markdiffusion.evaluation.dataset import StableDiffusionPromptsDataset
from markdiffusion.evaluation.tools.image_quality_analyzer import PSNRAnalyzer, SSIMAnalyzer
# Load dataset
dataset = StableDiffusionPromptsDataset('dataset/prompts.parquet', max_samples=50)
# Create pipeline with quality analyzers
pipeline = DirectImageQualityAnalysisPipeline(
    quality_analyzers=[PSNRAnalyzer(), SSIMAnalyzer()]
)
# Run analysis (`watermark` is a watermark instance created beforehand; its setup is not shown here)
pipeline.run(watermark, dataset)
results = pipeline.get_results()
print(results)
Note
For detailed evaluation examples and workflows, see Evaluation.