Keep your image collection clean and reliable.
Detection compares image content, so it catches duplicates that simple approaches such as file hashes miss.
Detect duplicates independent of their size (from thumbnails to high-res images).
Detect duplicate images with different file types or encoding quality.
Detect near-duplicate images with a slightly changed appearance (brightness, contrast or saturation).
Removed or changed EXIF or IPTC metadata does not affect duplicate detection.
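To illustrate why content-based detection survives changes that break a file hash, here is a minimal sketch using an average hash on a grayscale pixel grid. The average hash is a stand-in chosen for illustration, not necessarily the algorithm this product uses, and the names `downscale`, `average_hash`, `hamming` and `file_hash` are hypothetical.

```python
import hashlib


def downscale(pixels, size=8):
    """Shrink a grayscale grid to size x size by averaging blocks of pixels."""
    height, width = len(pixels), len(pixels[0])
    small = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels, size=8):
    """Return a size*size-bit fingerprint: one bit per cell, set if above the mean."""
    flat = [v for row in downscale(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def file_hash(pixels):
    """Stand-in for hashing the file: SHA-256 over the raw pixel bytes."""
    raw = bytes(v for row in pixels for v in row)
    return hashlib.sha256(raw).hexdigest()


# A 16x16 gradient, a brighter copy, a 2x enlarged copy and an inverted image.
original = [[(x + y) * 4 for x in range(16)] for y in range(16)]
brighter = [[v + 40 for v in row] for row in original]
enlarged = [[original[y // 2][x // 2] for x in range(32)] for y in range(32)]
inverted = [[120 - v for v in row] for row in original]

print(file_hash(original) == file_hash(brighter))               # byte-level comparison
print(hamming(average_hash(original), average_hash(brighter)))  # content comparison
print(hamming(average_hash(original), average_hash(enlarged)))
print(hamming(average_hash(original), average_hash(inverted)))
```

In this sketch the brighter and enlarged copies keep the same fingerprint as the original even though their bytes differ, while the inverted image lands far away. Metadata such as EXIF or IPTC never enters the fingerprint because only pixel values are read.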
Several options exist to tune and control a scan.
Configure the detection sensitivity to find exact duplicates only or near-duplicates as well.
A score is calculated for every match, indicating how likely it is to be an actual duplicate.
Add filter criteria to only scan parts of your collection. For instance, only scan images uploaded after a specific date or in a specific category.
Periodically scan your complete image collection in the background, or simply trigger a scan before indexing new content.
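The scan options above could be sketched roughly as follows. This assumes each image already has a 64-bit content fingerprint and derives the score from the share of matching bits; the `ImageRecord` type, the `scan` function and its parameters `min_score`, `uploaded_after` and `category` are hypothetical names for illustration, not the product's actual API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ImageRecord:
    doc_id: str
    fingerprint: int  # assumed 64-bit content fingerprint
    uploaded: date
    category: str


def match_score(a, b, bits=64):
    """Similarity in [0, 1]: the share of fingerprint bits that agree."""
    return 1.0 - bin(a ^ b).count("1") / bits


def scan(records, min_score=0.9, uploaded_after=None, category=None):
    """Compare all pairs that pass the filters and keep matches above min_score."""
    candidates = [
        r for r in records
        if (uploaded_after is None or r.uploaded > uploaded_after)
        and (category is None or r.category == category)
    ]
    matches = []
    for i, left in enumerate(candidates):
        for right in candidates[i + 1:]:
            score = match_score(left.fingerprint, right.fingerprint)
            if score >= min_score:
                matches.append((left.doc_id, right.doc_id, score))
    # Most likely duplicates first.
    return sorted(matches, key=lambda m: m[2], reverse=True)


records = [
    ImageRecord("a", 0xFFFF0000FFFF0000, date(2024, 1, 10), "products"),
    ImageRecord("b", 0xFFFF0000FFFF0000, date(2024, 3, 5), "products"),
    ImageRecord("c", 0xFFFF0000FFFF0007, date(2024, 3, 9), "products"),
    ImageRecord("d", 0x0000FFFF0000FFFF, date(2024, 4, 1), "press"),
]

exact_only = scan(records, min_score=1.0)   # exact duplicates only
with_near = scan(records, min_score=0.9)    # near-duplicates as well
recent = scan(records, min_score=0.9, uploaded_after=date(2024, 3, 1))
```

Setting `min_score=1.0` restricts the result to exact duplicates, while a lower value also admits near-duplicates. The pairwise loop is quadratic and only meant to show the idea; a real collection would need an index.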
Implement different actions to manage duplicates depending on your use case.
Prevent users from uploading duplicate images by rejecting the upload before indexing.
Associate existing duplicates with each other by automatically linking each one to the other version.
Merge two duplicate documents into one to have all information stored in one place.
Delete detected duplicates that are already part of your collection to clean up your database.
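The four actions above might look like this against a simple in-memory store. This is a sketch under the assumption that documents are plain dictionaries keyed by ID; the function names `accept_upload`, `link`, `merge` and `delete`, and the `duplicates` field, are hypothetical.

```python
def accept_upload(store, doc_id, document, duplicate_of=None):
    """Reject: refuse to index an upload that matched an existing document."""
    if duplicate_of is not None:
        return False
    store[doc_id] = document
    return True


def link(store, first_id, second_id):
    """Link: record on each document that the other one is a duplicate."""
    store[first_id].setdefault("duplicates", []).append(second_id)
    store[second_id].setdefault("duplicates", []).append(first_id)


def merge(store, keep_id, drop_id):
    """Merge: fold the dropped document's fields into the kept one."""
    dropped = store.pop(drop_id)
    for field, value in dropped.items():
        # Existing values on the kept document win; only gaps are filled.
        store[keep_id].setdefault(field, value)


def delete(store, drop_id):
    """Delete: remove a detected duplicate from the collection."""
    store.pop(drop_id, None)


store = {
    "img-1": {"title": "Sunset", "file": "sunset.jpg"},
    "img-2": {"caption": "Taken at the beach", "file": "sunset_small.png"},
    "img-3": {"title": "Sunset (edited)", "file": "sunset_edit.jpg"},
    "img-4": {"title": "Sunset copy", "file": "sunset_copy.jpg"},
}

accepted = accept_upload(store, "img-5", {"file": "sunset.gif"}, duplicate_of="img-1")
link(store, "img-1", "img-3")
merge(store, "img-1", "img-2")
delete(store, "img-4")
```

Which action fits depends on the use case: rejecting suits upload forms, linking keeps both versions reachable, merging preserves information spread across copies, and deleting is the most aggressive cleanup.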