How to remove duplicate images


Thanks for the great contribution.
I want to remove duplicate images from my dataset. How can I do that? The console log prints a warning about duplicate images, but they still get uploaded to the dataset. I don't want to upload the same image twice.

Hi @merved

Glad you enjoy Remo!

Remo should automatically prevent duplicate images from being uploaded to the same dataset. If you try to upload duplicates, it skips them and shows a summary of the skipped files in the notification bell.

Let me know if you still have some issues with this!


Thanks for the quick reply :)
Actually, the problem persists. I checked the number of items I uploaded, and the Remo dataset folder contains the same number of items.
If Remo does not allow duplicate data, the number of items should have gone down.

I didn’t explain the behaviour properly, let me clarify:

  • When adding an image to a dataset, Remo checks if that image is already contained in the dataset

  • It does two checks: one based on filename and one based on content. The results of the upload are shown in the notification bell:

  • If an image has the same filename as one already in the dataset, it gets skipped. The notification looks something like this:
3 Images uploaded
3 Errors:
image_03039.jpg: not added (Duplicated image)
image_03032.jpg: not added (Duplicated image)
image_03041.jpg: not added (Duplicated image)
  • If an image has the same content as one in the dataset but a different filename, it gets added, but we print a warning
1 Images uploaded
1 Error:
image_03032.jpg: not added (Duplicated image)

It sounds like you are uploading images that have different filenames but the same content. Is that the case?

We can look into adding an option (in both the interface and the Python library) that lets you choose whether to skip images that have the same content. Would that be helpful?
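
In the meantime, one possible workaround is to filter out content duplicates yourself before uploading. The snippet below is only a rough sketch and does not use any Remo API: `find_content_duplicates` is a hypothetical helper, and comparing SHA-256 hashes of the file bytes is an assumption on my part, not necessarily how Remo's own content check works. It only catches byte-identical files, so a resized or re-encoded copy of the same picture would not be detected.

```python
import hashlib
from pathlib import Path


def find_content_duplicates(folder, extensions=(".jpg", ".jpeg", ".png")):
    """Split image files under `folder` into unique files and content duplicates.

    Hypothetical helper (not part of Remo). Two files count as duplicates
    only if their bytes are identical, regardless of filename.
    Returns (unique, duplicates), where duplicates is a list of
    (duplicate_path, first_seen_path) pairs.
    """
    seen = {}  # SHA-256 hex digest -> first path seen with that content
    unique, duplicates = [], []
    # Sort so that the file kept for each content is deterministic.
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append((path, seen[digest]))
        else:
            seen[digest] = path
            unique.append(path)
    return unique, duplicates
```

You could then copy only the paths in `unique` to a clean folder and upload that folder instead, or review `duplicates` first to decide which filename to keep.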

Yes, that is the case.

If possible, I would appreciate it. This option would be helpful.

Thank you.

OK, sure! We should be able to add this in the next release. Will keep you posted :)

Thanks for the feedback!
