FastDup Software, (C) copyright 2022 Dr. Amir Alush and Dr. Danny Bickson.
fastdup C++ info received: 2023-05-20 04:46:25 [INFO] Going to loop over dir /tmp/tmpaeboyuub.csv
2023-05-20 04:46:26 [INFO] Found total 10000 images to run on, 10000 train, 0 test, name list 10000, counter 10000
2023-05-20 04:48:59 [ERROR] Error: found invalid bounding box for image coco_minitrain_25k/images/train2017/000000528201.jpg. Please check bounding box file 264 341 0 5
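The error above reports the values `264 341 0 5` for the rejected box. If these are x, y, width, height in that order (an assumption; the log does not label them), the width is zero. A minimal sketch of a pre-run check that flags such rows in an annotation CSV follows. The column names `filename`, `col_x`, `row_y`, `width`, `height` and the sample file names are assumptions for illustration, not taken from this log.

```python
import csv
import io


def find_invalid_boxes(rows):
    """Return rows whose box is unparseable, has a negative origin,
    or has a non-positive width or height."""
    bad = []
    for row in rows:
        try:
            x, y = float(row["col_x"]), float(row["row_y"])
            w, h = float(row["width"]), float(row["height"])
        except (KeyError, TypeError, ValueError):
            # Missing or non-numeric fields count as invalid.
            bad.append(row)
            continue
        if x < 0 or y < 0 or w <= 0 or h <= 0:
            bad.append(row)
    return bad


# Hypothetical annotation data; the second row mirrors the values in the error.
sample = io.StringIO(
    "filename,col_x,row_y,width,height\n"
    "a.jpg,10,20,30,40\n"
    "b.jpg,264,341,0,5\n"
)
invalid = find_invalid_boxes(csv.DictReader(sample))
print([r["filename"] for r in invalid])  # ['b.jpg']
```

Filtering such rows out before the run may avoid the error, though whether zero-width boxes are the only condition the tool rejects is not stated in the log.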
fastdup C++ info received: 2023-05-20 04:50:46 [INFO] Going to loop over dir /tmp/crops_input.csv
2023-05-20 04:50:46 [INFO] Found total 9999 images to run on, 9999 train, 0 test, name list 9999, counter 9999
2023-05-20 04:50:46 [ERROR] Missing file missing_file - file does not exist
[The same "Missing file missing_file - file does not exist" error repeats many times with timestamps from 04:50:46 to 04:50:48; the captured output is truncated mid-message.]
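The repeated errors above run together without line breaks, which makes them hard to read. A minimal sketch of one way to summarize such output is to split on the timestamp and level prefix and count identical messages. The function name `summarize` and the sample text are hypothetical; only the `YYYY-MM-DD HH:MM:SS [LEVEL]` prefix shape is taken from the log.

```python
import re
from collections import Counter

# Matches the "2023-05-20 04:50:46 [ERROR] " prefix seen in the log.
ENTRY = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(\w+)\] ")


def summarize(log_text):
    """Count entries per (level, message), ignoring timestamps."""
    # With two capture groups, split() returns
    # [leading text, ts, level, body, ts, level, body, ...].
    parts = ENTRY.split(log_text)
    counts = Counter()
    for i in range(1, len(parts), 3):
        level, body = parts[i + 1], parts[i + 2].strip()
        counts[(level, body)] += 1
    return counts


# Hypothetical sample with entries fused together, as in the output above.
sample = (
    "2023-05-20 04:50:46 [ERROR] Missing file a - file does not exist"
    "2023-05-20 04:50:47 [ERROR] Missing file a - file does not exist"
    "2023-05-20 04:50:47 [INFO] done\n"
)
for (level, message), n in summarize(sample).items():
    print(f"{n}x [{level}] {message}")
```

Counting by message rather than by line keeps the summary stable even when the logger omits newlines between entries.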
########################################################################################
Dataset Analysis Summary:
Dataset contains 183544 images
Valid images are 4.94% (9,067) of the data, invalid are 95.06% (174,477) of the data
For a detailed analysis, use `.invalid_instances()`.
Similarity: 0.26% (476) belong to 5 similarity clusters (components).
99.74% (183,068) images do not belong to any similarity cluster.
Largest cluster has 1,940 (1.06%) images.
For a detailed analysis, use `.connected_components()`
(similarity threshold used is 0.9, connected component threshold used is 0.96).
Outliers: 0.67% (1,228) of images are possible outliers, and fall in the bottom 5.00% of similarity values.
For a detailed list of outliers, use `.outliers()`.
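The percentages in the summary can be recomputed from the counts it prints. The sketch below uses only the numbers shown above; the variable names and the `pct` helper are illustrative. It also compares the largest-cluster count with the total clustered count as printed, without attempting to explain any difference between them.

```python
# Counts copied from the summary above.
total = 183_544
valid, invalid = 9_067, 174_477
clustered, largest_cluster = 476, 1_940
outliers = 1_228


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


assert valid + invalid == total

print("valid %:", pct(valid, total))
print("invalid %:", pct(invalid, total))
print("clustered %:", pct(clustered, total))
print("largest cluster %:", pct(largest_cluster, total))
print("outliers %:", pct(outliers, total))

# As printed, the largest cluster is bigger than the clustered total.
print("largest cluster <= clustered:", largest_cluster <= clustered)
```

A mismatch flagged by the last line may be worth investigating with `.connected_components()`, since the two figures may be computed over different sets of images.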