#!/usr/bin/env python
# coding: utf-8

# [![image](https://raw.githubusercontent.com/visual-layer/visuallayer/main/imgs/vl_horizontal_logo.png)](https://www.visual-layer.com)

# # Face Detection from Videos
#
# [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/visual-layer/fastdup/blob/main/examples/video-face-detection.ipynb)
# [![Open in Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://kaggle.com/kernels/welcome?src=https://github.com/visual-layer/fastdup/blob/main/examples/video-face-detection.ipynb)
#
# In this tutorial, we will use fastdup with a face detection model to detect and crop faces from videos. We then analyze the cropped faces for issues such as duplicates, near-duplicates, outliers, and bright, dark, or blurry faces.

# ## Installation & Setting Up

# In[ ]:


get_ipython().system('pip install fastdup kaggle -Uq')


# In[1]:


import fastdup
fastdup.__version__


# ## Download & Extract Dataset
# Let's download a TikTok [trending video dataset](https://www.kaggle.com/datasets/erikvdven/tiktok-trending-december-2020) from Kaggle. The dataset consists of the first 1000 trending videos scraped from TikTok in December 2020.
#
# You can download the dataset manually from the dataset [homepage](https://www.kaggle.com/datasets/erikvdven/tiktok-trending-december-2020) or by using the [Kaggle API](https://github.com/Kaggle/kaggle-api).
#
# Let's use the Kaggle API to download the dataset:

# In[ ]:


get_ipython().system('kaggle datasets download -d erikvdven/tiktok-trending-december-2020')


# Unzip the dataset into a folder called `data`.

# In[2]:


get_ipython().system('unzip -q tiktok-trending-december-2020 -d data')


# ## Video to Images
#
# fastdup works on images. We must first turn the videos into frames of images.
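#
# Before extracting frames, it can help to confirm which video files actually ended up in the `data` folder. The snippet below is a minimal sketch using only the standard library. The helper `list_videos` and the extension set `VIDEO_EXTENSIONS` are hypothetical names introduced for illustration, and the extensions are an assumption about the dataset, not something fastdup requires.

```python
from pathlib import Path

# Assumed set of video extensions; adjust to match the files in your dataset.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def list_videos(root, extensions=VIDEO_EXTENSIONS):
    """Return sorted paths of files under `root` with a known video extension."""
    root = Path(root)
    if not root.is_dir():
        # Nothing to list if the folder has not been created yet.
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )


# Count the videos in the folder the dataset was unzipped into.
videos = list_videos("data")
print(f"Found {len(videos)} video file(s) under 'data'")
```

# If the count is zero, check the unzip step and the extension set before moving on.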
#
# We can use a one-liner fastdup utility function to turn all the videos in a folder into frames:

# In[3]:


fastdup.extract_video_frames(input_dir="data", work_dir="frames")


# ## Run fastdup
#
# Now that we have the frames of images, let's run fastdup and analyze the frames.

# In[4]:


fd = fastdup.create(input_dir='frames')


# In[5]:


fd.run(bounding_box='face')


# ## Components Gallery
#
# We can visualize clusters of similar detections using the components gallery view. Specify `draw_bbox=True` to see the detection bounding box on the original image.

# In[6]:


fd.vis.component_gallery(draw_bbox=True)


# If you'd like to view just the cropped bounding box images, specify `draw_bbox=False`:

# In[7]:


fd.vis.component_gallery(draw_bbox=False)


# ## Find Similar Faces Across Videos
#
# Using the `similarity_gallery` view, we can find similar-looking faces (bounding boxes) across all the extracted frames.

# In[8]:


fd.vis.similarity_gallery(draw_bbox=False)


# ## Find Outliers
#
# Using the `outliers_gallery` view, we can also visualize faces (detections) that look visually different from the others.

# In[9]:


fd.vis.outliers_gallery()


# ## Duplicate Faces
#
# With the `duplicates_gallery` view, we can visualize duplicate image pairs across videos.

# In[11]:


fd.vis.duplicates_gallery()


# ## Dark Faces
#
# Using the `stats_gallery` view, we can sort the faces (detections) by a chosen `metric` such as 'dark', 'bright', or 'blur'.
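#
# To build intuition for what sorting by a metric means, here is a minimal sketch in plain Python. It assumes that "dark" corresponds to a low mean pixel intensity; this is an illustration only, not fastdup's implementation. The helpers `mean_brightness` and `rank_darkest` and the toy data are hypothetical.

```python
def mean_brightness(pixels):
    """Mean intensity of a grayscale image given as rows of 0-255 values."""
    values = [v for row in pixels for v in row]
    return sum(values) / len(values)


def rank_darkest(images):
    """Return image names ordered from darkest to brightest by mean intensity."""
    return sorted(images, key=lambda name: mean_brightness(images[name]))


# Toy 2x2 grayscale "crops" standing in for detected faces (hypothetical data).
faces = {
    "face_a": [[200, 210], [190, 205]],
    "face_b": [[10, 20], [15, 5]],
    "face_c": [[100, 120], [110, 90]],
}

print(rank_darkest(faces))  # ['face_b', 'face_c', 'face_a']
```

# Reversing the order would give a "bright" ranking under the same assumption.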
# In[13]:


fd.vis.stats_gallery(metric='dark')


# ## Bright Faces

# In[14]:


fd.vis.stats_gallery(metric='bright')


# ## Blurry Faces

# In[15]:


fd.vis.stats_gallery(metric='blur')


# ## Wrap Up
#
# Next, feel free to check out other tutorials:
#
# + ⚡ [**Quickstart**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/quick-dataset-analysis.ipynb): Learn how to install fastdup, load a dataset, and analyze it for potential issues such as duplicates/near-duplicates, broken images, outliers, and dark/bright/blurry images, and view visually similar image clusters. If you're new, start here!
# + 🧹 [**Clean Image Folder**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/cleaning-image-dataset.ipynb): Learn how to analyze a folder of images for potential issues, clean it, and export a list of problematic files for further action. If you have an unorganized folder of images, this is a good place to start.
# + 🖼 [**Analyze Image Classification Dataset**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-image-classification-dataset.ipynb): Learn how to load a labeled image classification dataset and analyze it for potential issues. If you have a labeled, ImageNet-style folder structure, have a go!
# + 🎁 [**Analyze Object Detection Dataset**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-object-detection-dataset.ipynb): Learn how to load bounding box annotations for object detection and analyze them for potential issues. If you have a COCO-style labeled object detection dataset, give this example a try.
#
#
# ## VL Profiler
# If you prefer a no-code platform to inspect and visualize your dataset, [**try our free cloud product VL Profiler**](https://app.visual-layer.com). VL Profiler is our first no-code commercial product that lets you visualize and inspect your dataset in your browser.
#
# [Sign up](https://app.visual-layer.com) now, it's free.
#
# [![image](https://raw.githubusercontent.com/visual-layer/fastdup/main/gallery/vl_profiler_promo.svg)](https://app.visual-layer.com)
#
# As usual, feedback is welcome!
#
# Questions? Drop by our [Slack channel](https://visualdatabase.slack.com/join/shared_invite/zt-19jaydbjn-lNDEDkgvSI1QwbTXSY6dlA#/shared-invite/email) or open an issue on [GitHub](https://github.com/visual-layer/fastdup/issues).