#!/usr/bin/env python
# coding: utf-8

# [![image](https://raw.githubusercontent.com/visual-layer/visuallayer/main/imgs/vl_horizontal_logo.png)](https://www.visual-layer.com)

# # Face Detection from Videos
# 
# [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/visual-layer/fastdup/blob/main/examples/video-face-detection.ipynb)
# [![Open in Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://kaggle.com/kernels/welcome?src=https://github.com/visual-layer/fastdup/blob/main/examples/video-face-detection.ipynb)
# 
# In this tutorial, we use fastdup with a face detection model to detect and crop faces from videos. We then analyze the cropped faces for issues such as duplicates, near-duplicates, outliers, and bright, dark, or blurry faces.

# ## Installation & Setting Up

# In[ ]:


get_ipython().system('pip install fastdup kaggle -Uq')


# In[1]:


import fastdup
fastdup.__version__


# ## Download & Extract Dataset

# Let's download a TikTok [trending video dataset](https://www.kaggle.com/datasets/erikvdven/tiktok-trending-december-2020) from Kaggle. The dataset consists of the first 1000 trending videos scraped from TikTok in December 2020.
# 
# You can download the dataset manually from the dataset [homepage](https://www.kaggle.com/datasets/erikvdven/tiktok-trending-december-2020) or with the [Kaggle API](https://github.com/Kaggle/kaggle-api). Note that the Kaggle API requires an API token to be configured before it can download anything.
# 
# Let's use the Kaggle API to download the dataset:

# In[ ]:


get_ipython().system('kaggle datasets download -d erikvdven/tiktok-trending-december-2020')


# Unzip the dataset into a folder called `data`.

# In[2]:


get_ipython().system('unzip -q tiktok-trending-december-2020 -d data')
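

# If the `unzip` command is not available (for example on Windows), the archive can also be extracted from Python. The cell below is a minimal sketch using the standard-library `zipfile` module. The helper name `extract_archive` and the archive filename `tiktok-trending-december-2020.zip` are assumptions made for illustration; they are not part of fastdup or the Kaggle API.

# In[ ]:


import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract every member of `zip_path` into `dest_dir` and return the member count."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return len(archive.namelist())


# Assumed archive name: the Kaggle CLI is expected to save "<dataset-slug>.zip".
archive_path = Path("tiktok-trending-december-2020.zip")
if archive_path.exists():
    print(extract_archive(archive_path, "data"), "entries extracted")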


# ## Video to Images
# 
# fastdup works on images, so we must first convert the videos into image frames.
# 
# We can use a one-liner fastdup utility function to turn all the videos in a folder into frames:

# In[3]:


fastdup.extract_video_frames(input_dir="data", work_dir="frames")
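

# Before running fastdup, it can be useful to confirm that frames were actually written. The cell below is a rough sketch that counts image files under `frames`, grouped by folder. The helper name `count_frames` and the list of image extensions are assumptions; the exact folder layout that fastdup produces inside `work_dir` is not documented here, so the function simply walks the whole tree.

# In[ ]:


from collections import Counter
from pathlib import Path

# Assumed frame formats; adjust if your frames use another extension.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_frames(root, extensions=IMAGE_EXTENSIONS):
    """Count image files under `root`, grouped by their parent folder (relative to `root`)."""
    root = Path(root)
    counts = Counter()
    if not root.is_dir():
        return counts
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            counts[str(path.parent.relative_to(root))] += 1
    return counts


frame_counts = count_frames("frames")
print(f"{sum(frame_counts.values())} frames in {len(frame_counts)} folders")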


# ## Run fastdup
# 
# Now that we have the frames of images, let's run fastdup and analyze the frames.

# In[4]:


fd = fastdup.create(input_dir='frames')


# In[5]:


fd.run(bounding_box='face')


# ## Components Gallery
# 
# We can visualize clusters of similar detections using the components gallery view. Specify `draw_bbox=True` to see the detection bounding box on the original image.

# In[6]:


fd.vis.component_gallery(draw_bbox=True)


# If you'd like to view just the cropped bounding box images, specify `draw_bbox=False`.

# In[7]:


fd.vis.component_gallery(draw_bbox=False)
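

# To build some intuition for what a "component" is, the cell below sketches one common way to form clusters: treat each detection as a node, connect pairs that are considered similar, and collect the connected groups with a union-find structure. This is an illustrative sketch only, not fastdup's implementation; the function name and the example pairs are hypothetical.

# In[ ]:


def connected_components(num_items, similar_pairs):
    """Group item indices into components linked by the given (i, j) pairs."""
    parent = list(range(num_items))

    def find(i):
        # Walk up to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in similar_pairs:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    groups = {}
    for i in range(num_items):
        groups.setdefault(find(i), []).append(i)
    # Largest components first.
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical pairs of face crops judged similar: 0-1, 1-2 and 3-4.
print(connected_components(6, [(0, 1), (1, 2), (3, 4)]))  # [[0, 1, 2], [3, 4], [5]]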


# ## Find Similar Faces Across Videos
# 
# Using the `similarity_gallery` view, we can find similar-looking faces (bounding boxes) across all the extracted frames.

# In[8]:


fd.vis.similarity_gallery(draw_bbox=False)
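

# Similarity search of this kind is typically done by comparing feature vectors (embeddings) of the crops. The cell below is a small sketch of that idea using cosine similarity in plain Python. The embeddings are made-up 3-dimensional vectors and the helper names are hypothetical; the sketch does not reproduce how fastdup computes or indexes its features.

# In[ ]:


import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_neighbors(query_index, vectors, k=2):
    """Return the k (index, similarity) pairs most similar to vectors[query_index]."""
    scores = [
        (i, cosine_similarity(vectors[query_index], v))
        for i, v in enumerate(vectors)
        if i != query_index
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical embeddings for four face crops.
embeddings = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
for index, score in nearest_neighbors(0, embeddings):
    print(index, round(score, 3))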


# ## Find Outliers
# 
# Using the `outliers_gallery` view, we can also visualize faces (detections) that look visually different from the others.

# In[9]:


fd.vis.outliers_gallery()
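

# One common way to think about an outlier is as an item whose best match to anything else in the dataset is still weak. The cell below sketches that idea on a small, made-up similarity matrix. The function name and the scores are hypothetical, and this is not necessarily how fastdup scores outliers.

# In[ ]:


def rank_outliers(similarity):
    """Rank items by their best match to any other item, least similar first."""
    ranked = []
    for i, row in enumerate(similarity):
        # Ignore the diagonal (an item compared with itself).
        best = max((score for j, score in enumerate(row) if j != i), default=0.0)
        ranked.append((i, best))
    return sorted(ranked, key=lambda item: item[1])


# Hypothetical symmetric similarity matrix for four face crops.
similarity = [
    [1.00, 0.95, 0.90, 0.20],
    [0.95, 1.00, 0.85, 0.25],
    [0.90, 0.85, 1.00, 0.30],
    [0.20, 0.25, 0.30, 1.00],
]
print(rank_outliers(similarity)[0])  # (3, 0.3): crop 3 has no close match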


# ## Duplicate Faces
# 
# With the `duplicates_gallery` view, we can visualize duplicate face pairs across videos.

# In[11]:


fd.vis.duplicates_gallery()
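

# Duplicate detection can be pictured as keeping only the pairs whose similarity score is very close to 1. The cell below is a minimal sketch of that filtering step. The function name, the scores, and the `0.99` cut-off are all assumptions chosen for illustration and do not reflect fastdup's internal settings.

# In[ ]:


def duplicate_pairs(similarity, threshold=0.99):
    """Return (i, j, score) for every pair i < j whose score reaches `threshold`."""
    pairs = []
    for i, row in enumerate(similarity):
        for j in range(i + 1, len(row)):
            if row[j] >= threshold:
                pairs.append((i, j, row[j]))
    # Most similar pairs first.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical similarity scores for three face crops.
scores = [
    [1.0, 0.995, 0.4],
    [0.995, 1.0, 0.5],
    [0.4, 0.5, 1.0],
]
print(duplicate_pairs(scores))  # [(0, 1, 0.995)]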


# ## Dark Faces
# 
# Using the `stats_gallery` view, we can sort the faces (detections) by a chosen `metric`, such as `'dark'`, `'bright'`, or `'blur'`.

# In[13]:


fd.vis.stats_gallery(metric='dark')
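

# As a rough illustration of what a darkness or brightness ranking involves, the cell below computes the mean pixel intensity of tiny grayscale images and sorts them darkest first. The images are made-up nested lists and `mean_brightness` is a hypothetical helper; fastdup's own statistics may be computed differently.

# In[ ]:


def mean_brightness(gray):
    """Average pixel value of a grayscale image given as rows of 0-255 integers."""
    pixels = [value for row in gray for value in row]
    return sum(pixels) / len(pixels)


# Hypothetical 2x3 grayscale crops: one mostly dark, one mostly bright.
crops = {
    "bright_face": [[240, 250, 245], [230, 255, 235]],
    "dark_face": [[10, 20, 15], [5, 30, 10]],
}
# Sort darkest first; reverse the order to list the brightest first.
for name in sorted(crops, key=lambda n: mean_brightness(crops[n])):
    print(name, mean_brightness(crops[name]))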


# ## Bright Faces

# In[14]:


fd.vis.stats_gallery(metric='bright')


# ## Blurry Faces

# In[15]:


fd.vis.stats_gallery(metric='blur')
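

# A widely used heuristic for blur is the variance of the Laplacian: sharp images have strong edges and therefore a high variance, while blurry images score low. The cell below is a plain-Python sketch of that heuristic on tiny made-up images. It is offered as background only, under the assumption that a blur metric behaves similarly; it is not taken from fastdup's code.

# In[ ]:


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels; low values suggest blur."""
    responses = []
    for y in range(1, len(gray) - 1):
        for x in range(1, len(gray[0]) - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


# Hypothetical 4x4 crops: a flat patch (no edges) and a checkerboard (sharp edges).
flat = [[100] * 4 for _ in range(4)]
checker = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]
print(laplacian_variance(flat), laplacian_variance(checker))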


# ## Wrap Up
# 
# Next, feel free to check out our other tutorials:
# 
# + ⚡ [**Quickstart**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/quick-dataset-analysis.ipynb): Learn how to install fastdup, load a dataset and analyze it for potential issues such as duplicates/near-duplicates, broken images, outliers, dark/bright/blurry images, and view visually similar image clusters. If you're new, start here!
# + 🧹 [**Clean Image Folder**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/cleaning-image-dataset.ipynb): Learn how to analyze and clean a folder of images from potential issues and export a list of problematic files for further action. If you have an unorganized folder of images, this is a good place to start.
# + 🖼 [**Analyze Image Classification Dataset**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-image-classification-dataset.ipynb): Learn how to load a labeled image classification dataset and analyze it for potential issues. If you have a labeled ImageNet-style folder structure, have a go!
# + 🎁 [**Analyze Object Detection Dataset**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-object-detection-dataset.ipynb): Learn how to load bounding box annotations for object detection and analyze for potential issues. If you have a COCO-style labeled object detection dataset, give this example a try. 
# 

# 
# ## VL Profiler
# If you prefer a no-code platform to inspect and visualize your dataset, [**try our free cloud product VL Profiler**](https://app.visual-layer.com). VL Profiler is our first no-code commercial product, and it lets you visualize and inspect your dataset in your browser.
# 
# [Sign up](https://app.visual-layer.com) now, it's free.
# 
# [![image](https://raw.githubusercontent.com/visual-layer/fastdup/main/gallery/vl_profiler_promo.svg)](https://app.visual-layer.com)
# 
# As usual, feedback is welcome! 
# 
# Questions? Drop by our [Slack channel](https://visualdatabase.slack.com/join/shared_invite/zt-19jaydbjn-lNDEDkgvSI1QwbTXSY6dlA#/shared-invite/email) or open an issue on [GitHub](https://github.com/visual-layer/fastdup/issues).