DSpace Repository

Studying deep learning models for manipulated face detection

dc.contributor.author Hüseynli, İlkin
dc.date.accessioned 2022-08-09T12:25:50Z
dc.date.available 2022-08-09T12:25:50Z
dc.date.issued 2021
dc.identifier.uri http://dspace.yildiz.edu.tr/xmlui/handle/1/12968
dc.description Thesis (Master's) - Yıldız Technical University, Graduate School of Natural and Applied Sciences, 2021 en_US
dc.description.abstract Deepfakes allow users to manipulate the identity of a person in a video or an image. Previously, special hardware and skill were required to create such fake videos and images. With improvements in GAN-based techniques, however, generating realistic, hard-to-detect manipulated faces has become easier. This threatens individuals and decreases trust in social media platforms. In this work, our goal is to report the learning ability of eight different models on DFDC, by far the largest fake face dataset, and to test the generalization ability of these models with Celeb-DF-v2. Because the training dataset consists of high-quality videos, we started by detecting and extracting faces from them. Next, we sampled the data to obtain balanced classes and an amount of data that is feasible to train on with limited resources. We started training with no extra augmentation because the dataset was big enough and the faces were already modified. Next, we added our default augmentation chain, inspired by other works, and then increased its strength with Coarse Dropout and Grid Mask augmentations. To evaluate the results, we used a separate test set from the DFDC dataset, which contains unseen augmentations and distractors, and the completely different Celeb-DF-v2 dataset. Unlike for the training set, we followed a different face extraction flow for the test sets. We tracked faces using simple Intersection over Union (IoU) and sampled only faces that were tracked over a certain number of consecutive frames. For each video in the test set, the confidences of the sampled faces were averaged to produce a single confidence value. We used these confidence values to calculate video-based log loss. For the Celeb-DF-v2 dataset, we also calculated Sensitivity and Specificity. For these metrics, the optimal threshold was chosen using the Equal Error Rate. We concluded that, despite its relatively small input size, the EfficientNet-B4 model has the best learning and generalization ability. Training models with half precision may speed up training by up to 2 times with very little loss. Finally, Coarse Dropout helped the models generalize better. en_US
dc.language.iso en en_US
dc.subject Digital video forensics en_US
dc.subject Face manipulation en_US
dc.subject Deepfake en_US
dc.subject Face swap en_US
dc.title Studying deep learning models for manipulated face detection en_US
dc.type Thesis en_US
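
The evaluation steps named in the abstract (IoU between face boxes, averaging face confidences per video, video-based log loss, and an Equal Error Rate threshold for Sensitivity and Specificity) can be illustrated with a minimal sketch. This is not the thesis code: all function names, the (x1, y1, x2, y2) box format, the clipping constant, and the convention that label 1 means fake are assumptions made here for illustration, and the tracking logic itself is not shown.

```python
import math

def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2) (assumed format)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def video_confidence(face_scores):
    """Average the per-face fake probabilities into one video-level value."""
    return sum(face_scores) / len(face_scores)

def log_loss(labels, probs, eps=1e-7):
    """Binary cross-entropy over videos; labels are 1 (fake) or 0 (real)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0); eps is an assumption
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

def rates(labels, probs, threshold):
    """Return (sensitivity, specificity), treating p >= threshold as fake.

    Assumes both classes are present in labels.
    """
    tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= threshold)
    tn = sum(1 for y, p in zip(labels, probs) if y == 0 and p < threshold)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, tn / neg

def eer_threshold(labels, probs):
    """Pick the observed score at which false positive and false negative rates are closest."""
    best_t, best_gap = None, float("inf")
    for t in sorted(set(probs)):
        sens, spec = rates(labels, probs, t)
        gap = abs((1.0 - spec) - (1.0 - sens))
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t
```

In use, each video's face scores would first be reduced with `video_confidence`; the resulting per-video values then feed both `log_loss` and `eer_threshold`, and `rates` evaluated at the returned threshold gives the Sensitivity and Specificity pair.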
