Please use this identifier to cite or link to this item: http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/1978
Full metadata record
DC Field | Value | Language
dc.contributor.author | Chugh, K. | -
dc.contributor.author | Dhall, A. | -
dc.contributor.author | Gupta, P. | -
dc.contributor.author | Subramanian, R. | -
dc.date.accessioned | 2021-07-03T11:38:57Z | -
dc.date.available | 2021-07-03T11:38:57Z | -
dc.date.issued | 2021-07-03 | -
dc.identifier.uri | http://localhost:8080/xmlui/handle/123456789/1978 | -
dc.description.abstract | We propose detecting deepfake videos based on the dissimilarity between the audio and visual modalities, termed the Modality Dissonance Score (MDS). We hypothesize that manipulating either modality leads to disharmony between the two, e.g., loss of lip-sync and unnatural facial and lip movements. MDS is computed as an aggregate of dissimilarity scores between audio and visual segments in a video. Discriminative features are learnt for the audio and visual channels in a chunk-wise manner, employing a cross-entropy loss for the individual modalities and a contrastive loss that models inter-modality similarity. Extensive experiments on the DFDC and DeepFake-TIMIT datasets show that our approach outperforms the state-of-the-art by up to 7%. We also demonstrate temporal forgery localization, showing how our technique identifies the manipulated video segments. (An illustrative sketch of the MDS computation follows this record.) | en_US
dc.language.iso | en_US | en_US
dc.subject | Deepfake detection and localization | en_US
dc.subject | Neural networks | en_US
dc.subject | Modality dissonance | en_US
dc.subject | Contrastive loss | en_US
dc.title | Not made for each other – Audio-Visual Dissonance-based deepfake detection and localization | en_US
dc.type | Article | en_US
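
The abstract above describes MDS as an aggregate of chunk-wise audio-visual dissimilarities, learned with a contrastive loss. The following is a minimal sketch of that computation, not the authors' released code: the embedding sub-networks are replaced by placeholder tensors, and the margin, the mean aggregation, the label convention, and the decision threshold are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def chunk_dissimilarity(audio_emb, visual_emb):
    # Euclidean distance between per-chunk audio and visual embeddings.
    # Shapes: (num_chunks, feat_dim); the sub-networks producing these
    # embeddings are assumed and not reproduced here.
    return F.pairwise_distance(audio_emb, visual_emb)  # -> (num_chunks,)

def contrastive_loss(dist, is_real, margin=1.0):
    # Standard contrastive loss over chunk-wise audio-visual distances:
    # pull the modalities together for real videos, push them at least
    # `margin` apart for fakes. margin=1.0 is an assumed value.
    real_term = is_real * dist.pow(2)
    fake_term = (1.0 - is_real) * F.relu(margin - dist).pow(2)
    return (real_term + fake_term).mean()

def modality_dissonance_score(audio_emb, visual_emb):
    # MDS as an aggregate (here, the mean) of chunk-wise dissimilarities;
    # a video whose MDS exceeds a validation-chosen threshold is flagged fake.
    return chunk_dissimilarity(audio_emb, visual_emb).mean()

# Toy usage with random tensors standing in for learned embeddings.
audio = torch.randn(10, 128)   # 10 chunks, 128-d audio features
visual = torch.randn(10, 128)  # 10 chunks, 128-d visual features
label = torch.ones(10)         # 1 = real, 0 = fake (assumed convention)
loss = contrastive_loss(chunk_dissimilarity(audio, visual), label)
mds = modality_dissonance_score(audio, visual)
```

Because the dissimilarity is computed per chunk, the same per-chunk distances can be inspected to localize which temporal segments were manipulated, mirroring the forgery-localization result mentioned in the abstract.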
Appears in Collections: Year-2020

Files in This Item:
File | Description | Size | Format
Fulltext.pdf | - | 5.79 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.