Deep Cross Modal Learning for Caricature Verification and Identification (CaVINet)

Garg, J.; Tolani, H.; Peri, S.V.; Krishnan, N.C.

Please use this identifier to cite or link to this item: http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/1103

Title:	Deep Cross Modal Learning for Caricature Verification and Identification (CaVINet)
Authors:	Garg, J.; Peri, S.V.; Tolani, H.; Krishnan, N.C.
Keywords:	Cross-modal recognition, Caricature verification and recognition, Deep learning
Issue Date:	28-Nov-2018
Abstract:	Learning from different modalities is a challenging task. In this paper, we address cross-modal face verification and recognition between the caricature and visual image modalities. Caricatures exaggerate a person's facial features. Because caricatures vary so widely, building vision models that recognize and verify data from this modality is extremely challenging. Visual images, which contain significantly less distortion, can act as a bridge for analyzing the caricature modality. We introduce a large, publicly available Caricature-VIsual dataset [CaVI] with images from both modalities that captures the rich variations in the caricatures of an identity. This paper presents the first cross-modal architecture that handles the extreme distortions of caricatures, using a deep learning network that learns similar representations across the modalities. We use two convolutional networks along with transformations that are subject to orthogonality constraints to capture the shared and modality-specific representations. In contrast to prior research, our approach depends neither on manually extracted facial landmarks for learning the representations nor on the identity of the person for performing verification. The learned shared representation achieves 91% accuracy for verifying unseen images and 75% accuracy on unseen identities. Further, recognizing the identity in an image by knowledge transfer, using a combination of shared and modality-specific representations, results in an unprecedented 85% rank-1 accuracy for caricatures and 95% rank-1 accuracy for visual images.
URI:	http://localhost:8080/xmlui/handle/123456789/1103
Appears in Collections:	Year-2018
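
The abstract mentions transformations subject to orthogonality constraints that separate shared from modality-specific representations, but this record does not give the exact formulation. The sketch below illustrates one common way such a constraint is expressed: as the squared Frobenius norm of the product between the shared and modality-specific feature matrices over a batch. The function name `orthogonality_penalty` and the batch layout (one row per sample) are assumptions made for illustration, not the authors' implementation.

```python
def orthogonality_penalty(shared, specific):
    """Squared Frobenius norm of shared^T @ specific.

    Hypothetical helper (not from the paper). `shared` and `specific`
    are lists of per-sample feature vectors (one row per sample).
    The penalty is zero when every shared feature column is orthogonal
    to every modality-specific feature column across the batch.
    """
    if len(shared) != len(specific):
        raise ValueError("both batches must contain the same number of samples")
    penalty = 0.0
    # zip(*rows) iterates over feature columns of a row-major batch.
    for shared_col in zip(*shared):
        for specific_col in zip(*specific):
            dot = sum(s * p for s, p in zip(shared_col, specific_col))
            penalty += dot * dot
    return penalty


# Orthogonal feature columns incur no penalty.
print(orthogonality_penalty([[1.0], [0.0]], [[0.0], [1.0]]))  # 0.0
# Identical feature columns are penalized.
print(orthogonality_penalty([[1.0], [0.0]], [[1.0], [0.0]]))  # 1.0
```

In a training setup, a term of this kind would typically be added to the verification or identification loss with a weighting factor, pushing the two representations to encode different information.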

Files in This Item:

File	Description	Size	Format
Full Text.pdf		5.87 MB	Adobe PDF
