Title: Deep generative architectures for image inpainting
Authors: Phutke, S.
Keywords: Feature Aggregation
Spatial Projections
Multi-head Attention
Diverse Receptive Fields
Image Inpainting
Blind Image Inpainting
Issue Date: 19-Dec-2022
Abstract: Image inpainting is a reconstruction method in which a corrupted image containing holes is filled with the most relevant content from the valid region of the same image. With the advancements in image editing applications, image inpainting is gaining more attention due to its ability to recover corrupted images efficiently. It also has a wide variety of applications, such as reconstruction of corrupted images, occlusion removal, reflection removal, etc. Existing approaches achieve superior performance with coarse-to-fine, single-stage, progressive, and recurrent architectures, at the cost of either perceptual quality (blurry results, spatial inconsistencies) or computational complexity. Moreover, the performance of existing methods degrades when images with large missing regions are considered. To mitigate these limitations, in this work, we propose deep generative architectures for image inpainting. First, we propose coarse-to-fine architectures for inpainting images with varying corrupted regions, with improved performance compared to state-of-the-art methods. The three proposed coarse-to-fine solutions consist of: (a) a spatial projection layer to focus on spatial consistency in the inpainted image, (b) encoder-level feature aggregation followed by a multi-scale and multi-receptive feature sharing decoder, and (c) a nested deformable multi-head attention layer to effectively merge the encoder-decoder features. Further, to reduce the computational complexity, we propose single-stage architectures with three solutions: (a) correlated multi-resolution feature fusion, (b) diverse receptive fields based feature learning, and (c) pseudo-decoder guided reconstruction for image inpainting. The proposed single-stage architectures have lower computational complexity than the earlier coarse-to-fine ones and state-of-the-art methods for image inpainting.
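The encoder-decoder feature merging mentioned in solution (c) builds on multi-head attention. As a rough illustration only, the sketch below shows generic scaled dot-product multi-head attention in NumPy, with decoder features acting as queries over encoder features as keys/values; the weights are random stand-ins for learned projections, and the thesis's nested deformable variant adds further machinery not shown here:

```python
import numpy as np

def multi_head_attention(q_feat, kv_feat, num_heads=4, seed=0):
    """Generic scaled dot-product multi-head attention (toy sketch).

    q_feat:  (n_q, d)  decoder features used as queries
    kv_feat: (n_kv, d) encoder features used as keys/values
    """
    n_q, d = q_feat.shape
    assert d % num_heads == 0
    dh = d // num_heads
    rng = np.random.default_rng(seed)
    # Random projection matrices: stand-ins for learned weights.
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))
    # Project, then split the channel dim into heads: (heads, tokens, dh).
    Q = (q_feat @ Wq).reshape(n_q, num_heads, dh).transpose(1, 0, 2)
    K = (kv_feat @ Wk).reshape(-1, num_heads, dh).transpose(1, 0, 2)
    V = (kv_feat @ Wv).reshape(-1, num_heads, dh).transpose(1, 0, 2)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(dh)       # (heads, n_q, n_kv)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)             # softmax over n_kv
    out = (weights @ V).transpose(1, 0, 2).reshape(n_q, d)
    return out @ Wo

dec = np.ones((16, 32))   # toy decoder feature tokens
enc = np.ones((64, 32))   # toy encoder feature tokens
merged = multi_head_attention(dec, enc)
print(merged.shape)       # (16, 32): one merged vector per decoder token
```

The point of the interface is that attention lets every decoder location aggregate information from all encoder locations, which is why it is suited to filling large holes from distant valid context.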
The performance of these proposed architectures is validated in terms of qualitative results, quantitative results, and computational complexity, in comparison with each other and with existing methods for image inpainting. Furthermore, to reduce the mask dependency of the proposed and existing approaches, we propose two novel blind image inpainting approaches consisting of (a) a wavelet query multi-head attention transformer with omni-dimensional gated attention, and (b) high receptive field (multi-kernel) multi-head attention with a novel high-frequency offset deformable feature merging module. These proposed approaches are compared qualitatively and quantitatively with existing state-of-the-art methods for blind image inpainting. To validate the performance of the proposed architectures, the experimental analysis is done on different datasets: CelebA-HQ, FFHQ, Paris Street View, Places2, and ImageNet.
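The defining property of blind inpainting is that no mask is supplied: the model must locate the corruption from the image alone. The toy sketch below illustrates that interface (not the thesis's method): corrupted pixels are detected by a sentinel value, a hypothetical stand-in for learned corruption detection, and filled by iterative neighbour averaging, a stand-in for the generative reconstruction:

```python
import numpy as np

def blind_inpaint(img, corrupt_value=0.0, iters=50):
    """Toy blind inpainting: the mask is NOT an input; it is inferred
    from the image itself (here via a sentinel value), then the hole is
    filled by diffusing valid content inward with neighbour averaging."""
    mask = img == corrupt_value        # inferred, not supplied by the user
    out = img.copy()
    out[mask] = img[~mask].mean()      # coarse initialisation from valid region
    for _ in range(iters):             # iteratively propagate valid content
        padded = np.pad(out, 1, mode="edge")
        avg = (padded[:-2, 1:-1] + padded[2:, 1:-1]
               + padded[1:-1, :-2] + padded[1:-1, 2:]) / 4
        out[mask] = avg[mask]          # only the detected hole is updated
    return out

img = np.full((8, 8), 0.5)
img[3:5, 3:5] = 0.0                    # corrupted hole, location unknown to the model
restored = blind_inpaint(img)
print(np.allclose(restored, 0.5, atol=1e-3))  # True: hole recovered without a mask
```

Mask-dependent methods would instead take `inpaint(img, mask)`; removing the mask argument is what makes the setting "blind" and is what the two proposed transformer-based approaches address with learned components.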
URI: http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/4770
Appears in Collections: Year-2022

Files in This Item:
File: Full_text.pdf.pdf
Size: 111.73 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.