INSTITUTIONAL DIGITAL REPOSITORY

Deep generative architectures for image inpainting

dc.contributor.author Phutke, S.
dc.date.accessioned 2025-09-09T09:44:57Z
dc.date.available 2025-09-09T09:44:57Z
dc.date.issued 2022-12-19
dc.identifier.uri http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/4770
dc.description.abstract Image inpainting is a reconstruction method in which a corrupted image containing holes is filled with the most relevant content from the valid region of the same image. With the advancement of image editing applications, image inpainting is gaining attention due to its ability to recover corrupted images efficiently. It also has a wide variety of applications, such as reconstruction of corrupted images, occlusion removal, and reflection removal. Existing approaches achieve superior performance with coarse-to-fine, single-stage, progressive, and recurrent architectures, but compromise either the perceptual quality of the results (blur, spatial inconsistencies) or computational complexity. The performance of existing methods also degrades when images with large missing regions are considered. To mitigate these limitations, in this work we propose deep generative architectures for image inpainting. First, we propose coarse-to-fine architectures for inpainting images with varying corrupted regions, with improved performance compared to state-of-the-art methods. The three proposed coarse-to-fine solutions consist of: (a) a spatial projection layer that focuses on spatial consistency in the inpainted image, (b) encoder-level feature aggregation followed by a multi-scale, multi-receptive feature-sharing decoder, and (c) a nested deformable multi-head attention layer to effectively merge the encoder-decoder features. Further, to reduce computational complexity, we propose single-stage architectures with three solutions: (a) correlated multi-resolution feature fusion, (b) feature learning based on diverse receptive fields, and (c) pseudo-decoder-guided reconstruction for image inpainting. The proposed single-stage architectures have lower computational complexity than the earlier coarse-to-fine ones and state-of-the-art image inpainting methods. The performance of these architectures is validated qualitatively, quantitatively, and in terms of computational complexity, both against each other and against existing image inpainting methods. Furthermore, to remove the mask dependency of the proposed and existing approaches, we propose two novel blind image inpainting approaches consisting of (a) a wavelet query multi-head attention transformer with omni-dimensional gated attention, and (b) high-receptive-field (multi-kernel) multi-head attention with a novel high-frequency offset deformable feature merging module. These approaches are compared qualitatively and quantitatively with existing state-of-the-art methods for blind image inpainting. To validate the performance of the proposed architectures, experimental analysis is carried out on different datasets: CelebA-HQ, FFHQ, Paris Street View, Places2, and ImageNet. en_US
dc.language.iso en_US en_US
dc.subject Feature Aggregation en_US
dc.subject Spatial Projections en_US
dc.subject Multi-head Attention en_US
dc.subject Diverse Receptive Fields en_US
dc.subject Image Inpainting en_US
dc.subject Blind Image Inpainting en_US
dc.title Deep generative architectures for image inpainting en_US
dc.type Thesis en_US
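
The abstract above is purely descriptive and carries no implementation detail. As a rough, non-authoritative orientation, the PyTorch sketch below illustrates the generic coarse-to-fine inpainting pattern the thesis builds on: a coarse stage predicts a rough fill for the masked region, the result is composited with the valid pixels, and a refinement stage sharpens it. All layer choices, channel widths, and names here are illustrative assumptions; the thesis' actual modules (spatial projection layer, nested deformable multi-head attention, etc.) are not reproduced.

import torch
import torch.nn as nn

class CoarseToFineInpainter(nn.Module):
    # Generic two-stage inpainting skeleton (illustrative only; plain
    # convolutions stand in for the thesis' specialized modules).
    def __init__(self, channels: int = 32):
        super().__init__()
        # 4 input channels: corrupted RGB image + binary hole mask.
        self.coarse = nn.Sequential(
            nn.Conv2d(4, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )
        self.refine = nn.Sequential(
            nn.Conv2d(4, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, image: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # mask is 1 inside the hole, 0 in the valid region.
        corrupted = image * (1.0 - mask)
        coarse_out = self.coarse(torch.cat([corrupted, mask], dim=1))
        # Composite: keep valid pixels, use the coarse prediction in holes.
        coarse_comp = corrupted + coarse_out * mask
        refined = self.refine(torch.cat([coarse_comp, mask], dim=1))
        return corrupted + refined * mask

# Usage sketch: a 256x256 RGB image with a square hole in the center.
net = CoarseToFineInpainter()
img = torch.rand(1, 3, 256, 256)
msk = torch.zeros(1, 1, 256, 256)
msk[..., 96:160, 96:160] = 1.0
out = net(img, msk)  # -> torch.Size([1, 3, 256, 256])

A blind-inpainting variant, as described in the abstract, would drop the mask input entirely and let the network locate the corrupted regions itself; the single-stage variants would collapse the two stages into one network.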

