INSTITUTIONAL DIGITAL REPOSITORY

Deep generative architectures for image inpainting

dc.contributor.author Phutke, S.
dc.date.accessioned 2025-09-09T09:44:57Z
dc.date.available 2025-09-09T09:44:57Z
dc.date.issued 2022-12-19
dc.identifier.uri http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/4770
dc.description.abstract Image inpainting is a reconstruction method in which a corrupted image containing holes is filled with the most relevant content from the valid region of the same image. With the advancement of image editing applications, image inpainting is gaining attention due to its ability to recover corrupted images efficiently. It also has a wide variety of applications, such as reconstruction of corrupted images, occlusion removal, and reflection removal. Existing approaches achieve superior performance with coarse-to-fine, single-stage, progressive, and recurrent architectures, but compromise either the perceptual quality of the results (blur, spatial inconsistencies) or computational complexity. The performance of existing methods also degrades when images with large missing regions are considered. To mitigate these limitations, in this work we propose deep generative architectures for image inpainting. First, we propose coarse-to-fine architectures for inpainting images with varying corrupted regions, with improved performance compared to state-of-the-art methods. The three proposed coarse-to-fine solutions consist of: (a) a spatial projection layer that focuses on spatial consistency in the inpainted image, (b) encoder-level feature aggregation followed by a multi-scale, multi-receptive feature-sharing decoder, and (c) a nested deformable multi-head attention layer to effectively merge the encoder-decoder features. Further, to reduce computational complexity, we propose single-stage architectures with three solutions: (a) correlated multi-resolution feature fusion, (b) feature learning based on diverse receptive fields, and (c) pseudo-decoder-guided reconstruction for image inpainting. The proposed single-stage architectures have lower computational complexity than the earlier coarse-to-fine ones and state-of-the-art image inpainting methods. The performance of these architectures is validated qualitatively, quantitatively, and in terms of computational complexity, both against each other and against existing image inpainting methods. Furthermore, to remove the mask dependency of the proposed and existing approaches, we propose two novel blind image inpainting approaches consisting of (a) a wavelet query multi-head attention transformer with omni-dimensional gated attention, and (b) high-receptive-field (multi-kernel) multi-head attention with a novel high-frequency offset deformable feature merging module. These approaches are compared qualitatively and quantitatively with existing state-of-the-art methods for blind image inpainting. To validate the performance of the proposed architectures, experimental analysis is carried out on different datasets: CelebA-HQ, FFHQ, Paris Street View, Places2, and ImageNet. en_US
dc.language.iso en_US en_US
dc.subject Feature Aggregation en_US
dc.subject Spatial Projections en_US
dc.subject Multi-head Attention en_US
dc.subject Diverse Receptive Fields en_US
dc.subject Image Inpainting en_US
dc.subject Blind Image Inpainting en_US
dc.title Deep generative architectures for image inpainting en_US
dc.type Thesis en_US
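
The abstract above is purely descriptive and carries no implementation detail. As a rough, non-authoritative orientation, the PyTorch sketch below illustrates the generic coarse-to-fine inpainting pattern the thesis builds on: a coarse stage predicts a rough fill for the masked region, the result is composited with the valid pixels, and a refinement stage sharpens it. All layer choices, channel widths, and names here are illustrative assumptions; the thesis' actual modules (spatial projection layer, nested deformable multi-head attention, etc.) are not reproduced.

import torch
import torch.nn as nn

class CoarseToFineInpainter(nn.Module):
    # Generic two-stage inpainting skeleton (illustrative only; plain
    # convolutions stand in for the thesis' specialized modules).
    def __init__(self, channels: int = 32):
        super().__init__()
        # 4 input channels: corrupted RGB image + binary hole mask.
        self.coarse = nn.Sequential(
            nn.Conv2d(4, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )
        self.refine = nn.Sequential(
            nn.Conv2d(4, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, image: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # mask is 1 inside the hole, 0 in the valid region.
        corrupted = image * (1.0 - mask)
        coarse_out = self.coarse(torch.cat([corrupted, mask], dim=1))
        # Composite: keep valid pixels, use the coarse prediction in holes.
        coarse_comp = corrupted + coarse_out * mask
        refined = self.refine(torch.cat([coarse_comp, mask], dim=1))
        return corrupted + refined * mask

# Usage sketch: a 256x256 RGB image with a square hole in the center.
net = CoarseToFineInpainter()
img = torch.rand(1, 3, 256, 256)
msk = torch.zeros(1, 1, 256, 256)
msk[..., 96:160, 96:160] = 1.0
out = net(img, msk)  # -> torch.Size([1, 3, 256, 256])

A blind-inpainting variant, as described in the abstract, would drop the mask input entirely and let the network locate the corrupted regions itself; the single-stage variants would collapse the two stages into one network.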

