Abstract:
Image inpainting is currently in high demand owing to its wide range of applications, such as the reconstruction
of corrupted images, occlusion removal, and reflection removal. Existing image inpainting approaches
utilize different types of attention mechanisms to inpaint the image and produce visually pleasing results. These methods are mainly concerned with weighting the feature maps of the hole region using
weights derived from the non-hole region. However, due to the lack of spatial contextual correlation in the attention maps, the inpainted image may suffer from inconsistencies between the hole and non-hole regions.
Transformer-based inpainting methods achieve strong results by capturing the relationships between
patches, but at the cost of high computational complexity. In this context, we propose a novel spatial projection layer (SPL), which uses no attention mechanism, to project spatial contextual information
from the non-hole regions into the hole region and thereby produce a spatially plausible inpainted image. The SPL
is designed mainly to focus on the non-hole spatial information in the high-level feature maps so that the hole regions are filled efficiently. In addition, we propose training the network with an edge loss based on
the Canny edge operator, so that inpainting focuses on the relevant edges rather than on noise.
Extensive experiments, an ablation study, and a user study on the proposed architecture demonstrate its superiority over existing state-of-the-art methods for image inpainting.
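
To make the idea of an edge loss concrete, the following is a minimal sketch and not the formulation used in this work. It assumes grayscale images with values in [0, 1], and it swaps in a thresholded Sobel gradient magnitude as a simplified stand-in for the Canny operator (no smoothing, non-maximum suppression, or hysteresis). The names `sobel_edges`, `edge_loss`, and the threshold value are hypothetical. Because the edge map is thresholded, the sketch is not differentiable as written and only illustrates how edge maps of the prediction and the ground truth could be compared.

```python
import numpy as np


def sobel_edges(img, thresh=1.0):
    """Binary edge map from the Sobel gradient magnitude.

    Simplified stand-in for a Canny edge detector; `thresh` is an
    assumed absolute threshold for images scaled to [0, 1].
    """
    img = np.asarray(img, dtype=np.float64)
    p = np.pad(img, 1, mode="edge")  # replicate borders to keep the shape
    # Horizontal gradient: right column minus left column, weights 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical gradient: bottom row minus top row, weights 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return (np.hypot(gx, gy) > thresh).astype(np.float64)


def edge_loss(pred, target, thresh=1.0):
    """Mean absolute difference between the two edge maps."""
    return float(np.mean(np.abs(sobel_edges(pred, thresh)
                                - sobel_edges(target, thresh))))


# Toy example: the ground truth has one vertical step edge.
target = np.zeros((8, 8))
target[:, 4:] = 1.0
flat_pred = np.zeros((8, 8))  # a prediction that misses the edge entirely

loss_perfect = edge_loss(target, target)
loss_flat = edge_loss(flat_pred, target)
```

In this toy case a prediction identical to the ground truth gives zero loss, while the flat prediction is penalized only at the pixels around the missed edge, which is the behaviour an edge-focused loss term is meant to encourage.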