dc.description.abstract |
Digital images, an extension of human memory, are among the most important information
carriers for human activities. They play a critical role in many day-to-day applications,
from online social networking to commercial advertising to medical imaging. Owing to
constraints on the physical characteristics of the digital sensor, e.g. its size and pixel
density, the resolution of the captured image is limited. In many cases, this limited
resolution becomes a barrier to fast and accurate analysis. Thus, it is highly desirable
to overcome the resolution limitation and acquire high-resolution (HR) digital images.
One of the most promising approaches is to use signal processing techniques to obtain an
HR image from a low-resolution (LR) image; this resolution enhancement approach is called
super-resolution (SR). The major advantage of this software approach is that it costs
much less than upgrading the hardware of existing camera systems. Over the past decades,
many researchers have developed various algorithms to improve the quality of reconstructed
images. More critically, low-resolution images have fewer pixels representing an object
of interest, which makes it difficult to recover fine details. SR aims to solve this
problem: a given LR image is upscaled to retrieve an image with higher resolution and
more discernible details, which can then be employed in downstream tasks such as face
recognition and object classification. The common goal of these techniques is to provide
finer details than the given LR image by increasing the number of pixels per unit of
space. Additionally, in comparison to DSLR cameras, portable devices generally produce
low-quality images on account of their physical limitations. These low-quality images
usually suffer from multiple degradations: low resolution owing to small camera sensors,
mosaic patterns owing to the camera filter array, and sub-pixel shifts owing to camera
motion. These degradations generally hamper the performance of single-image
super-resolution, i.e. retrieving a high-resolution image from a single low-resolution
image. Considering the above points, the currently prevailing deep-learning based
super-resolution algorithms often fall short in several respects: they depend heavily on
heavy-weight architectures to achieve state-of-the-art (SoTA) results and generally do
not take real-world degradations into account. They also fail to maintain the balance
between spatial details and contextual information, which is a basic requirement for
superior performance in the super-resolution task. We further observe that recent
approaches focus more on feature extraction, without paying much attention to the
up-sampling strategies involved.
Moreover, current approaches fail to leverage the abundant information available across
multiple LR images. Our work focuses on analysing and designing different solutions for
the super-resolution task that address the above-mentioned challenges.
The significant contributions of this work are: (1) a novel approach for generating
contextually enriched outputs that preserves the required information without relying on
any prior information, (2) a novel lightweight approach capable of generating contextually
enriched features for image super-resolution and other applications, (3) a novel framework
for efficiently merging multiple burst LR RAW images in a coherent and effective way to
generate HR RGB outputs with realistic textures and additional high-frequency details,
and (4) a novel transformer-based blind approach for resolving real-world degradations.
The proposed super-resolution approaches are evaluated on standard single-image SR
benchmarks such as Set5 [1], Set14 [2], BSD100 [3], Urban100 [4], DIV2K [5], Flickr2K [6],
and the animated-image benchmark Manga109 [8], as well as on burst SR datasets such as
BurstSR [7] and the SyntheticBurst dataset [7]. We also evaluate the proposed modules on
DND [9] and SIDD [10] for single-image denoising, on color [11] and grayscale [12]
datasets for burst denoising, on the LoL [13] and MIT [14] datasets for single-image
low-light enhancement, and on the SONY dataset for burst low-light enhancement. The
qualitative and quantitative results of the proposed methods are examined and compared
with SoTA hand-crafted and learning-based methods. Standard quantitative evaluation
metrics such as the Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR),
and Learned Perceptual Image Patch Similarity (LPIPS) are used to evaluate the proposed
super-resolution approaches.
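For reference, PSNR and SSIM follow their standard definitions, written here in generic
notation with x the ground-truth image and \hat{x} the reconstruction (LPIPS, by contrast,
is a learned distance computed in the feature space of a pretrained deep network and has
no simple closed form):
\[
\mathrm{PSNR}(x,\hat{x}) = 10\,\log_{10}\frac{\mathrm{MAX}^{2}}{\frac{1}{N}\sum_{i=1}^{N}(x_i-\hat{x}_i)^{2}},
\qquad
\mathrm{SSIM}(x,\hat{x}) = \frac{(2\mu_x\mu_{\hat{x}}+c_1)(2\sigma_{x\hat{x}}+c_2)}{(\mu_x^{2}+\mu_{\hat{x}}^{2}+c_1)(\sigma_x^{2}+\sigma_{\hat{x}}^{2}+c_2)},
\]
where MAX is the maximum possible pixel value, N is the number of pixels, \mu, \sigma^{2}
and \sigma_{x\hat{x}} denote local means, variances and covariance, and c_1, c_2 are
small constants that stabilise the division. |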
en_US |