dc.description.abstract |
Digital images, an extension of human memory, are among the most important information
carriers for human activities. They play a critical role in many day-to-day applications,
from online social networking to commercial advertising to medical imaging. Owing to
constraints on the physical characteristics of the digital sensor, e.g. its size and pixel
density, the resolution of the captured image is limited. In many cases, this limited
resolution becomes a barrier to fast and accurate analysis. Thus, it is highly desirable
to overcome the resolution limitation and acquire high-resolution (HR) digital images.
One of the most promising approaches is to use signal processing techniques to obtain an
HR image from a low-resolution (LR) image; this resolution enhancement approach is called
super-resolution (SR). The major advantage of this software approach is that it costs
much less than upgrading the hardware of existing camera systems. Over the past decades,
many researchers have developed various algorithms to improve the quality of reconstructed
images. More critically, low-resolution images have fewer pixels representing an object
of interest, which makes it difficult to recover fine details. SR aims to solve this
problem: a given LR image is upscaled to retrieve an image with higher resolution and
more discernible details, which can then be employed in downstream tasks such as face
recognition and object classification. The common goal of these techniques is to provide
finer details than the given LR image by increasing the number of pixels per unit of
space. Additionally, in comparison to DSLR cameras, portable devices generally produce
low-quality images on account of their physical limitations. These low-quality images
usually suffer from multiple degradations: low resolution owing to small camera sensors,
mosaic patterns owing to the camera filter array, and sub-pixel shifts owing to camera
motion. These degradations generally hamper the performance of single-image
super-resolution, i.e. retrieving a high-resolution image from a single low-resolution
image. Considering the above points, the currently prevailing deep-learning based
super-resolution algorithms often fall short in several respects: they depend heavily on
heavy-weight architectures to achieve state-of-the-art (SoTA) results and generally do
not take real-world degradations into account. They also fail to maintain the balance
between spatial details and contextual information, which is a basic requirement for
superior performance in the super-resolution task. We further observe that recent
approaches focus more on feature extraction, without paying much attention to the
up-sampling strategies involved.
Moreover, current approaches fail to leverage the abundant information available across
multiple LR images. Our work focuses on analysing and designing different solutions for
the super-resolution task that address the above-mentioned challenges.
The significant contributions of this work are: (1) a novel approach for generating
contextually enriched outputs that preserves the required information without relying on
any prior information, (2) a novel lightweight approach capable of generating contextually
enriched features for image super-resolution and other applications, (3) a novel framework
for efficiently merging multiple burst LR RAW images in a coherent and effective way to
generate HR RGB outputs with realistic textures and additional high-frequency details,
and (4) a novel transformer-based blind approach for resolving real-world degradations.
The proposed super-resolution approaches are evaluated on standard single-image SR
benchmarks such as Set5 [1], Set14 [2], BSD100 [3], Urban100 [4], DIV2K [5], Flickr2K [6],
and the animated-image benchmark Manga109 [8], as well as on burst SR datasets such as
BurstSR [7] and the SyntheticBurst dataset [7]. We also evaluate the proposed modules on
DND [9] and SIDD [10] for single-image denoising, on color [11] and grayscale [12]
datasets for burst denoising, on the LoL [13] and MIT [14] datasets for single-image
low-light enhancement, and on the SONY dataset for burst low-light enhancement. The
qualitative and quantitative results of the proposed methods are examined and compared
with SoTA hand-crafted and learning-based methods. Standard quantitative evaluation
metrics such as the Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR),
and Learned Perceptual Image Patch Similarity (LPIPS) are used to evaluate the proposed
super-resolution approaches.
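For reference, PSNR and SSIM follow their standard definitions, written here in generic
notation with x the ground-truth image and \hat{x} the reconstruction (LPIPS, by contrast,
is a learned distance computed in the feature space of a pretrained deep network and has
no simple closed form):
\[
\mathrm{PSNR}(x,\hat{x}) = 10\,\log_{10}\frac{\mathrm{MAX}^{2}}{\frac{1}{N}\sum_{i=1}^{N}(x_i-\hat{x}_i)^{2}},
\qquad
\mathrm{SSIM}(x,\hat{x}) = \frac{(2\mu_x\mu_{\hat{x}}+c_1)(2\sigma_{x\hat{x}}+c_2)}{(\mu_x^{2}+\mu_{\hat{x}}^{2}+c_1)(\sigma_x^{2}+\sigma_{\hat{x}}^{2}+c_2)},
\]
where MAX is the maximum possible pixel value, N is the number of pixels, \mu, \sigma^{2}
and \sigma_{x\hat{x}} denote local means, variances and covariance, and c_1, c_2 are
small constants that stabilise the division. |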
en_US |