Please use this identifier to cite or link to this item: http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/4874
Title: Deep learning based approaches for video motion magnification
Authors: Singh, J.
Keywords: Video Motion Magnification
Knowledge Distillation (KD)
Latency Aware
Motion Manipulation
Neural Architecture Search (NAS)
Contrastive Learning
Issue Date: 21-May-2024
Abstract: Unveiling imperceptible motions in videos is a critical task with applications ranging from industrial monitoring to healthcare diagnostics. However, State-Of-The-Art (SOTA) methods in video motion magnification face challenges that impact their effectiveness. The primary identified problem lies in the trade-offs of existing techniques. Hand-crafted bandpass filter-based approaches suffer from the need for prior information, ringing artifacts, and limited magnification. Conversely, deep learning-based methods, while achieving higher magnification, introduce issues like artificially induced motion, distortions, and computational complexity, making them unsuitable for real-time applications. To overcome these challenges, we propose a comprehensive solution. Our first approach involves a novel deep learning-based lightweight model for motion magnification. Leveraging feature sharing and an appearance encoder, our method enhances motion magnification while minimizing artifacts. Addressing the broader computational complexity challenge associated with SOTA methods, we introduce a Knowledge Distillation-Based Latency-Aware Differentiable Architecture Search method (KL-DNAS). Instead of designing the architecture by hand, we let the network decide the best possible architecture under the given constraints. We use a teacher network to search the network by parts using knowledge distillation. Further, a search over different receptive fields and multi-feature connections is applied to individual layers. Also, a novel latency loss is proposed to jointly optimize the target latency constraint and output quality. In the realm of magnifying small motions prone to noise and disturbances, we identify a need for a balanced solution that can exploit both deep learning and hand-crafted approaches.
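The thesis's exact latency-loss formulation is not reproduced here; the following is a minimal illustrative sketch of the general idea of jointly optimizing a latency constraint and output quality. The function name, the hinge-style overshoot penalty, and the `weight` parameter are assumptions for illustration only.

```python
# Hypothetical sketch of a latency-aware joint objective (not the thesis's
# exact KL-DNAS loss): penalize an architecture only when its predicted
# latency exceeds the target budget, added on top of the quality loss.
def latency_aware_loss(quality_loss, predicted_latency_ms,
                       target_latency_ms, weight=0.1):
    """Combine a quality (e.g. reconstruction) loss with a latency penalty.

    A hinge on the overshoot means architectures within budget are ranked
    purely by quality, while over-budget ones pay a growing penalty.
    """
    overshoot = max(0.0, predicted_latency_ms - target_latency_ms)
    return quality_loss + weight * overshoot

# An architecture within the 40 ms budget keeps its quality loss unchanged;
# one that is 20 ms over budget incurs an extra 0.1 * 20 = 2.0 penalty.
loss_fast = latency_aware_loss(0.5, 30.0, 40.0)
loss_slow = latency_aware_loss(0.5, 60.0, 40.0)
```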
Introducing a phase-based deep network operating in both frequency and spatial domains, we generate motion magnification from frequency-domain phase fluctuations and refine it spatially. With lightweight models, a balance between magnification and computational efficiency is achieved, as evidenced by comparative evaluations against SOTA methods. However, this integration does not fully utilize the steerable pyramid architecture of hand-crafted methods, as it manipulates motion features at a single scale. To further enhance it, we integrate traditional techniques with deep learning. The proposed θNet model effectively combines the hand-crafted intuition of the complex steerable pyramid with deep learning mechanisms, significantly improving motion magnification performance. Additionally, in response to the sensitivity of video motion magnification to noise-related distortions, we propose a hierarchical magnification network. It produces a more robust performance with a multi-scale manipulator and a novel contrastive learning-based loss. This approach effectively mitigates distortions caused by noise and illumination changes while enhancing texture quality. It maintains a lightweight design while preserving its effectiveness.
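The underlying frequency-domain phase manipulation that such phase-based methods build on can be sketched in 1-D. This is an illustration of the classical phase-magnification principle (Fourier shift theorem), not the thesis's network; the function name and the pure-sinusoid test signal are assumptions for illustration.

```python
import numpy as np

# 1-D sketch of phase-based motion magnification: amplify the per-frequency
# phase change between two frames so a small shift appears larger.
def magnify_shift(frame_a, frame_b, alpha):
    Fa, Fb = np.fft.fft(frame_a), np.fft.fft(frame_b)
    dphi = np.angle(Fb) - np.angle(Fa)          # phase change per frequency bin
    # Reconstruct frame_b with its phase change scaled by (1 + alpha).
    magnified = np.abs(Fb) * np.exp(1j * (np.angle(Fa) + (1.0 + alpha) * dphi))
    return np.real(np.fft.ifft(magnified))

# A sinusoid shifted by 1 sample; by the Fourier shift theorem a shift adds a
# linear phase ramp, so magnifying with alpha=2 yields a ~3-sample shift.
n, k = 64, 3
x = np.arange(n)
frame_a = np.cos(2 * np.pi * k * x / n)
frame_b = np.cos(2 * np.pi * k * (x - 1) / n)
out = magnify_shift(frame_a, frame_b, alpha=2.0)
```

For a pure sinusoid this is exact; real videos need localized filters (e.g. a complex steerable pyramid) because phase is only meaningful per scale and orientation, which is precisely the multi-scale structure the abstract argues for.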
URI: http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/4874
Appears in Collections:Year- 2024

Files in This Item:
File: Full_text.pdf.pdf
Size: 113.83 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.