Please use this identifier to cite or link to this item: http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/4874
Title: Deep learning based approaches for video motion magnification
Authors: Singh, J.
Keywords: Video Motion Magnification
Knowledge Distillation (KD)
Latency Aware
Motion Manipulation
Neural Architecture Search (NAS)
Contrastive Learning
Issue Date: 21-May-2024
Abstract: Unveiling imperceptible motions in videos is a critical task with applications ranging from industrial monitoring to healthcare diagnostics. However, State-Of-The-Art (SOTA) methods in video motion magnification face challenges that impact their effectiveness. The primary identified problem lies in the trade-offs of existing techniques. Hand-crafted bandpass filter-based approaches suffer from the need for prior information, ringing artifacts, and limited magnification. Conversely, deep learning-based methods, while achieving higher magnification, introduce issues like artificially induced motion, distortions, and computational complexity, making them unsuitable for real-time applications. To overcome these challenges, we propose a comprehensive solution. Our first approach involves a novel deep learning-based lightweight model for motion magnification. Leveraging feature sharing and an appearance encoder, our method enhances motion magnification while minimizing artifacts. Addressing the broader computational complexity challenge associated with SOTA methods, we introduce a Knowledge Distillation-Based Latency-Aware Differentiable Architecture Search method (KL-DNAS). Instead of designing the architecture by hand, we let the network decide the best possible architecture under the given constraints. We use a teacher network to search the network by parts using knowledge distillation. Further, a search over different receptive fields and multi-feature connections is applied to individual layers. Also, a novel latency loss is proposed to jointly optimize the target latency constraint and output quality. In the realm of magnifying small motions prone to noise and disturbances, we identify a need for a balanced solution that can exploit both deep learning and hand-crafted approaches.
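The thesis's exact latency-loss formulation is not reproduced here; the following is a minimal illustrative sketch of the general idea of jointly optimizing a latency constraint and output quality. The function name, the hinge-style overshoot penalty, and the `weight` parameter are assumptions for illustration only.

```python
# Hypothetical sketch of a latency-aware joint objective (not the thesis's
# exact KL-DNAS loss): penalize an architecture only when its predicted
# latency exceeds the target budget, added on top of the quality loss.
def latency_aware_loss(quality_loss, predicted_latency_ms,
                       target_latency_ms, weight=0.1):
    """Combine a quality (e.g. reconstruction) loss with a latency penalty.

    A hinge on the overshoot means architectures within budget are ranked
    purely by quality, while over-budget ones pay a growing penalty.
    """
    overshoot = max(0.0, predicted_latency_ms - target_latency_ms)
    return quality_loss + weight * overshoot

# An architecture within the 40 ms budget keeps its quality loss unchanged;
# one that is 20 ms over budget incurs an extra 0.1 * 20 = 2.0 penalty.
loss_fast = latency_aware_loss(0.5, 30.0, 40.0)
loss_slow = latency_aware_loss(0.5, 60.0, 40.0)
```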
Introducing a phase-based deep network operating in both frequency and spatial domains, we generate motion magnification from frequency-domain phase fluctuations and refine it spatially. With lightweight models, a balance between magnification and computational efficiency is achieved, as evidenced by comparative evaluations against SOTA methods. However, this integration does not fully utilize the steerable pyramid architecture of hand-crafted methods, as it manipulates motion features at a single scale. To further enhance it, we integrate traditional techniques with deep learning. The proposed θNet model effectively combines the hand-crafted intuition of the complex steerable pyramid with deep learning mechanisms, significantly improving motion magnification performance. Additionally, in response to the sensitivity of video motion magnification to noise-related distortions, we propose a hierarchical magnification network. It produces a more robust performance with a multi-scale manipulator and a novel contrastive learning-based loss. This approach effectively mitigates distortions caused by noise and illumination changes while enhancing texture quality. It maintains a lightweight design while preserving its effectiveness.
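The underlying frequency-domain phase manipulation that such phase-based methods build on can be sketched in 1-D. This is an illustration of the classical phase-magnification principle (Fourier shift theorem), not the thesis's network; the function name and the pure-sinusoid test signal are assumptions for illustration.

```python
import numpy as np

# 1-D sketch of phase-based motion magnification: amplify the per-frequency
# phase change between two frames so a small shift appears larger.
def magnify_shift(frame_a, frame_b, alpha):
    Fa, Fb = np.fft.fft(frame_a), np.fft.fft(frame_b)
    dphi = np.angle(Fb) - np.angle(Fa)          # phase change per frequency bin
    # Reconstruct frame_b with its phase change scaled by (1 + alpha).
    magnified = np.abs(Fb) * np.exp(1j * (np.angle(Fa) + (1.0 + alpha) * dphi))
    return np.real(np.fft.ifft(magnified))

# A sinusoid shifted by 1 sample; by the Fourier shift theorem a shift adds a
# linear phase ramp, so magnifying with alpha=2 yields a ~3-sample shift.
n, k = 64, 3
x = np.arange(n)
frame_a = np.cos(2 * np.pi * k * x / n)
frame_b = np.cos(2 * np.pi * k * (x - 1) / n)
out = magnify_shift(frame_a, frame_b, alpha=2.0)
```

For a pure sinusoid this is exact; real videos need localized filters (e.g. a complex steerable pyramid) because phase is only meaningful per scale and orientation, which is precisely the multi-scale structure the abstract argues for.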
URI: http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/4874
Appears in Collections:Year- 2024

Files in This Item:
File: Full_text.pdf.pdf
Size: 113.83 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.