An unified recurrent video object segmentation framework for various surveillance environments

Patil, P.W.; Gonde, A.B.; Murala, S.; Dudhane, A.; Gupta, S.; Kulkarni, A.

dc.contributor.author	Patil, P.W.
dc.contributor.author	Dudhane, A.
dc.contributor.author	Kulkarni, A.
dc.contributor.author	Murala, S.
dc.contributor.author	Gonde, A.B.
dc.contributor.author	Gupta, S.
dc.date.accessioned	2022-09-03T09:08:59Z
dc.date.available	2022-09-03T09:08:59Z
dc.date.issued	2022-09-03
dc.identifier.uri	http://localhost:8080/xmlui/handle/123456789/3945
dc.description.abstract	Moving object segmentation (MOS) in videos has received considerable attention because of its broad range of security-based applications, such as robotics, outdoor video surveillance, and self-driving cars. Prevailing algorithms depend heavily on additional modules trained for other applications or on complicated training procedures, or they neglect inter-frame spatio-temporal structural dependencies. To address these issues, a simple, robust, and effective unified recurrent edge aggregation approach is proposed for MOS, which requires neither additional trained modules nor fine-tuning on the test video frame(s). Here, a recurrent edge aggregation module (REAM) is proposed to extract effective foreground-relevant features that capture spatio-temporal structural dependencies, using encoder features and the respective decoder features connected recurrently from the previous frame. These REAM features are then connected to a decoder through skip connections for comprehensive learning, referred to as temporal information propagation. Further, a motion refinement block with multi-scale dense residual is proposed to combine the features from the optical flow encoder stream and the last REAM for holistic feature learning. Finally, these holistic features and the REAM features are given to the decoder block for segmentation. To guide the decoder block, the previous frame's output at the respective scales is used. Different training-testing configurations are examined to evaluate the performance of the proposed method. Outdoor videos in particular often suffer from reduced visibility caused by varying environmental conditions and by small airborne particles that scatter light in the atmosphere. Thus, a comprehensive analysis of results is conducted on six benchmark video datasets covering different surveillance environments.
We demonstrate that the proposed method outperforms state-of-the-art MOS methods without any pre-trained module, fine-tuning on the test video frame(s), or complicated training.	en_US
dc.language.iso	en_US	en_US
dc.subject	Adversarial learning	en_US
dc.subject	Recurrent feature sharing	en_US
dc.subject	Spatio-temporal dependencies	en_US
dc.subject	Various surveillance environments	en_US
dc.title	An unified recurrent video object segmentation framework for various surveillance environments	en_US
dc.type	Article	en_US