Abstract:
All-weather intelligent surveillance is
a prime challenge for computer vision researchers.
Surveillance is mostly performed to analyze human activity in a
particular region. Extreme weather conditions such as rain,
snow, haze, and fog disrupt the surveillance process and thus
decrease the reliability of the surveillance system. Here, an
attempt is made to tackle one of these weather conditions, namely
haze, in the context of surveillance. Haze degrades the quality of
images and videos captured by a camera. Because of this poor
quality, it is difficult to analyze haze-degraded video for human
activities using the existing state-of-the-art methods for human
action recognition (HAR). Therefore, in this paper, a new
two-level saliency-based
end-to-end network (TSNet) for HAR in hazy videos is proposed.
The de-hazing approaches given in [1]–[9] have certain limitations;
therefore, the de-hazing network given in [10] is fine-tuned for
HAR. The concept of rank pooling given in [11] is further utilized
to efficiently represent the temporal saliency of the video. The
transmission map information is utilized here to fix the spatial
saliency in each frame. As no dataset is currently available
for HAR in hazy video, a new dataset of hazy videos is
generated from two benchmark datasets, namely HMDB51 [12]
and UCF101 [13], by adding synthetic haze. The existing methods
for HAR proposed in [11], [14], [15] are applied and compared
with the proposed method on the resulting hazy-HMDB51 and
hazy-UCF101 datasets. The proposed method clearly outperforms the
above-mentioned methods in terms of average recognition rate (ARR).
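
The abstract does not state how the synthetic haze is added. As an illustration only, the sketch below uses the widely used atmospheric scattering model, I = J * t + A * (1 - t) with transmission t = exp(-beta * d), which is an assumption here and not necessarily the procedure used for hazy-HMDB51 and hazy-UCF101. The function name `add_synthetic_haze`, the depth map, and the values of `beta` and `airlight` are hypothetical. The returned `t` plays the role of a per-frame transmission map.

```python
import numpy as np

def add_synthetic_haze(frame, depth, beta=1.0, airlight=0.8):
    """Apply the atmospheric scattering model to one frame (illustrative).

    frame:    H x W x 3 float array with values in [0, 1] (haze-free frame J)
    depth:    H x W non-negative float array (assumed scene depth d)
    beta:     assumed scattering coefficient (larger means denser haze)
    airlight: assumed global atmospheric light A, a scalar in [0, 1]
    """
    t = np.exp(-beta * depth)                  # transmission map, in (0, 1]
    t3 = t[..., None]                          # broadcast over colour channels
    hazy = frame * t3 + airlight * (1.0 - t3)  # I = J * t + A * (1 - t)
    return np.clip(hazy, 0.0, 1.0), t

# Toy example: random frame, depth increasing from left (0) to right (3).
rng = np.random.default_rng(0)
frame = rng.random((4, 6, 3))
depth = np.tile(np.linspace(0.0, 3.0, 6), (4, 1))
hazy, t = add_synthetic_haze(frame, depth)
```

In this toy setup, pixels at zero depth are left unchanged, while distant pixels are pulled toward the airlight value, which mimics the loss of contrast that haze introduces.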