Abstract:
Human activity recognition has a significant impact on people’s daily lives. The need to infer human activities is prominent in many human-centric applications, such as healthcare and individual assistance. In this paper, we introduce a non-invasive human activity recognition system that uses footstep-induced vibration and sound in an outdoor environment, with the aim of outperforming systems that rely on a single source of information. We employ one-dimensional convolutional neural networks for automated feature extraction, fusion, and activity recognition on a nine-class classification problem. The proposed framework achieves an average F1 score of 92%, which corresponds to a 5.74% improvement over the best-performing state-of-the-art method. Confusion matrix-based analysis demonstrates that audio-seismic fusion reduces both misclassifications and the impact of background noise on model performance. In addition, we demonstrate that a model trained on a balanced dataset achieves a higher F1 score than one trained on an imbalanced dataset. Activity-wise performance is reported to show the efficacy of the proposed fusion-based framework. We also contribute an audio-seismic dataset for human activity recognition in an outdoor environment. The dataset was collected under a variety of challenging conditions, such as varying grass length, varying soil moisture content, and the passing of unwanted vehicles.