Security Detection in Audio Events: A Comparison of Classification Methods

Alissar Nasser

Abstract

The security of public places is becoming increasingly important as rates of violence and subversion rise. Recently, several approaches have been proposed to automatically detect abnormal events in public places, such as car crashes, violence, or other hazardous events, in an attempt to improve security and save lives. Most of this research uses supervised classification techniques to classify the audio signals. This paper proposes using kernel principal component analysis (KPCA) to reduce the number of mel-frequency cepstral coefficient (MFCC) features extracted from the audio signal and then applying an unsupervised classification algorithm. The paper also presents the results of several supervised and unsupervised classification methods for audio event detection and compares them with the results of the proposed approach. Experiments are carried out on a real data set recorded in public transportation. The results show that K-means on 2 KPCA components gave good results both for triggering true alarms and for detecting false alarms: the percentages of false and missed alarms were 4.5% and 7.8%, respectively, compared with 0.8% and 9.3% for kernel k-means. Nevertheless, the DNN gave the best results, with a false alarm rate of 0% and a missed alarm rate of 1.4%.
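
The proposed pipeline (KPCA down to 2 components, followed by K-means) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the RBF kernel, the value of `gamma`, the 13-dimensional synthetic vectors standing in for MFCC frames, and all function names are assumptions made for illustration, and MFCC extraction itself is omitted.

```python
import math
import random

def rbf_kernel(X, gamma):
    """Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2) (kernel choice is an assumption)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj)))
             for xj in X] for xi in X]

def center(K):
    """Center the (symmetric) Gram matrix in feature space."""
    n = len(K)
    row = [sum(r) / n for r in K]
    total = sum(row) / n
    return [[K[i][j] - row[i] - row[j] + total for j in range(n)] for i in range(n)]

def kpca(X, n_components, gamma, iters=300, seed=0):
    """Project the training points onto the leading kernel principal components."""
    A = center(rbf_kernel(X, gamma))
    n = len(A)
    rng = random.Random(seed)
    columns = []
    for _component in range(n_components):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _step in range(iters):  # power iteration for the top eigenvector
            w = [sum(a * b for a, b in zip(row, v)) for row in A]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue of the unit vector v.
        lam = sum(vi * sum(a * b for a, b in zip(row, v)) for vi, row in zip(v, A))
        # Projection of training point i on this component is sqrt(lam) * v[i].
        scale = math.sqrt(max(lam, 0.0))
        columns.append([scale * vi for vi in v])
        # Deflate so the next pass finds the next eigenvector.
        A = [[A[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return [list(p) for p in zip(*columns)]

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = None
    for _step in range(iters):
        new_labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Synthetic stand-in for MFCC vectors (assumption): 20 "normal" and 20 "abnormal" frames.
rng = random.Random(0)
normal = [[rng.gauss(0.0, 0.3) for _ in range(13)] for _ in range(20)]
abnormal = [[rng.gauss(3.0, 0.3) for _ in range(13)] for _ in range(20)]
X = normal + abnormal

Z = kpca(X, n_components=2, gamma=0.01)  # reduce to 2 KPCA components
labels = kmeans(Z, k=2)                  # unsupervised split into two clusters
print(labels)
```

On real recordings the cluster indices carry no meaning by themselves; mapping a cluster to "alarm" or "no alarm" would still require some labelled reference or a validity index, which this sketch leaves out.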

Keywords:
Audio event detection, MFCC, classification, unsupervised, supervised, kernel PCA, K-means, DNN, kernel Davies and Bouldin index.

How to Cite
Nasser, A. (2020). Security Detection in Audio Events: A Comparison of Classification Methods. Journal of Advances in Mathematics and Computer Science, 35(2), 25-41. https://doi.org/10.9734/jamcs/2020/v35i230247
Section
Original Research Article
