The security of public places is becoming increasingly important as rates of violence and subversion rise. Recently, several approaches have been proposed to automatically detect abnormal events in public places, such as car crashes, violence, or other hazardous events, in an attempt to improve security and save lives. Most of these studies use supervised classification techniques to classify the audio signals. This paper proposes using kernel principal component analysis (KPCA) to reduce the number of Mel-frequency cepstral coefficient (MFCC) features extracted from the audio signal and then applying an unsupervised classification algorithm. The paper also presents the results of several supervised and unsupervised classification methods for audio event detection and compares them with the results of the proposed approach. Experiments are conducted on a real data set recorded in a means of public transportation. The results show that K-means on 2 KPCA components performed well for triggering true alarms as well as for detecting false alarms: the false and missed alarm rates were 4.5% and 7.8%, respectively, compared with 0.8% and 9.3% for kernel k-means. Nevertheless, the deep neural network (DNN) gave the best results, with a false alarm rate of 0% and a missed alarm rate of 1.4%.
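The pipeline described above (feature reduction with KPCA, then unsupervised clustering, then alarm-rate evaluation) can be illustrated with a minimal sketch. It is not the authors' implementation. Several things here are assumptions: the synthetic 13-dimensional vectors stand in for real MFCC features, the RBF kernel and its `gamma` value are arbitrary choices, the smaller cluster is treated as the "abnormal" one, and the false and missed alarm rates are taken to be the fraction of normal segments flagged and the fraction of abnormal segments not flagged.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma):
    # Pairwise squared Euclidean distances, then Gaussian (RBF) kernel.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-gamma * d2)

def kernel_pca(X, n_components=2, gamma=0.1):
    # Center the kernel matrix in feature space, then project the
    # training points onto the leading eigenvectors.
    K = rbf_kernel_matrix(X, gamma)
    n = K.shape[0]
    ones = np.full((n, n), 1.0 / n)
    Kc = K - ones @ K - K @ ones + ones @ K @ ones
    eigvals, eigvecs = np.linalg.eigh(Kc)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

def kmeans(Z, k=2, n_init=10, n_iter=100, seed=0):
    # Plain Lloyd iterations with random restarts; keep the lowest inertia.
    rng = np.random.default_rng(seed)
    best_labels, best_inertia = None, np.inf
    for _ in range(n_init):
        centers = Z[rng.choice(len(Z), size=k, replace=False)]
        for _ in range(n_iter):
            d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            new_centers = np.array([
                Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        inertia = d2.min(axis=1).sum()
        if inertia < best_inertia:
            best_labels, best_inertia = labels, inertia
    return best_labels

def alarm_rates(is_abnormal, flagged):
    # Assumed definitions: false alarm = normal segment flagged,
    # missed alarm = abnormal segment not flagged.
    is_abnormal = np.asarray(is_abnormal, dtype=bool)
    flagged = np.asarray(flagged, dtype=bool)
    false_alarm = np.mean(flagged[~is_abnormal])
    missed_alarm = np.mean(~flagged[is_abnormal])
    return float(false_alarm), float(missed_alarm)

# Synthetic stand-in for MFCC vectors (hypothetical data, not the paper's).
rng = np.random.default_rng(42)
normal = rng.normal(0.0, 1.0, size=(200, 13))
abnormal = rng.normal(3.0, 1.0, size=(40, 13))
X = np.vstack([normal, abnormal])
is_abnormal = np.r_[np.zeros(200, bool), np.ones(40, bool)]

Z = kernel_pca(X, n_components=2, gamma=1.0 / X.shape[1])
labels = kmeans(Z, k=2)
# Assumption: abnormal events are rare, so the smaller cluster raises the alarm.
minority = np.argmin(np.bincount(labels, minlength=2))
flagged = labels == minority
false_alarm, missed_alarm = alarm_rates(is_abnormal, flagged)
print(f"false alarm rate: {false_alarm:.3f}, missed alarm rate: {missed_alarm:.3f}")
```

On real recordings the kernel, its width, and the rule that maps clusters to "normal" and "abnormal" would all need to be chosen and validated against labeled data; the values used here are placeholders.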