MULTI-VIEW FEATURE FUSION FOR INDUSTRIAL ANOMALY DETECTION USING 1D-CNN
Abstract
Anomalous sound detection is essential for industrial predictive maintenance, because machine failures are often preceded by subtle acoustic changes during operation. However, high background noise and the limitations of conventional Convolutional Neural Networks (CNNs) reduce detection reliability. This study proposes a 1D-CNN-based anomaly detection framework that combines multi-view feature fusion with temporal segmentation to improve detection performance. The approach fuses MFCC, Log-Mel Spectrogram, and Chroma STFT features, while temporal segmentation divides each audio signal into 5-second segments to better capture transient anomalies. Experiments on the MIMII dataset under varying Signal-to-Noise Ratio (SNR) conditions show that fusing MFCC and Log-Mel Spectrogram features achieves the best performance, with 97.90% accuracy and a ROC-AUC of 0.9789. The model maintains accuracy above 90% at −6 dB SNR, indicating robustness in noisy industrial environments.
Keywords: industrial anomaly detection; 1D-CNN; multi-view feature fusion; temporal segmentation; MIMII dataset.
Abstract (translated from Indonesian): Anomalous sound detection is an important component of industrial predictive maintenance systems, because machine failures are often preceded by subtle acoustic changes during operation. However, high noise levels and the limitations of conventional Convolutional Neural Network (CNN) architectures can reduce detection reliability. This study proposes a 1D-CNN-based anomaly detection framework that integrates a multi-view feature fusion strategy and temporal segmentation to improve detection performance. The approach combines MFCC, Log-Mel Spectrogram, and Chroma STFT features, while a temporal splitting technique divides audio signals into 5-second segments to capture transient anomalies. Experiments using the MIMII dataset under various Signal-to-Noise Ratio (SNR) conditions show that the combination of MFCC and Log-Mel Spectrogram yields the best performance, with an accuracy of 97.90% and a ROC-AUC of 0.9789. The model also maintains accuracy above 90% under extreme noise conditions (−6 dB), indicating good robustness in noisy industrial environments.
Keywords (translated from Indonesian): industrial anomaly detection; 1D-CNN; multi-view feature fusion; temporal segmentation; MIMII dataset
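The two preprocessing ideas named in the abstract, temporal segmentation into 5-second segments and multi-view feature fusion, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `split_into_segments` and `fuse_views` are hypothetical, fusion by concatenation along the feature axis is an assumption (the abstract does not state how the views are combined), and the 16 kHz sample rate, 10-second clip length, feature dimensions, and frame count are illustrative values only. Random arrays stand in for real MFCC and Log-Mel features, which would normally come from an audio feature library.

```python
import numpy as np


def split_into_segments(signal, sample_rate, segment_seconds=5.0):
    """Split a 1-D signal into non-overlapping fixed-length segments.

    Trailing samples that do not fill a whole segment are dropped.
    Returns an array of shape (n_segments, samples_per_segment).
    """
    seg_len = int(round(sample_rate * segment_seconds))
    n_segments = len(signal) // seg_len
    return signal[: n_segments * seg_len].reshape(n_segments, seg_len)


def fuse_views(*views):
    """Fuse feature views by stacking them along the feature axis.

    Each view is a matrix of shape (n_features, n_frames). All views must
    share the same number of frames. Concatenation is an assumed fusion
    strategy, chosen here only for illustration.
    """
    frame_counts = {view.shape[1] for view in views}
    if len(frame_counts) != 1:
        raise ValueError("all views must share the same number of frames")
    return np.concatenate(views, axis=0)


# Illustrative values (assumptions, not taken from the paper).
sample_rate = 16_000
rng = np.random.default_rng(0)
clip = rng.standard_normal(10 * sample_rate)  # stand-in for a 10-second clip

segments = split_into_segments(clip, sample_rate)  # two 5-second segments

# Placeholder feature views for one segment: (n_features, n_frames).
n_frames = 157
mfcc = rng.standard_normal((13, n_frames))
log_mel = rng.standard_normal((64, n_frames))

fused = fuse_views(mfcc, log_mel)  # shape (13 + 64, n_frames)
```

Under these assumptions, each fused matrix has one row per feature and one column per time frame, which is a layout a 1D convolution over the time axis could consume with the features treated as input channels.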