PERFORMANCE EVALUATION OF AUTOMATED MEETING SUMMARIZATION BASED ON OPENAI WHISPER AND INDOT5 FINE-TUNING

I Gusti Lanang Oka Wiyana; Putu Indah Ciptayani; Ida Bagus Adisimakrisna Peling

doi:10.33330/jurteksi.v12i3.4581

I Gusti Lanang Oka Wiyana, Politeknik Negeri Bali
Putu Indah Ciptayani, Politeknik Negeri Bali
Ida Bagus Adisimakrisna Peling, Politeknik Negeri Bali


Keywords: abstractive summarization, ASR, end-to-end pipeline, IndoT5, real-time factor

Abstract

Manual meeting documentation risks losing important information due to cognitive fatigue. Although automated summarization models have matured, integrated end-to-end systems for spoken Indonesian remain scarce. This study designs and evaluates an end-to-end automated meeting summarization architecture that feeds Automatic Speech Recognition (ASR) output from OpenAI Whisper directly into the IndoT5 language model for abstractive summarization. For domain adaptation, IndoT5 was fine-tuned on a dataset of 486 Indonesian spoken-language transcript pairs. Testing was conducted on CPU infrastructure using MP4, MP3, and WAV input formats. The optimal fine-tuning configuration improved accuracy, achieving ROUGE-1, ROUGE-2, and ROUGE-L scores of 0.4167, 0.1973, and 0.2701, respectively. Computationally, the system achieved a Real-Time Factor below 1, meaning that processing took less time than the duration of the recording itself. In conclusion, integrating Whisper and IndoT5 shows potential for producing coherent meeting summaries with lightweight computational overhead, making it viable for deployment on an organization's local infrastructure, where data privacy can be preserved.
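The two kinds of metric reported above can be illustrated with a minimal sketch. It assumes the usual definitions: Real-Time Factor as processing time divided by audio duration, and ROUGE-N as n-gram overlap between a reference and a generated summary, reported here as an F1 score. The abstract does not state which ROUGE variant (precision, recall, or F1) or which tokenization the authors used, so whitespace tokenization and F1 are assumptions, and the function names and sample sentences are hypothetical. ROUGE-L, which is based on the longest common subsequence, is not covered.

```python
from collections import Counter


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1 means faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def ngram_counts(tokens: list[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """ROUGE-N F1 with lowercase whitespace tokenization (an assumption of this sketch)."""
    ref = ngram_counts(reference.lower().split(), n)
    cand = ngram_counts(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical numbers: 30 s of processing for a 60 s recording.
    print(real_time_factor(30.0, 60.0))
    # Hypothetical reference and generated summaries.
    reference = "rapat dimulai pukul sembilan"
    generated = "rapat dimulai pukul sepuluh"
    print(rouge_n_f1(reference, generated, n=1))
    print(rouge_n_f1(reference, generated, n=2))
```

In this sketch a recording that takes half its own duration to process has a Real-Time Factor of 0.5, which is the sense in which a value below 1 indicates faster-than-real-time processing.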


