OPTIMIZING RETRIEVAL-AUGMENTED GENERATION FOR DOMAIN-SPECIFIC KNOWLEDGE SYSTEMS THROUGH FINE-TUNING AND PROMPT ENGINEERING

  • Ahmad Fajri President University
  • Rila Mandala Institut Teknologi Bandung
Keywords: embedding fine-tuning, large language model, prompt engineering, prompt injection mitigation;, retrieval-augmented generation

Abstract

Abstract: This study discusses the optimization of RAG for a FAQ system in the field of information technology product security certification at BSSN. Although LLM generate reliable responses, they often lack up-to-date and domain-specific knowledge, which can be addressed through the RAG approach. This research aims to optimize a domain-specific RAG system by improving embedding performance, enhancing prompt robustness, and increasing retrieval accuracy. The research methods consist of three stages. The first stage involves fine-tuning the bge-m3 embedding model and evaluating its performance using MRR, Recall, and AUC. The second stage applies prompt engineering techniques, namely the SRSM and Autodefense, to mitigate direct-injection and escape-character prompt injection attacks. The third stage evaluates the proposed RAG system using Precision, Recall, and F1-Score metrics against four baseline models. The results of research show that the fine-tuned embedding model achieves higher performance than the original model, with MRR@1 and Recall@1 values of 0.80 and an AUC@100 of 0.7023. In addition, the proposed prompt engineering techniques demonstrate robustness against prompt injection attacks, while the overall RAG system attains a perfect Precision, Recall, and F1-Score of 1.00. In conclusion, the proposed approach effectively enhances retrieval accuracy, embedding quality, and system security, resulting in a more reliable RAG-based FAQ system for information technology product security certification.

Keywords: embedding fine-tuning; large language model; prompt engineering; prompt injection mitigation; retrieval-augmented generation

 

Abstrak: Studi ini membahas optimasi RAG untuk sistem FAQ di bidang sertifikasi keamanan produk teknologi informasi di BSSN. Meskipun LLM menghasilkan respons yang andal, mereka seringkali kurang memiliki pengetahuan terkini dan spesifik domain, yang dapat diatasi melalui pendekatan RAG. Penelitian ini bertujuan untuk mengoptimalkan sistem RAG spesifik domain dengan meningkatkan kinerja embedding, meningkatkan ketahanan prompt dan meningkatkan akurasi pengambilan. Metode penelitian terdiri dari tiga tahap. Tahap pertama melibatkan fine-tuning model embedding bge-m3 dan mengevaluasi kinerjanya menggunakan Mean Reciprocal Rank (MRR), Recall, dan AUC. Tahap kedua menerapkan teknik rekayasa prompt, yaitu Self- SRSM dan Autodefense, untuk mengurangi serangan direct-injection dan escape-character prompt injection. Tahap ketiga mengevaluasi sistem RAG yang diusulkan menggunakan metrik Presisi, Recall, dan F1-Score terhadap empat model dasar. Hasil penelitian menunjukkan bahwa model embedding yang disempurnakan mencapai kinerja yang lebih tinggi daripada model asli, dengan nilai MRR@1 dan Recall@1 sebesar 0,80 dan AUC@100 sebesar 0,7023. Selain itu, teknik rekayasa prompt yang diusulkan menunjukkan ketahanan terhadap serangan injeksi prompt, sementara sistem RAG secara keseluruhan mencapai Presisi, Recall, dan F1-Score sempurna sebesar 1,00. Kesimpulannya, pendekatan yang diusulkan secara efektif meningkatkan akurasi pengambilan, kualitas embedding dan keamanan sistem, menghasilkan sistem FAQ berbasis RAG yang lebih andal untuk sertifikasi keamanan produk teknologi informasi.

Kata kunci: penyempurnaan embedding; model bahasa besar; rekayasa prompt; mitigasi injeksi prompt; retrieval-augmented generation

References

M. A. K. Raiaan et al., “A Review on Large Language Models: Archi-tectures, Applications, Taxonomies, Open Issues and Challenges,” IEEE Access, vol. 12, pp. 26839–26874, 2024.

Y. Shen et al., “ChatGPT and Other Large Language Models Are Dou-ble-edged Swords,” Apr. 01, 2023, Radiological Society of North America Inc.

W. Zhu et al., “Multilingual Ma-chine Translation with Large Lan-guage Models: Empirical Results and Analysis.”

T. Zhang, F. Ladhak, E. Durmus, P. Liang, K. Mckeown, and T. B. Hashimoto, “Benchmarking Large Language Models for News Sum-marization”.

Y. Wang, R. Ren, J. Li, W. X. Zhao, J. Liu, and J.-R. Wen, “REAR: A Relevance-Aware Re-trieval-Augmented Framework for Open-Domain Question Answer-ing.”

I. K. Raharjana, D. Siahaan, and C. Fatichah, “User Stories and Natural Language Processing: A Systematic Literature Review,” IEEE Access, vol. 9, pp. 53811–53826, 2021.

X. Ji, L. Xu, L. Gu, J. Ma, Z. Zhang, and W. Jiang, “RAP-RAG: A Retrieval-Augmented Generation Framework with Adaptive Retrieval Task Planning,” Electronics (Swit-zerland), vol. 14, no. 21, Nov. 2025.

K. Muludi, K. Milani Fitria, and J. Triloka, “Retrieval-Augmented Generation Approach: Document Question Answering using Large Language Model,” 2024.

K. Mao, Z. Liu, H. Qian, F. Mo, C. Deng, and Z. Dou, “RAG-Studio: Towards In-Domain Adaptation of Retrieval Augmented Generation Through Self-Alignment.”

X. Kehan, Z. Kun, L. Jingyuan, and W. Yuanzhuo, “CRP-RAG: A Re-trieval-Augmented Generation Framework for Supporting Com-plex Logical Reasoning and Knowledge Planning,” Electronics (Switzerland), vol. 14, no. 1, Jan. 2025.

P. Lewis et al., “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.”

J. Tenghao, “FAQ question Answering method based on semantic similarity matching,” in Proceedings - 2022 6th International Symposium on Computer Science and Intelligent Control, ISCSIC 2022, Institute of Electrical and Electronics Engineers Inc., 2022, pp. 93–100.

Y. H. Chang, Y. T. Guo, L. C. Fu, M. J. Chiu, H. M. Chiu, and H. J. Lin, “Interactive Healthcare Robot Using Attention-Based Question-Answer Retrieval and Medical Enti-ty Extraction Models,” IEEE J Bio-med Health Inform, vol. 27, no. 12, pp. 6039–6050, Dec. 2023.

P. Chauhan, R. K. Sahani, S. Datta, A. Qadir, M. Raj, and M. M. Ali, “Evaluating Top-k RAG-based ap-proach for Game Review Genera-tion,” in Proceedings - In-ternational Conference on Computing, Power, and Communication Technologies, IC2PCT 2024, Institute of Electrical and Electronics Engineers Inc., 2024, pp. 258–263.

H. Chia, A. I. Oliveira, and P. Azevedo, “Implementation of an intelligent virtual assistant based on LLM models for irrigation optimi-zation,” in Proceedings - 8th Inter-national Young Engineers Forum on Electrical and Computer Engineering, YEF-ECE 2024, Institute of Electrical and Elec-tronics Engineers Inc., 2024, pp. 94–100.

A. A. Khan, M. T. Hasan, K. K. Kemell, J. Rasku, and P. Abra-hamsson, “Developing Retrieval Augmented Generation (RAG) based LLM Systems from PDFs: An Experience Report,” Oct. 2024.

Published
2025-12-31
How to Cite
Ahmad Fajri, & Rila Mandala. (2025). OPTIMIZING RETRIEVAL-AUGMENTED GENERATION FOR DOMAIN-SPECIFIC KNOWLEDGE SYSTEMS THROUGH FINE-TUNING AND PROMPT ENGINEERING. JURTEKSI (jurnal Teknologi Dan Sistem Informasi), 12(1), 137 - 144. https://doi.org/10.33330/jurteksi.v12i1.4338