Enhancing RAG Configuration Efficiency for Large Language Models Working with Clinical Guidelines: The Case of Allergic Rhinitis

K.Y. Mokshin, E.V. Bobrova, M.G. Zhabitsky

Abstract


The article examines approaches to improving the effectiveness of large language models (LLMs) on specialized medical tasks based on clinical guidelines issued by the Ministry of Health of the Russian Federation. Particular attention is paid to comparing variants of the Retrieval-Augmented Generation (RAG) architecture, which is used to reduce factual errors and improve the relevance of model responses. The clinical guidelines on allergic rhinitis were selected as the domain-specific case study, and models of the Russian GigaChat family served as the language models. In the experimental study, sixteen interaction configurations were implemented, differing in LLM capacity, source document format (PDF vs. pre-adapted Markdown), and knowledge-base retrieval strategy (keyword-based, vector-based, and hybrid search). Response quality was evaluated by comparison with reference answers derived from the clinical guidelines, using the BLEU and METEOR metrics. The results show that the best performance is achieved by a more powerful model combined with a RAG approach that uses vector or hybrid retrieval over a knowledge base built from structured, machine-readable text. The findings confirm the feasibility of applying RAG in medical information systems and highlight the importance of preliminary preparation and structuring of regulatory documents to improve the performance of large language models.
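The three retrieval strategies compared in the abstract can be illustrated with a minimal, self-contained sketch: a term-overlap ranker stands in for keyword search, a bag-of-words cosine ranker stands in for dense vector search, and the two rankings are fused with Reciprocal Rank Fusion to form the hybrid strategy. The toy guideline chunks, function names, and the RRF constant `k=60` are illustrative assumptions, not the study's actual pipeline or knowledge base.

```python
import math
from collections import Counter


def keyword_rank(query, docs):
    """Rank document indices by simple term overlap (a stand-in for BM25-style keyword search)."""
    q = set(query.lower().split())
    scores = [(i, len(q & set(d.lower().split()))) for i, d in enumerate(docs)]
    return [i for i, _ in sorted(scores, key=lambda x: -x[1])]


def vector_rank(query, docs):
    """Rank by cosine similarity of bag-of-words vectors (a stand-in for dense embeddings)."""
    def vec(text):
        return Counter(text.lower().split())

    def cos(a, b):
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    qv = vec(query)
    scores = [(i, cos(qv, vec(d))) for i, d in enumerate(docs)]
    return [i for i, _ in sorted(scores, key=lambda x: -x[1])]


def hybrid_rank(query, docs, k=60):
    """Fuse the two rankings with Reciprocal Rank Fusion: score = sum of 1/(k + rank)."""
    fused = {}
    for ranking in (keyword_rank(query, docs), vector_rank(query, docs)):
        for rank, doc_id in enumerate(ranking):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical guideline chunks, for illustration only
chunks = [
    "intranasal corticosteroids are first line therapy for allergic rhinitis",
    "second generation antihistamines reduce sneezing and rhinorrhea",
    "unrelated text on cardiology follow-up intervals",
]
top = hybrid_rank("first line therapy for allergic rhinitis", chunks)[0]
```

In a production RAG system the two rankers would be a full-text index and an embedding index over the guideline chunks; the fusion step, however, works the same way regardless of where the two rankings come from.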
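The evaluation idea behind the BLEU metric mentioned in the abstract can be sketched in a few lines: modified n-gram precisions up to order 4, a brevity penalty, and their geometric mean. This is a simplified sentence-level variant with add-one smoothing, shown only to illustrate the scoring principle; it is not the exact evaluation code used in the study, and METEOR (which additionally aligns stems and synonyms) is not reproduced here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sentence_bleu(reference, candidate, max_n=4):
    """Smoothed sentence-level BLEU between a reference answer and a model answer."""
    ref, cand = reference.split(), candidate.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # clipped overlap: a candidate n-gram counts at most as often as it appears in the reference
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        # add-one smoothing keeps the score defined when an n-gram order has no match
        precisions.append((overlap + 1) / (total + 1))
    # brevity penalty discourages answers much shorter than the reference
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An answer identical to the reference scores 1.0, and a paraphrase scores strictly between 0 and 1, which is exactly the behavior used to compare the sixteen configurations against guideline-derived reference answers.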


Full Text:

PDF (Russian)






ISSN: 2307-8162