Development and Preliminary Evaluation of a Domain-Specific Large Language Model for Tuberculosis Care in South Africa
Summary
Researchers fine-tuned BioMistral-7B with QLoRA and added GraphRAG retrieval to create a TB-care LLM for South Africa, showing improved contextual alignment over the base model.
# Development and Preliminary Evaluation of a Domain-Specific Large Language Model for Tuberculosis Care in South Africa

Source: [https://arxiv.org/abs/2604.19776](https://arxiv.org/abs/2604.19776) · [View PDF](https://arxiv.org/pdf/2604.19776)

> Abstract: Tuberculosis (TB) is one of the world's deadliest infectious diseases, and in South Africa it contributes a significant burden to the country's health care system. This paper presents an experimental study on the development of a domain-specific Large Language Model (DS-LLM) for TB care that can help to alleviate the burden on patients and healthcare providers. To achieve this, a literature review was conducted to understand current LLM development strategies, specifically in the medical domain. Thereafter, data were collected from South African TB guidelines, selected TB literature, and existing benchmark medical datasets. We performed LLM fine-tuning using the Quantised Low-Rank Adaptation (QLoRA) algorithm on a medical LLM (BioMistral-7B), and also implemented Retrieval-Augmented Generation using GraphRAG. The developed DS-LLM was evaluated against the base BioMistral-7B model and a general-purpose LLM using a mix of automated metrics and quantitative ratings. The results show that the DS-LLM performed better than the base model in terms of contextual alignment (lexical, semantic, and knowledge) for TB care in South Africa.

## Submission history

From: Prof. Olawande Daramola

**[v1]** Sat, 28 Mar 2026 11:22:05 UTC (651 KB)
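The pipeline described in the abstract combines 4-bit QLoRA fine-tuning of BioMistral-7B with GraphRAG-based retrieval. Below is a minimal sketch of the QLoRA setup, assuming the Hugging Face `transformers`, `peft`, and `bitsandbytes` libraries; the model ID is the public BioMistral checkpoint, and the LoRA rank, alpha, dropout, and target modules are illustrative choices, not values reported in the paper.

```python
# Minimal QLoRA setup sketch: 4-bit quantised base model + low-rank adapters.
# Hyperparameters below are illustrative assumptions, not the paper's values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "BioMistral/BioMistral-7B"

# 4-bit NF4 quantisation of the frozen base weights: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)  # used to tokenise the TB corpus
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Trainable low-rank adapters attached to the attention projections.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only adapter weights are trainable
```

Training would then proceed with a standard supervised fine-tuning loop (for example `transformers`' Trainer) over the domain corpus: the 4-bit base weights stay frozen while only the adapter weights are updated, which is what keeps QLoRA feasible on modest hardware.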
Similar Articles
@cjzafir: VLMs (Vertical Language Models) are beating top LLMs. These small 7B to 15B niche-focused models are beating SoTA model…
The author demonstrates that small vertical language models (6B-15B) can outperform top LLMs on niche benchmarks through cost-effective fine-tuning using open-source models and Codex orchestration, achieving results with a $300 dataset.
Can Language Models Identify Side Effects of Breast Cancer Radiation Treatments?
This paper presents a deployment-oriented stress-testing framework to evaluate how well large language models identify side effects of breast cancer radiation treatments. The study highlights limitations in LLM reliability, such as sensitivity to minor documentation changes and under-recall of rare side effects, suggesting that grounding outputs in clinician-curated lists improves robustness.
LLiMba: Sardinian on a Single GPU -- Adapting a 3B Language Model to a Vanishing Romance Language
The article introduces LLiMba, a 3B parameter model adapted from Qwen2.5 for Sardinian using continued pretraining and supervised fine-tuning on a single consumer GPU. It evaluates various LoRA configurations, finding that adapter capacity significantly impacts performance and factual accuracy in low-resource language adaptation.
MedAction: Towards Active Multi-turn Clinical Diagnostic LLMs
This paper introduces MedAction, a framework for training LLMs on active, multi-turn clinical diagnosis by simulating iterative test ordering and hypothesis updates. It presents a new dataset, MedAction-32K, and demonstrates state-of-the-art performance for open-source models on medical benchmarks.
Injecting Structured Biomedical Knowledge into Language Models: Continual Pretraining vs. GraphRAG
This paper compares two strategies for injecting structured biomedical knowledge from the UMLS Metathesaurus into language models: continual pretraining (embedding knowledge into model parameters) and GraphRAG (querying a knowledge graph at inference time). Results show improvements on biomedical QA benchmarks, with GraphRAG on LLaMA 3-8B yielding gains of over 3 accuracy points on PubMedQA and over 5 on BioASQ, without any retraining.
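As a rough illustration of the "query a knowledge graph at inference time" pattern mentioned above, the sketch below retrieves one-hop relation triples for entities found in a question and prepends them to the prompt. It is not the Microsoft GraphRAG library or the UMLS pipeline from that paper: the toy graph, keyword-based entity matching, and prompt template are all assumptions, using only `networkx`.

```python
# Illustrative graph-grounded retrieval at inference time (toy example).
import networkx as nx

# Toy knowledge graph: nodes are biomedical entities, edges carry a relation label.
kg = nx.Graph()
kg.add_edge("tuberculosis", "rifampicin", relation="treated_with")
kg.add_edge("tuberculosis", "Mycobacterium tuberculosis", relation="caused_by")
kg.add_edge("rifampicin", "hepatotoxicity", relation="may_cause")

def retrieve_facts(question: str, graph: nx.Graph) -> list[str]:
    """Return one-hop relation triples for graph entities mentioned in the question."""
    mentioned = [n for n in graph.nodes if n.lower() in question.lower()]
    facts = []
    for entity in mentioned:
        for neighbor in graph.neighbors(entity):
            rel = graph.edges[entity, neighbor].get("relation", "related_to")
            facts.append(f"{entity} -[{rel}]-> {neighbor}")
    return facts

question = "What drug is tuberculosis treated with, and what are its risks?"
context = "\n".join(retrieve_facts(question, kg))
prompt = f"Answer using the facts below.\n\nFacts:\n{context}\n\nQuestion: {question}"
print(prompt)  # this prompt would then be passed to the language model
```

The point of the pattern is that the model's parameters never change: grounding comes entirely from the retrieved triples placed in the prompt, in contrast to continual pretraining, which bakes the knowledge into the weights.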