CBRS: Cognitive Blood Request System with Bilingual Dataset and Dual-Layer Filtering for Multi-Platform Social Streams
Summary
Researchers from Bangladesh University of Engineering and Technology present CBRS, a multi-platform framework that filters and parses blood donation requests from social media using a dual-layer architecture and a novel 11K bilingual dataset in Bengali and English. Their LoRA fine-tuned Llama-3.2-3B model achieves 99% filtering accuracy and 92% zero-shot parsing accuracy, outperforming GPT-4o-mini and other LLMs with 35× reduced token usage.
# CBRS: Cognitive Blood Request System with Bilingual Dataset and Dual-Layer Filtering for Multi-Platform Social Streams
Source: [https://arxiv.org/html/2604.16665](https://arxiv.org/html/2604.16665)
Anik Saha¹\*, Mst. Fahmida Sultana Naznin¹\*, Zia Ul Hassan Abdullah¹, Anisa Binte Asad¹, K. G. Subarno Bithi¹, A. B. M. Alim Al Islam¹
¹Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
aaniksahaa.2001@gmail.com, nazninfahmidasultana@gmail.com, 2005037@ugrad.cse.buet.ac.bd, anisabinteasad134@gmail.com, 2011013@mme.buet.ac.bd, alim_razi@cse.buet.ac.bd
\*These authors contributed equally and are listed alphabetically.
###### Abstract
Urgent blood donation seeking posts and messages on social media often go unnoticed due to the overwhelming volume of daily communications. Traditional app-based systems, reliant on manual input, struggle to reach users in low-resource settings, delaying critical responses. To address this, we introduce the Cognitive Blood Request System (CBRS), a multi-platform framework that efficiently filters and parses blood donation requests from social media streams using a cost-efficient dual-layered architecture. To do so, we curate a novel dataset of 11K parsed blood donation request messages in Bengali, English, and transliterated Bengali, capturing the linguistic diversity of real social media communications. The inclusion of adversarial negatives further enhances the robustness of our model. CBRS achieves an impressive 99% accuracy and precision in filtering, surpassing benchmark methods. In the parsing task, our LoRA-finetuned Llama-3.2-3B model achieves 92% zero-shot accuracy, surpassing the base model by 41.54% and exceeding the few-shot performance of GPT-4o-mini, Gemini-2.0-flash, and other LLMs, while resulting in a 35× reduction in input token usage. This work lays a robust foundation for scalable, inclusive information extraction in time-sensitive, object-focused tasks. Our code, dataset, and trained models are publicly available at [https://github.com/aaniksahaa/CBRS](https://github.com/aaniksahaa/CBRS).
## 1Introduction
In the digital era, social networking sites (SNSs) have fueled the rapid growth of online communities, with millions of posts shared daily Auxier et al. ([2021](https://arxiv.org/html/2604.16665#bib.bib1)). Amid emergencies, users increasingly rely on these platforms to broadcast urgent blood donation needs, seeking to connect with potential donors Alanzi and Alsaeed ([2019](https://arxiv.org/html/2604.16665#bib.bib3)). However, without efficient automated systems, such posts often remain buried within users' immediate social circles, limiting their reach Mathur et al. ([2018](https://arxiv.org/html/2604.16665#bib.bib57)). The unstructured and scattered nature of social media communications poses significant challenges for extracting critical information and efficiently disseminating these requests Abbasi et al. ([2018](https://arxiv.org/html/2604.16665#bib.bib2)); Xu et al. ([2022](https://arxiv.org/html/2604.16665#bib.bib5)).
Figure 1: Bilingual parsing methodology from Bengali-English-Transliterated Bengali blood request corpora.

A key limitation in filtering and parsing such messages in a multilingual setting lies in the limited availability of datasets for low-resource languages such as Bengali. Most state-of-the-art Natural Language Processing architectures rely on large-scale annotated corpora, which are scarcely available for low-resource languages Peters and others ([2019](https://arxiv.org/html/2604.16665#bib.bib75)). These languages often feature complex morphosyntactic structures, diverse dialectal variations, and unique linguistic phenomena, as shown in Figure [1](https://arxiv.org/html/2604.16665#S1.F1), that are underrepresented in existing multilingual pre-trained models, limiting effective generalization and transfer learning Peters and others ([2019](https://arxiv.org/html/2604.16665#bib.bib75)). Although there are datasets for the classification of disaster and emergency requests Mathur et al. ([2018](https://arxiv.org/html/2604.16665#bib.bib57)); Alam et al. ([2021](https://arxiv.org/html/2604.16665#bib.bib58)), none specifically include Bengali or transliterated Bengali. To our knowledge, we introduce the first bilingual dataset comprising blood donation requests in English, Bengali, and transliterated Bengali. Figure [2](https://arxiv.org/html/2604.16665#S1.F2) shows the word cloud of our dataset.
Developing a reliable solution for accurate detection and effective dissemination of emergency blood donation requests to potential donors poses several critical challenges. Firstly, the volume of incoming messages and social media posts is often overwhelming, yet only a small fraction of these messages represent actual blood donation requests. Furthermore, when classifying such requests, false negatives are much more detrimental than false positives, since the former implies ignoring an urgent request while the latter only adds a little more load to subsequent processing. Although there is existing work on disaster- and emergency-related message classification Le ([2022](https://arxiv.org/html/2604.16665#bib.bib94)); Powers et al. ([2023](https://arxiv.org/html/2604.16665#bib.bib95)); Shukhman and Shukhman ([2022](https://arxiv.org/html/2604.16665#bib.bib96)), it often overlooks this asymmetric nature of the problem. Secondly, merely detecting whether a message is asking for blood donation is insufficient to determine which donors to notify to maximize the likelihood of a rapid response. Automated parsing of such free-form texts is essential to extract the key information in a structured format. However, previous studies have focused mainly on detection Cheng et al. ([2024](https://arxiv.org/html/2604.16665#bib.bib97)); Wan et al. ([2024](https://arxiv.org/html/2604.16665#bib.bib98)), leaving a gap in designing efficient and scalable parsing solutions. Thirdly, for such a system to be viable in real-world deployment, it must balance speed and accuracy, which present conflicting design constraints. For instance, a naively trained lightweight Machine Learning (ML) model for classification risks a higher number of false negatives, while using a large language model (LLM) entirely for this classification task would not be scalable due to high inference times and costs given the volume of incoming data.
To address these challenges, we propose a cost\-efficient dual\-layered filtering architecture to identify blood donation requests from large message pools effectively, coupled with a cost\-efficient LLM for rapid and accurate parsing of free\-form text requests into a predefined structured format\. Our key contributions are as follows:
- We present a novel parsed bilingual dataset consisting of 11K Bengali-English-Transliterated Bengali blood donation requests sourced from social media. This dataset is further enriched with curated adversarial negatives and fragments from publicly available datasets.
- We present the Cognitive Blood Request System (CBRS), which integrates a cost-efficient dual-layered filtering architecture designed to efficiently detect blood donation requests, taking into account the asymmetric class weighting.
- We train a LoRA-finetuned Llama-3.2-3B model for parsing and compare its performance with other open- and closed-weight LLMs in zero-shot and few-shot settings.
- We benchmark CBRS against existing filtering and parsing methods in terms of both performance and computational complexity. In a separate human evaluation study across 30 active Telegram and Discord groups with diverse demographics, we assess the real-world effectiveness of our approach and identify the key factors influencing user satisfaction.
Figure 2: Word cloud of top keywords in the CBRS dataset: (a) Bengali, (b) English, (c) Transliterated Bengali.

Figure 3: Data sourcing process of CBRS: positive samples are collected from Facebook, EBDR-Twitter, and Telegram, followed by cleaning and augmentation with negative samples from BanglaNMT, BanglaTLit, EBDR-Twitter, Facebook, and curated adversarial samples.
## 2Related Work
#### Information Extraction from Social Media
Social media is vital for real-time updates during emergencies, but its unstructured and noisy nature makes extracting actionable insights difficult. Recent advances in AI and NLP, especially LLMs, offer promising solutions. Marozzo et al. used LLMs to classify disaster-related content by emotion, sentiment, and topic, generating stakeholder-specific summaries Marozzo ([2025](https://arxiv.org/html/2604.16665#bib.bib99)). He and Hu developed an AI system combining NLP and geospatial visualization for effective monitoring He and Hu ([2025](https://arxiv.org/html/2604.16665#bib.bib100)). Yin et al. proposed CrisisSense-LLM for multi-label classification of event type, informativeness, and aid relevance Yin and others ([2024](https://arxiv.org/html/2604.16665#bib.bib102)). Shetty et al. achieved over 91% accuracy using multimodal learning on social media text and images Shetty and others ([2024](https://arxiv.org/html/2604.16665#bib.bib103)). Hu et al. introduced a geo-knowledge-guided GPT for location extraction, outperforming traditional NER by 40% Hu et al. ([2023](https://arxiv.org/html/2604.16665#bib.bib104)). Alharbi and Haq applied DistilBERT for tweet classification, with 92.42% training and 82.11% validation accuracy Alharbi and Haq ([2024](https://arxiv.org/html/2604.16665#bib.bib105)). Mehmood et al. proposed a three-step method for classifying relevant posts, extracting locations, and topic modeling with high F1-scores Mehmood et al. ([2024](https://arxiv.org/html/2604.16665#bib.bib106)). However, specific extraction of blood-related requests remains largely unexplored.
#### Low\-Resource Language Dataset Curation
Information extraction using LLMs is increasingly applied in disaster response. However, in low-resource languages like Bengali, curated and task-specific datasets are so scarce Hasan et al. ([2020](https://arxiv.org/html/2604.16665#bib.bib55)); Fahim et al. ([2024](https://arxiv.org/html/2604.16665#bib.bib81)) that they remain a major bottleneck. Mathur et al. ([2020](https://arxiv.org/html/2604.16665#bib.bib112)) proposed a system to identify emergency blood donation requests on Twitter, highlighting the potential of social media mining for critical healthcare interventions. CrisisBench Doe and Smith ([2023](https://arxiv.org/html/2604.16665#bib.bib113)) aggregates past disaster datasets into a unified benchmark for informativeness and urgency prediction. CrisisMMD Alam et al. ([2018](https://arxiv.org/html/2604.16665#bib.bib114)), an early multimodal dataset, integrates text and images from Twitter for disaster classification. For Bangla, Saha et al. ([2025](https://arxiv.org/html/2604.16665#bib.bib115)) introduced a disaster dataset covering floods and cyclones, addressing informativeness and urgency in code-mixed, low-resource settings. Bengali.AI Chowdhury et al. ([2020](https://arxiv.org/html/2604.16665#bib.bib107)) and AI4D Team ([2022](https://arxiv.org/html/2604.16665#bib.bib108)) contributed handwritten and speech-text corpora, while BNLPBench Rahman et al. ([2023](https://arxiv.org/html/2604.16665#bib.bib109)) established Bangla benchmarks for NER, sentiment, and classification. Khandaker et al. ([2022](https://arxiv.org/html/2604.16665#bib.bib110)) built a Bangla COVID-19 rumor dataset, and Roy et al. ([2022](https://arxiv.org/html/2604.16665#bib.bib111)) developed BanglaLark, a lightweight BERT model for disaster-related classification. These resources aid multilingual crisis AI, though blood-related Bengali datasets are still missing.
## 3Dataset
To overcome the limitations of current Bengali transliteration datasets, our design centers on two key goals: developing a Bengali-English-Transliterated Bengali corpus for blood donation requests, and capturing the diverse texting styles in social media groups, including dialectal variations, slang, and abbreviations, which together build a rich understanding of how language evolves in online communication.
### 3\.1Data Sourcing
We source Bengali, English, and Transliterated Bengali messages from 15 public blood donation groups on Telegram and Facebook. In total, we present a dataset of 11K parsed emergency blood donation requests, as shown in Table [1](https://arxiv.org/html/2604.16665#S3.T1).
### 3\.2Data Cleaning
After aggregating the data sources, we conduct extensive deduplication and also detect samples that are not directly associated with blood donation requests. Certain messages, such as expressions of willingness to donate (e.g., "I can donate A- blood in Dhaka. Please contact me if you're a recipient") or post-donation acknowledgments, do not represent actual requests despite being structurally similar to positive instances. We classify these as hard negatives: non-relevant samples that closely mirror the linguistic and contextual patterns of true positives. Since these samples may particularly introduce semantic ambiguity, we keep them in the negative portion of the dataset to improve the robustness of the classifier.
Table 1: Sample distribution across different sources

| Category | Source | Total Samples | Total Tokens | Average Tokens |
|---|---|---|---|---|
| Positive | Facebook | 6321 | 1747772 | 276.50 |
| | EBDR-Twitter | 3941 | 169290 | 42.96 |
| | Telegram | 744 | 139948 | 188.10 |
| | **Total** | **11006** | **2057010** | – |
| Negative | BengaliNMT | 3194 | 236220 | 73.96 |
| | BengaliTLit | 5000 | 773058 | 154.61 |
| | Curated-Adversarial | 600 | 26211 | 43.69 |
| | Facebook | 250 | 92262 | 369.05 |
| | EBDR-Twitter | 5851 | 222568 | 38.04 |
| | **Total** | **14895** | **1350319** | – |
### 3\.3Negative Data Augmentation
The dataset includes both positive (1: blood donation needed) and negative (0: not related) samples, carefully labeled for classification. We leverage Bengali and English texts from BengaliNMT Hasan et al. ([2020](https://arxiv.org/html/2604.16665#bib.bib55)) and Bengali and Transliterated Bengali texts from BengaliTLit Fahim et al. ([2024](https://arxiv.org/html/2604.16665#bib.bib81)). The hard negatives that were manually filtered out in the previous phase are included in the negative portion. We also include curated adversarial examples containing terms like "blood", "urgent", and "emergency" to enhance robustness. These adversarial samples are generated with Deepseek-V3 using the aforementioned hard-negative samples for few-shot prompting. We obtain a portion of negative samples from the EBDR dataset as well. An overview is provided in Table [1](https://arxiv.org/html/2604.16665#S3.T1). Table [2](https://arxiv.org/html/2604.16665#S3.T2) summarizes the Bengali, English, and Transliterated sample distribution across both categories. Figure [3](https://arxiv.org/html/2604.16665#S1.F3) shows the workflow of data curation.
Table 2: Sample distribution across different languages

| Category | Language | Total Samples | Total Tokens | Average Tokens |
|---|---|---|---|---|
| Positive | Bengali | 6163 | 1829929 | 296.92 |
| | English | 4412 | 197030 | 44.66 |
| | Transliterated | 431 | 30051 | 69.72 |
| Negative | Bengali | 4420 | 893583 | 202.17 |
| | English | 7663 | 264333 | 34.49 |
| | Transliterated | 2812 | 192403 | 68.42 |
## 4Methodology
Messages circulating on SNSs for blood donations are often unstructured, which complicates the automation of donor matching based on complex criteria and impedes rapid responses. Due to social media clustering, these requests typically reach a limited audience, with potential donors frequently overlooking them amidst vast content. To address these challenges, our methodology incorporates three components: 1) a cost-optimized dual-layered filtering system for detecting blood-related requests in groups, 2) structured message parsing with few-shot prompting, and 3) efficient donor notification using a multi-platform control system based on geo-location.
### 4\.1Proposed Dual Layered Filtering \(DLF\)
#### Layer 1: TF-IDF and Asymmetrically Weighted LogReg Classifier
This model is designed to detect blood requests written in Bengali, English, and Transliterated Bengali, handling bilingual and mixed\-language texts, which are critical for the linguistic diversity in our dataset\. The architecture of our model follows a systematic transformation of raw text into a binary classification decision for filtration of blood requests from extensive streams\.
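As a minimal illustration of this layer's core idea, and not our exact training setup, TF-IDF features can feed a logistic regression whose positive class carries roughly the same 12× weight we use in our loss; the toy messages and the character n-gram choice below are placeholder assumptions:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data standing in for the CBRS corpus (label 1 = blood request).
texts = [
    "Urgent! A+ blood needed at Dhaka Medical College, 2 bags, contact 017...",
    "O- blood needed tomorrow morning for surgery, Chittagong",
    "Selling a used laptop, urgent sale, contact me",
    "Happy birthday! Hope you have a great day",
]
labels = [1, 1, 0, 0]

# Character n-grams cope with Bengali script, English, and transliteration alike;
# class_weight penalizes missed positives ~12x more, echoing alpha = 12 in our loss.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(class_weight={0: 1.0, 1: 12.0}, max_iter=1000),
)
clf.fit(texts, labels)

print(clf.predict(["B+ blood urgently needed at Square Hospital, 1 bag"]))
```

Character-level n-grams are one simple way to sidestep script-specific tokenization when the stream mixes three writing systems.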
#### Subword Tokenization and Embedding
An input message $M=(w_1, w_2, \ldots, w_T)$, where $w_t$ represents the $t$-th word in the message, is processed as follows: each word $w_t$ is decomposed into subword units $S(w_t)=\{s_1, s_2, \ldots, s_N\}$ to capture linguistic variations. Here, $S(w_t)$ denotes the set of subwords corresponding to word $w_t$. The subwords are mapped to dense embeddings using an embedding matrix $E \in \mathbb{R}^{d \times |V|}$, a trainable matrix of dimension $d$ (embedding size) and vocabulary size $|V|$:

$$\mathbf{v}_{s_i} = E \cdot \mathbf{1}_{s_i}$$

where $\mathbf{1}_{s_i}$ is the one-hot encoding of subword $s_i$. The final word embedding is obtained by averaging over its subwords:

$$\mathbf{v}_{w_t} = \frac{1}{|S(w_t)|} \sum_{s \in S(w_t)} \mathbf{v}_s \qquad (1)$$

where $\mathbf{v}_s$ is the embedding vector of subword $s$.
#### Message Representation and Feature Extraction
To derive a fixed-length message representation, we apply average pooling over all word embeddings:

$$\mathbf{V} = \frac{1}{T} \sum_{t=1}^{T} \mathbf{v}_{w_t} \qquad (2)$$

where $T$ denotes the total number of words in the message. This vector $\mathbf{V}$ is then processed through a fully connected layer for feature extraction: $\mathbf{z} = W\mathbf{V} + b$, where $W \in \mathbb{R}^{m \times d}$ is a weight matrix, $b \in \mathbb{R}^m$ is a bias term, and $\mathbf{z}$ represents the transformed feature vector of dimension $m$.
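The pipeline of Eqs. (1)–(2) followed by the linear feature map can be sketched numerically; the dimensions and random matrices below are illustrative, not trained values:

```python
import numpy as np

rng = np.random.default_rng(0)

d, vocab_size = 8, 100                  # embedding size d, vocabulary |V|
E = rng.normal(size=(d, vocab_size))    # embedding matrix E (trainable in practice)

def word_embedding(subword_ids):
    # Eq. (1): average the embeddings of a word's subword units.
    return E[:, subword_ids].mean(axis=1)

def message_embedding(message_subword_ids):
    # Eq. (2): average pooling over all word embeddings gives V.
    return np.stack([word_embedding(ids) for ids in message_subword_ids]).mean(axis=0)

# A 3-word message; each word is a list of subword vocabulary indices.
msg = [[1, 5], [7], [2, 3, 9]]
V = message_embedding(msg)

# Feature extraction: z = W V + b, mapping to m features.
m = 4
W, b = rng.normal(size=(m, d)), rng.normal(size=m)
z = W @ V + b
print(V.shape, z.shape)  # (8,) (4,)
```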
#### Binary Classification
Finally, a softmax layer is employed to predict the binary class, i.e., whether the message is related to a blood donation request ($y=1$) or not ($y=0$). The probability distribution over classes is computed as:

$$P(y=c \mid \mathbf{z}) = \frac{\exp(z_c)}{\sum_{j}\exp(z_j)} \qquad (3)$$

where $z_c$ is the logit corresponding to class $c$.
To address the high cost of false negatives in emergency blood donation request detection, we adopt a weighted binary cross\-entropy loss that penalizes misclassified positive examples more heavily\. This asymmetry ensures the model prioritizes recall in the first layer\. The loss function is defined as:
$$\mathcal{L} = -\alpha\, y \log P(y=1 \mid \mathbf{z}) - (1-y)\log P(y=0 \mid \mathbf{z})$$

where $y \in \{0,1\}$ is the true label, and we empirically choose $\alpha = 12$.
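A small numeric sketch of this loss (with illustrative predicted probabilities) shows the asymmetry: at the same confidence, a missed request costs twelve times a false positive:

```python
import numpy as np

ALPHA = 12.0  # penalty multiplier on missed positives (our empirical choice)

def weighted_bce(y, p1):
    """Asymmetric binary cross-entropy.
    y is the true label in {0, 1}; p1 = P(y=1 | z) from the softmax head."""
    p1 = np.clip(p1, 1e-12, 1 - 1e-12)  # numerical safety
    return -ALPHA * y * np.log(p1) - (1 - y) * np.log(1 - p1)

# A missed blood request (y=1 but model says 0.1) is charged 12x more than a
# false positive at the mirrored confidence (y=0 but model says 0.9).
miss_positive = weighted_bce(1, 0.1)    # 12 * -log(0.1)
false_positive = weighted_bce(0, 0.9)   # -log(0.1)
print(miss_positive / false_positive)   # 12.0
```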
This asymmetric weighting, however, increases the number of false positives in the first phase\. But since the overall fraction of blood donation requests in a general message pool is usually low, it adds negligible overhead to the subsequent phase\.
Figure 4: Dual-layered filtering and structured parsing architecture of CBRS, where raw messages undergo tokenization, pooling, and classification, followed by LLM-based filtering and structured parsing.
#### Layer 2: GPT\-Based Blood Donation Message Classifier
We utilize GPT-4o-mini to further filter out non-blood-donation-related messages and ensure only relevant positive messages are allowed An et al. ([2024](https://arxiv.org/html/2604.16665#bib.bib34)). However, this does not introduce any additional cost, since it is carried out in the same API call that is used for parsing in the subsequent phase. Figure [4](https://arxiv.org/html/2604.16665#S4.F4) presents the overall architecture of the dual-layered filtering.
DLF Layer 1 serves as a lightweight binary classifier that independently filters incoming messages to determine whether they are related to blood requests. This layer is optimized for speed and resource efficiency, ensuring that only relevant messages proceed further in the pipeline. DLF Layer 2, powered by an LLM, operates as an independent layer that performs secondary classification to reduce false negatives and conducts detailed parsing on messages identified as blood-related by the first layer. By employing this two-tier architecture, we significantly reduce unnecessary API calls to the LLM, thereby optimizing both cost and performance, as shown in Figure [5](https://arxiv.org/html/2604.16665#S4.F5).
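The control flow of the two tiers can be sketched as below; `layer1_predict` and `llm_filter_and_parse` are hypothetical stubs standing in for the weighted classifier and the GPT-4o-mini call, not our implementation:

```python
def layer1_predict(message: str) -> bool:
    # Recall-oriented stub for the lightweight Layer 1 classifier.
    keywords = ("blood", "rokto")
    return any(k in message.lower() for k in keywords)

def llm_filter_and_parse(message: str):
    # Stub for the Layer 2 LLM call, which re-filters and parses in one shot.
    text = message.lower()
    if "need" not in text and "urgent" not in text:
        return None  # corresponds to FLAG_negative
    return {"blood_group": "A+", "location": "Dhaka"}  # illustrative parse

def process_stream(messages):
    llm_calls, results = 0, []
    for msg in messages:
        if not layer1_predict(msg):  # Layer 1 drops obvious negatives cheaply
            continue
        llm_calls += 1               # only survivors incur an API call
        parsed = llm_filter_and_parse(msg)
        if parsed is not None:
            results.append(parsed)
    return llm_calls, results

calls, parsed = process_stream([
    "Need A+ blood urgently at Dhaka Medical",
    "Selling concert tickets",            # rejected by Layer 1, no API cost
    "Blood moon photos from last night",  # passes Layer 1, rejected by Layer 2
])
print(calls, len(parsed))  # 2 1
```

Since only Layer-1 survivors reach the LLM, the API cost scales with the (small) positive fraction of the stream rather than its total volume.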
Figure 5: Two-layer DLF framework, with Layer 1 identifying blood-related messages and Layer 2 employing an LLM for detailed filtering and structured parsing.
### 4\.2Structured Parsing with Few\-shot Prompting
After detecting a message requesting blood donation, we first parse it into a predefined structure to extract key information. Formally, each message is parsed into a structured object $M_p$ with fields such as `blood_group`, `bags_needed`, `patient`, `condition`, `location`, `hospital_name`, `location_markers`, `probable_day`, `probable_time`, `contacts`, and `compensation`, ensuring that all critical elements are captured for efficient processing. For the parsing task, we finetune the Llama-3.2-3B model using LoRA on a split of our parsing dataset. When testing other LLMs, to further enhance parsing precision, we apply the technique of few-shot prompting Reynolds and McDonell ([2021](https://arxiv.org/html/2604.16665#bib.bib35)). In this approach, the model is exposed to a small number of examples, specifically three positive examples and two negative examples, to guide its predictions. For a positive example message $M_p$, which is relevant to blood donation, the model is expected to output the parsed information in a structured JSON format: $P(M_p) = \text{JSON(properties)}$, where $P(M_p)$ is the parsed JSON output containing the necessary details for blood donation. For a negative example message $M_n$, which is unrelated to blood donation, the model is expected to flag it as irrelevant: $P(M_n) = \text{FLAG}_{\text{negative}}$, indicating that the message does not pertain to blood donation.
$$P(M_{\text{new}}) = \begin{cases} \text{JSON(properties)} & \text{if relevant} \\ \text{FLAG}_{\text{negative}} & \text{if irrelevant} \end{cases}$$
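For illustration, a plausible input/output pair under this scheme might look as follows; the field set matches the schema above, but the message and values are invented for this sketch:

```python
import json

# Invented request text; "01XXXXXXXXX" is a placeholder phone number.
message = ("Urgent: 2 bags of O+ blood needed tomorrow at Dhaka Medical "
           "College Hospital for a dengue patient. Contact 01XXXXXXXXX.")

# Target structured object M_p; unknown fields are left null.
parsed = {
    "blood_group": "O+",
    "bags_needed": 2,
    "patient": None,
    "condition": "dengue",
    "location": "Dhaka",
    "hospital_name": "Dhaka Medical College Hospital",
    "location_markers": ["Dhaka Medical College"],
    "probable_day": "tomorrow",
    "probable_time": None,
    "contacts": ["01XXXXXXXXX"],
    "compensation": None,
}

print(json.dumps(parsed, indent=2))
```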
## 5Experimental Setup
### 5\.1Classifier
We train and compare multiple lightweight machine learning classifiers on text embeddings generated using different methods. For BERT-based classifiers such as DistilBERT and MobileBERT, we perform end-to-end training using 3 epochs and a batch size of 2. All embedding generation, training, and evaluation tasks were carried out on a 2×T4 GPU cluster hosted on [Kaggle](https://www.kaggle.com/). The models are evaluated based on standard metrics: precision, recall, accuracy, and F1-score. The performance comparison of these first-layer classifiers is reported accordingly.
### 5\.2Parser
To manage inference costs while maintaining evaluation fidelity, we conduct parsing experiments on a stratified random sample of 958 blood donation request texts. This subset includes 329 English, 381 Bengali, and 248 transliterated Bengali messages. First, to generate a gold set of parsed JSON objects corresponding to the texts, annotations are initially generated using few-shot prompting with the DeepSeek-V3 model. Subsequently, the text-annotation pairs are distributed among five human annotators to ensure robust evaluation. Each sample is assigned to three annotators, who independently assess the correctness and provide a binary verdict (agreement or disagreement). A sample is re-annotated by human annotators if the majority of the assigned annotators disagree with the initial annotation. We evaluate a range of LLMs in both zero-shot and few-shot settings. Parsing accuracy is reported using a weighted score, with 20% weight on tree edit distance and the rest on field-level accuracy. To calculate the tree edit distance, we utilize the `zss` Python library. For full-precision inference with open-weight models, we utilize the [Together AI API](https://www.together.ai/) and [OpenRouter API](https://openrouter.ai/) based on availability. For models from OpenAI, we use the official [OpenAI API](https://platform.openai.com/). The LLM decoding parameters for both zero-shot and few-shot inference during parsing were: temperature = 0.7, top_p = 0.8, top_k = 35.
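The weighted score can be sketched dependency-free as below: field-level accuracy carries 80% of the weight, and a simple key-overlap similarity stands in here for the normalized `zss` tree-edit-distance component used in the actual evaluation (the exact normalization is an assumption of this sketch):

```python
def field_accuracy(gold: dict, pred: dict) -> float:
    # Fraction of gold fields whose predicted value matches exactly.
    return sum(pred.get(k) == v for k, v in gold.items()) / len(gold)

def structure_similarity(gold: dict, pred: dict) -> float:
    # Simplified stand-in for a normalized tree edit distance (zss in the paper):
    # Jaccard overlap of the top-level key sets.
    g, p = set(gold), set(pred)
    return len(g & p) / len(g | p) if g | p else 1.0

def parsing_score(gold: dict, pred: dict) -> float:
    # 80% field-level accuracy + 20% structural similarity.
    return 0.8 * field_accuracy(gold, pred) + 0.2 * structure_similarity(gold, pred)

gold = {"blood_group": "O+", "bags_needed": 2, "location": "Dhaka", "contacts": ["017..."]}
pred = {"blood_group": "O+", "bags_needed": 2, "location": "Dhaka"}  # missed contacts

print(round(parsing_score(gold, pred), 3))  # 0.75
```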
We also finetune the Llama-3.2-3B model using LoRA ($r=32$, $\alpha=16$) with 4-bit integer quantization, dropout = 0.05, batch size = 2, 5 epochs, and learning rate = $2 \times 10^{-4}$. We use an 80:10:10 split for train, test, and validation. 0.81% of the total 3B parameters are thereby trained on 7.9K paired text and parsed JSON samples. We use a 2×T4 GPU cluster hosted on [Kaggle](https://www.kaggle.com/) to carry out the finetuning.
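What LoRA actually trains can be sketched in plain numpy: the frozen weight $W$ is augmented by a low-rank update $(\alpha/r)\,BA$, and only $A$ and $B$ are updated. The layer dimensions below are illustrative, not Llama-3.2-3B's:

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 512, 512, 32, 16     # our r and alpha; dims illustrative
W = rng.normal(size=(d_out, d_in))           # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01        # trainable down-projection
B = np.zeros((d_out, r))                     # trainable up-projection, zero-init
                                             # so the update starts at zero

def adapted_forward(x):
    # LoRA forward: W x + (alpha / r) * B (A x), touching only A and B in training.
    return W @ x + (alpha / r) * (B @ (A @ x))

# Trainable fraction for this single layer: (r*d_in + d_out*r) / (d_out*d_in).
trainable = A.size + B.size
frac = trainable / W.size
print(f"{frac:.2%}")  # 12.50% for this toy layer; across all of Llama-3.2-3B,
                      # with LoRA on selected projections only, we report 0.81%
```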
## 6Results and Discussion
#### DLF Outperforms Other Lightweight Classifiers in Accuracy and Efficiency
We compare DLF with a diverse range of embedding- and classifier-based state-of-the-art models for message filtering in Table [3](https://arxiv.org/html/2604.16665#S6.T3). We experiment with feature extraction methods, including traditional TF-IDF Salton ([1988](https://arxiv.org/html/2604.16665#bib.bib62)) and CountVectorizer (Count) Manning and Schütze ([1999](https://arxiv.org/html/2604.16665#bib.bib63)), followed by classifiers such as Logistic Regression (LogReg) Freund and Schapire ([1999](https://arxiv.org/html/2604.16665#bib.bib64)), Support Vector Machine (SVM) Cortes and Vapnik ([1995](https://arxiv.org/html/2604.16665#bib.bib65)), Random Forest (RF) Breiman ([2001](https://arxiv.org/html/2604.16665#bib.bib66)), and Naive Bayes (NB) McCallum and Nigam ([1998](https://arxiv.org/html/2604.16665#bib.bib67)). Additionally, we test various pre-trained embeddings with these classifiers, such as Word2Vec (W2V) Mikolov et al. ([2013](https://arxiv.org/html/2604.16665#bib.bib68)), MiniLM-L6-V2 (MiniLM6) and MiniLM-L12-V2 (MiniLM12) Wang et al. ([2020](https://arxiv.org/html/2604.16665#bib.bib69)), lightweight transformer models for general-purpose sentence embeddings, Paraphrase-MiniLM-L12-v2 (ParaMiniLM) Reimers and Gurevych ([2019](https://arxiv.org/html/2604.16665#bib.bib70)), DistilUSE Reimers and Gurevych ([2019](https://arxiv.org/html/2604.16665#bib.bib70)), E5-Small Wang et al. ([2022](https://arxiv.org/html/2604.16665#bib.bib71)), LaBSE Feng et al. ([2020](https://arxiv.org/html/2604.16665#bib.bib72)), Jina Embeddings-V2 (JinaEmb) AI ([2023a](https://arxiv.org/html/2604.16665#bib.bib73)), and BAAI General Embeddings (BGE) Li et al. ([2023](https://arxiv.org/html/2604.16665#bib.bib74)).
We also explore end-to-end training of BERT-based classifiers such as DistilBERT Sanh et al. ([2019](https://arxiv.org/html/2604.16665#bib.bib116)) and MobileBERT Sun et al. ([2020](https://arxiv.org/html/2604.16665#bib.bib117)). As shown in Table [3](https://arxiv.org/html/2604.16665#S6.T3), DLF either outperforms or matches the accuracy of other classifiers, while providing the fastest inference.
Table 3: Comparative performance of filtering methods, where DLF consistently matches or outperforms others across accuracy metrics while achieving the lowest inference time.

| Embedding | Classifier | Accuracy | Precision | Recall | F1-Score | Inference Time ×10⁻⁷ (s) |
|---|---|---|---|---|---|---|
| TFIDF | LogReg | 0.98 | 0.98 | 0.98 | 0.98 | 1.25 |
| TFIDF | SVM | 0.98 | 0.98 | 0.98 | 0.98 | 2355.85 |
| TFIDF | RF | 0.98 | 0.98 | 0.98 | 0.98 | 347.45 |
| TFIDF | NB | 0.97 | 0.97 | 0.97 | 0.97 | 2.06 |
| Count | LogReg | 0.98 | 0.98 | 0.98 | 0.98 | 1.19 |
| Count | SVM | 0.98 | 0.98 | 0.98 | 0.98 | 1835.88 |
| Count | RF | 0.98 | 0.98 | 0.98 | 0.98 | 326.89 |
| Count | NB | 0.96 | 0.96 | 0.96 | 0.96 | 1.82 |
| W2V | LogReg | 0.83 | 0.80 | 0.87 | 0.81 | 10.87 |
| W2V | SVM | 0.82 | 0.80 | 0.86 | 0.81 | 9659.30 |
| W2V | RF | 0.83 | 0.80 | 0.87 | 0.81 | 156.89 |
| MiniLM6 | LogReg | 0.97 | 0.96 | 0.97 | 0.96 | 6.86 |
| MiniLM6 | SVM | 0.97 | 0.97 | 0.97 | 0.97 | 3387.89 |
| MiniLM6 | RF | 0.96 | 0.96 | 0.96 | 0.96 | 205.60 |
| MiniLM12 | LogReg | 0.97 | 0.96 | 0.97 | 0.97 | 6.80 |
| MiniLM12 | SVM | 0.97 | 0.97 | 0.97 | 0.97 | 3028.26 |
| MiniLM12 | RF | 0.97 | 0.96 | 0.97 | 0.96 | 195.58 |
| ParaMiniLM | LogReg | 0.97 | 0.97 | 0.97 | 0.97 | 17.86 |
| ParaMiniLM | SVM | 0.98 | 0.97 | 0.98 | 0.98 | 2541.56 |
| ParaMiniLM | RF | 0.97 | 0.97 | 0.97 | 0.97 | 187.06 |
| DistilUse | LogReg | 0.95 | 0.95 | 0.95 | 0.95 | 12.56 |
| DistilUse | SVM | 0.96 | 0.96 | 0.96 | 0.96 | 6075.95 |
| DistilUse | RF | 0.97 | 0.96 | 0.97 | 0.97 | 192.09 |
| E5-Small | LogReg | 0.98 | 0.98 | 0.98 | 0.98 | 5.94 |
| E5-Small | SVM | 0.98 | 0.98 | 0.98 | 0.98 | 1933.32 |
| E5-Small | RF | 0.98 | 0.98 | 0.98 | 0.98 | 186.73 |
| LaBSE | LogReg | 0.98 | 0.98 | 0.98 | 0.98 | 11.40 |
| LaBSE | SVM | 0.98 | 0.98 | 0.98 | 0.98 | 2940.37 |
| LaBSE | RF | 0.98 | 0.98 | 0.98 | 0.98 | 203.78 |
| JinaEmb | LogReg | 0.97 | 0.97 | 0.97 | 0.97 | 19.26 |
| BGE | LogReg | 0.97 | 0.97 | 0.97 | 0.97 | – |
| BGE | SVM | 0.97 | 0.97 | 0.97 | 0.97 | – |
| BGE | RF | 0.97 | 0.97 | 0.97 | 0.97 | – |
| DistilBERT | DistilBERT | 0.98 | 0.98 | 0.98 | 0.98 | 127816.15 |
| MobileBERT | MobileBERT | 0.98 | 0.98 | 0.97 | 0.97 | 169240.89 |
| DLF | – | 0.99 | 0.99 | 0.98 | 0.98 | 1.10 |
#### LoRA\-finetuned Lightweight Parser Outperforms Other Language Models
Table [4](https://arxiv.org/html/2604.16665#S6.T4) presents the accuracy score under few-shot and zero-shot prompting for various language models, such as Claude-3-haiku Anthropic ([2024](https://arxiv.org/html/2604.16665#bib.bib84)), Gemini-2.0 DeepMind ([2024](https://arxiv.org/html/2604.16665#bib.bib85)), Gemma-2-27B Google ([2024](https://arxiv.org/html/2604.16665#bib.bib86)), GPT-4o-mini OpenAI ([2024](https://arxiv.org/html/2604.16665#bib.bib87)), LLaMA-3.1-8B AI ([2024a](https://arxiv.org/html/2604.16665#bib.bib88)), Meta-LLaMA-3.2-3B AI ([2024b](https://arxiv.org/html/2604.16665#bib.bib89)), LLaMA-3.3-70B AI ([2024c](https://arxiv.org/html/2604.16665#bib.bib90)), Mistral-7B AI ([2023b](https://arxiv.org/html/2604.16665#bib.bib92)), Qwen-2.5-7B Cloud ([2024](https://arxiv.org/html/2604.16665#bib.bib91)), and our LoRA-finetuned Llama-3.2-3B model. While few-shot prompting understandably increases parsing performance, the finetuned model shows even higher accuracy with zero-shot prompting. Our LoRA-finetuned model achieves 92% zero-shot accuracy, surpassing the base model's zero-shot performance by 41.54% and exceeding the few-shot performance of GPT-4o-mini, Gemini-2.0-flash, and other LLMs. Claude-3-haiku, GPT-4o-mini, and Gemma-2-27B stand out in the few-shot setting with a score of 0.90 each. With zero-shot prompting, Gemma-2-27B and GPT-4o-mini also perform strongly, achieving scores of 0.89 and 0.88, respectively. Notably, Mistral-7B shows a severe drop from 0.81 to 0.02 when we shift from few-shot to zero-shot prompting, revealing limited generalization. In contrast, Gemma-2-27B, Gemini-2.0, and GPT-4o-mini demonstrate consistent, balanced performance across both settings. Although larger models like LLaMA-3.3-70B outperform the smaller variants as expected, the strong parsing accuracy of the finetuned model highlights the efficiency of our lightweight model.
Figure [6](https://arxiv.org/html/2604.16665#S6.F6) presents the parsing accuracy scores across Bengali, English, and Transliterated Bengali for the six highest-performing combinations of model and prompt setting.
Table 4: Comparison of the parsing accuracy of language models in few-shot and zero-shot settings. Our LoRA-finetuned LLaMA-3.2-3B model performs strongly even in the zero-shot setting, surpassing the few-shot performance of other models.

| Model | Few-Shot | Zero-Shot |
| --- | --- | --- |
| Claude-3-haiku | 0.90 | 0.57 |
| Gemini-2.0 | 0.88 | 0.87 |
| Gemma-2-27B | 0.90 | 0.89 |
| GPT-4o-mini | 0.90 | 0.88 |
| LLaMA-3.1-8B | 0.85 | 0.74 |
| LLaMA-3.2-3B | 0.68 | 0.65 |
| LLaMA-3.3-70B | 0.88 | 0.87 |
| Mistral-7B | 0.81 | 0.02 |
| Qwen-2.5-7B | 0.83 | 0.78 |
| LoRA-finetuned LLaMA-3.2-3B | – | 0.92 |
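As a sanity check on the figures above, the reported 41.54% gain is consistent with a relative improvement over the base model's zero-shot score, assuming the gain is computed as (finetuned − base) / base:

```python
# Verifying the reported gain from Table 4 (assumed to be a relative
# improvement over the base model's zero-shot accuracy).
base_zero_shot = 0.65       # LLaMA-3.2-3B, zero-shot
finetuned_zero_shot = 0.92  # LoRA-finetuned LLaMA-3.2-3B, zero-shot

relative_gain = (finetuned_zero_shot - base_zero_shot) / base_zero_shot
print(f"{relative_gain:.2%}")  # → 41.54%
```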
Figure 6:Comparison of parsing accuracy across different languages \- Bengali, English, and Transliterated Bengali with the six highest performing model and prompt setting combinations\.
#### LoRA\-finetuned Lightweight Parser Enables Time\-efficient and Token\-efficient Parsing
Table [3](https://arxiv.org/html/2604.16665#S6.T3) shows that TF-IDF and Count embeddings yield the fastest inference when paired with LogReg and NB classifiers; these two classifiers consistently achieve the lowest inference time across most embeddings. In Table [5](https://arxiv.org/html/2604.16665#S6.T5), we compare parsing complexity across models on the CBRS dataset, evaluated separately for Bengali, English, and Transliterated Bengali, as well as for the full dataset. Our finetuned 3B-parameter model significantly reduces input token usage because it receives only the message, without any instruction or examples: on average, it requires just 78.50 input tokens, compared to nearly 3K average input tokens for the other LLMs. Gemini-2.0 and LLaMA-3.2-3B offer the lowest average cost, while Gemma-2-27B and LLaMA-3.3-70B have the longest inference times. In contrast, LLaMA-3.1-8B, LLaMA-3.2-3B, and Qwen-2.5-7B run significantly faster, with inference times ranging from 1.02 to 2.27 seconds. Most models perform consistently across languages, with slightly higher computational cost for Bengali, suggesting strong robustness across linguistic contexts. However, Gemma-2-27B shows the highest inference time, 5.42 seconds, for Bengali, indicating that Bengali may demand more computation due to linguistic complexity, a larger vocabulary, or less optimized processing. English and Transliterated Bengali generally yield lower and more consistent inference times. For most models, the average cost varies only slightly across languages, with a marginal increase for Bengali in some cases: Claude-3-Haiku costs 0.00107 for bn, compared to 0.00095 for en and 0.00094 for tbn, and LLaMA-3.2-3B costs 0.00020 for bn, slightly higher than 0.00017 for both en and tbn, owing to its longer inference time for Bengali.
Table 5: Comparison of message parsing complexity across LLMs in terms of cost, token count, and inference time, showing that our LoRA-finetuned parser achieves superior efficiency on all metrics.

| Model | Data | Avg Cost | Avg Input Tokens | Avg Output Tokens | Avg Total Tokens | Inference Time (s) |
| --- | --- | --- | --- | --- | --- | --- |
| Claude-3-Haiku | bn | 0.00107 | 2991.29 | 258.67 | 3249.96 | 2.89 |
| | en | 0.00095 | 2825.26 | 198.11 | 3023.37 | 2.49 |
| | tbn | 0.00094 | 2818.61 | 191.32 | 3009.93 | 2.49 |
| | total | 0.00100 | 2889.91 | 220.56 | 3110.48 | 2.65 |
| Gemini-2.0 | bn | 0.00044 | 2845.10 | 388.46 | 3233.56 | 3.87 |
| | en | 0.00033 | 2744.49 | 137.38 | 2881.87 | 2.59 |
| | tbn | 0.00033 | 2740.46 | 131.37 | 2871.83 | 2.50 |
| | total | 0.00037 | 2783.67 | 236.20 | 3019.87 | 3.07 |
| Gemma-2-27B | bn | 0.00251 | 2857.10 | 279.11 | 3136.21 | 5.42 |
| | en | 0.00232 | 2756.49 | 137.57 | 2894.06 | 3.32 |
| | tbn | 0.00230 | 2752.46 | 127.94 | 2880.40 | 2.95 |
| | total | 0.00239 | 2795.67 | 191.66 | 2987.33 | 4.06 |
| GPT-4o-mini | bn | 0.00042 | 2284.57 | 147.36 | 2431.93 | 3.90 |
| | en | 0.00040 | 2221.49 | 114.65 | 2336.14 | 3.23 |
| | tbn | 0.00040 | 2216.08 | 106.26 | 2322.33 | 3.11 |
| | total | 0.00041 | 2245.31 | 125.55 | 2370.86 | 3.47 |
| LLaMA-3.2-3B | bn | 0.00020 | 2919.22 | 315.24 | 3234.46 | 3.36 |
| | en | 0.00017 | 2683.68 | 68.14 | 2751.83 | 1.49 |
| | tbn | 0.00017 | 2663.64 | 93.97 | 2757.61 | 1.65 |
| | total | 0.00019 | 2772.65 | 173.62 | 2946.27 | 2.27 |
| LLaMA-3.1-8B | bn | 0.00063 | 3040.99 | 481.06 | 3522.05 | 2.41 |
| | en | 0.00053 | 2810.61 | 120.55 | 2931.16 | 1.19 |
| | tbn | 0.00053 | 2805.85 | 130.25 | 2936.09 | 1.02 |
| | total | 0.00057 | 2901.48 | 267.19 | 3168.67 | 1.63 |
| LLaMA-3.3-70B | bn | 0.00288 | 3042.21 | 236.08 | 3278.28 | 4.86 |
| | en | 0.00258 | 2812.09 | 121.07 | 2933.17 | 3.54 |
| | tbn | 0.00257 | 2807.41 | 110.11 | 2917.52 | 3.28 |
| | total | 0.00270 | 2902.88 | 164.21 | 3067.09 | 4.00 |
| Mistral-7B | bn | 0.00075 | 3524.80 | 215.26 | 3740.06 | 2.67 |
| | en | 0.00070 | 3313.11 | 163.78 | 3476.89 | 2.20 |
| | tbn | 0.00069 | 3307.23 | 163.34 | 3470.57 | 2.24 |
| | total | 0.00072 | 3396.22 | 184.25 | 3580.46 | 2.40 |
| Qwen-2.5-7B | bn | 0.00100 | 3113.89 | 212.39 | 3326.28 | 2.27 |
| | en | 0.00092 | 2913.79 | 139.69 | 3053.49 | 1.65 |
| | tbn | 0.00091 | 2908.61 | 125.06 | 3033.67 | 1.57 |
| | total | 0.00095 | 2992.45 | 164.97 | 3157.41 | 1.88 |
| LoRA-finetuned LLaMA-3.2-3B | total | – | 78.50 | 189.08 | 267.58 | 1.35 |
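The roughly 35× input-token reduction quoted in the abstract follows directly from the "total" rows of Table 5, since the finetuned parser receives only the raw message rather than the full instruction-plus-examples prompt:

```python
# Token-efficiency check against Table 5 ("total" rows). The finetuned
# parser is prompted with the bare message only, so its average input
# shrinks from ~2.8K tokens to 78.50.
base_input_tokens = 2772.65      # LLaMA-3.2-3B with the full prompt
finetuned_input_tokens = 78.50   # LoRA-finetuned parser, message only

reduction_factor = base_input_tokens / finetuned_input_tokens
print(f"{reduction_factor:.1f}x fewer input tokens")  # → 35.3x fewer input tokens
```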
## 7 Error Analysis
Through manual inspection of the predictions from both the classifier and parsing models, we conducted a qualitative error analysis\. Below, we summarize the key sources of errors\.
### 7.1 Classifier Errors
#### Appreciation Messages\.
Public posts on Facebook often contain appreciation messages such as "We are very grateful to X for donating A+ blood at location Y." These posts are not actual blood donation requests, yet the first-layer classifier occasionally misclassifies them as such.
#### Edited Messages\.
Messages on Telegram can be edited after posting, which introduces another source of error\. For example, a message such as, "Update: Managed, Emergency blood needed, …" may have originally been a blood donation request but was later marked as already managed\. Such cases also tend to be misclassified by the first layer\.
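One possible mitigation is a lightweight keyword post-filter applied after the first layer. The sketch below illustrates the idea; the marker list is a hypothetical example, not the paper's actual implementation:

```python
import re

# Illustrative post-filter for edited messages marked as resolved.
# The marker list is a hypothetical assumption, not the CBRS classifier.
MANAGED_MARKERS = re.compile(
    r"\b(managed|fulfilled|donor found|no longer needed)\b",
    re.IGNORECASE,
)

def looks_already_managed(message: str) -> bool:
    """Flag requests that appear to have been fulfilled after editing."""
    return bool(MANAGED_MARKERS.search(message))

print(looks_already_managed("Update: Managed, Emergency blood needed"))  # → True
print(looks_already_managed("Emergency: 2 bags A+ needed today"))        # → False
```

Such a rule-based check would only complement the learned classifier, since resolution can also be phrased in Bengali or transliterated Bengali.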
### 7.2 Parser Errors
#### Distorted Locations\.
Locations mentioned in the messages are often lengthy or appear in disjoint segments, making them difficult to parse accurately\. Although the models were instructed to preserve the original language and structure of the location, they frequently altered it\. This issue contributed to the significant drop in zero\-shot performance of Mistral\-7B, Claude\-3\-Haiku, and Llama\-3\.1\-8B \(Table[4](https://arxiv.org/html/2604.16665#S6.T4)\)\.
#### Structural Deformation\.
Several models, including Mistral\-7B and Claude\-3\-Haiku, struggled to adhere to the required JSON structure in zero\-shot settings\. Common issues included dropping fields, introducing unwanted fields, or modifying the expected field names\.
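These failure modes can be caught mechanically with a strict schema check on model output. The sketch below validates a parse against the field names from the Figure 8 schema; it is an illustrative validator, not the paper's evaluation code:

```python
import json

# Strict structural validation against the Figure 8 schema fields.
EXPECTED_FIELDS = {
    "is_blood_donation_request", "blood_group", "bags_needed", "patient",
    "condition", "location", "hospital_name", "location_markers",
    "probable_day", "probable_time", "contacts", "compensation",
}

def validate_parse(raw_output: str) -> list:
    """Return a list of structural errors; an empty list means valid."""
    try:
        obj = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(obj, dict):
        return ["top-level value is not a JSON object"]
    errors = [f"missing field: {f}" for f in sorted(EXPECTED_FIELDS - obj.keys())]
    errors += [f"unexpected field: {f}" for f in sorted(obj.keys() - EXPECTED_FIELDS)]
    return errors
```

A complete validator would also special-case the single-field reply `{"is_blood_donation_request": false}` that the prompt permits for non-requests.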
#### Bengali Messages\.
Parsing Bengali messages posed particular challenges, as also reflected in Figure[6](https://arxiv.org/html/2604.16665#S6.F6)\. Errors included misinterpretation of blood groups written in Bengali, distortion of time and location expressions, and failure to identify valid fields\.
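A small normalization table can absorb some of this surface variation in blood-group mentions. The alias list below, including one Bengali-script spelling, is a hypothetical illustration of the variants involved, not data from the CBRS corpus:

```python
# Canonicalizing blood-group mentions. BLOOD_GROUPS matches the schema
# in Figure 8; the alias spellings are hypothetical examples.
BLOOD_GROUPS = {"A+", "A-", "B+", "B-", "O+", "O-", "AB+", "AB-"}

ALIASES = {
    "a positive": "A+", "a pos": "A+", "a+ve": "A+",
    "o negative": "O-", "o neg": "O-", "o-ve": "O-",
    "এ পজিটিভ": "A+",  # Bengali-script spelling of "A positive"
}

def normalize_blood_group(text: str) -> str:
    token = text.strip().lower()
    if token.upper() in BLOOD_GROUPS:
        return token.upper()
    return ALIASES.get(token, "")  # "" means unparsed, per the schema

print(normalize_blood_group("a+ve"))   # → A+
print(normalize_blood_group("O neg"))  # → O-
```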
## 8 Conclusion
In our study, we present CBRS, a system combining a curated dataset with a multi-platform bot for social media groups. Because of the scarcity of low-resource language datasets and the informal nature of online communication, extracting relevant information from large message streams is challenging. We propose a dual-layer filtering and parsing architecture for efficient extraction from Bengali, English, and Transliterated Bengali. This work advances object-based filtering in task-specific domains and lays the groundwork for intelligent, cross-platform bots in healthcare.
## 9 Ethical Considerations
This study was conducted in accordance with institutional ethical guidelines. The collection of data from publicly accessible social media communities on Facebook and Telegram was approved by the Institutional Ethics Review Board. To protect privacy, all identifying information about the source users of collected posts and messages was anonymized. In addition, we verified that although blood request messages contain patients' health information, they do not include patients' names or any other personal identifiers. All survey participants provided informed consent, and all personal identifiers were removed prior to analysis.
## 10 Limitations
The current study focuses only on Bengali and English, limiting broader multilingual applicability and cross\-regional validation\. Future work will expand to more languages and regions to improve generalizability\. Although CBRS performs well, it faces challenges in scalability, message storage, and spam control, which could overwhelm donors with irrelevant requests\. Future improvements will optimize storage, address spam risks, evaluate performance at larger scales, and expand to more platforms to increase accessibility and usability\.
## References
- R. A. Abbasi, O. Maqbool, M. Mushtaq, N. R. Aljohani, A. Daud, J. S. Alowibdi, and B. Shahzad (2018). Saving lives using social media: analysis of the role of Twitter for personal blood donation requests and dissemination. Telematics and Informatics 35(4), pp. 892–912. [Link](https://www.sciencedirect.com/science/article/pii/S0736585316303835)
- Jina embeddings-v2. [https://jina.ai](https://jina.ai/)
- M. AI (2024a). Meta Llama 3.1 - 8B. [https://ai.meta.com/llama/](https://ai.meta.com/llama/) (accessed 2025-04-13)
- M. AI (2024b). Meta Llama 3.2 - 3B. [https://ai.meta.com/llama/](https://ai.meta.com/llama/) (accessed 2025-04-13)
- M. AI (2024c). Meta Llama 3.3 - 70B. [https://ai.meta.com/llama/](https://ai.meta.com/llama/) (accessed 2025-04-13)
- M. AI (2023b). Mistral 7B. [https://mistral.ai/news/introducing-mistral-7b/](https://mistral.ai/news/introducing-mistral-7b/) (accessed 2025-04-13)
- F. Alam, F. Ofli, and M. Imran (2018). CrisisMMD: multimodal Twitter datasets for natural disaster response. In Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM), pp. 465–472.
- F. Alam, H. Sajjad, M. Imran, and F. Ofli (2021). CrisisBench: benchmarking crisis-related social media datasets for humanitarian information processing. In 15th International Conference on Web and Social Media (ICWSM).
- T. Alanzi and B. Alsaeed (2019). Use of social media in the blood donation process in Saudi Arabia. Journal of Blood Medicine, pp. 417–423.
- K. Alharbi and M. A. Haq (2024). Enhancing disaster response and public safety with advanced social media analytics and natural language processing. Engineering, Technology & Applied Science Research 14(3), pp. 14212–14218.
- America's Blood Centers (2024). Blood donation statistics and information guide. [https://americasblood.org/statistics_guide/](https://americasblood.org/statistics_guide/) (accessed 2024-08-27)
- K. An, S. Si, H. Hu, H. Zhao, Y. Wang, Q. Guo, and B. Chang (2024). Rethinking semantic parsing for large language models: enhancing LLM performance with semantic hints. arXiv preprint arXiv:2409.14469.
- Anthropic (2024). Claude 3 Haiku. [https://www.anthropic.com/index/claude-3](https://www.anthropic.com/index/claude-3) (accessed 2025-04-13)
- B. Auxier, M. Anderson, et al. (2021). Social media use in 2021. Pew Research Center 1(1), pp. 1–4.
- S. R. A. Aziz, A. A. Razalan, N. M. Noor, and M. S. Sauti (2010). Proactive notification system using instant messaging bot (IM bot). In 2010 International Conference on Science and Social Research (CSSR 2010), pp. 695–698.
- L. Breiman (2001). Random forests. Machine Learning 45(1), pp. 5–32.
- X. Cheng, H. Zhang, J. Yang, X. Li, W. Zhou, F. Liu, K. Wu, X. Guan, T. Sun, X. Wu, T. Li, and Z. Li (2024). XFormParser: a simple and effective multimodal multilingual semi-structured form parser. [arXiv:2405.17336](https://arxiv.org/abs/2405.17336)
- T. R. Chowdhury, M. Z. I. Rafi, M. Rahman, et al. (2020). Bengali.AI handwritten grapheme classification challenge report. arXiv preprint arXiv:2003.11239.
- A. Cloud (2024). Qwen 2.5 - 7B. [https://qwen.readthedocs.io/](https://qwen.readthedocs.io/) (accessed 2025-04-13)
- C. Cortes and V. Vapnik (1995). Support-vector networks. Machine Learning 20(3), pp. 273–297.
- G. DeepMind (2024). Gemini 2.0. [https://deepmind.google/technologies/gemini](https://deepmind.google/technologies/gemini) (accessed 2025-04-13)
- J. Doe and J. Smith (2023). CrisisBench: a benchmark for crisis-related social media classification. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- M. Fahim, F. Shifat, F. Haider, D. Barua, M. Sourove, M. Ishmam, and M. Bhuiyan (2024). BanglaTLit: a benchmark dataset for back-transliteration of romanized Bangla. In Findings of the Association for Computational Linguistics: EMNLP 2024, pp. 14656–14672.
- Y. Feng, Z. Xie, Y. Xie, R. Zhan, X. Li, Z. Zhang, Y. Wang, Y. Liu, and H. Li (2020). LaBSE: language-agnostic BERT sentence embeddings. arXiv preprint arXiv:2007.01852.
- Y. Freund and R. E. Schapire (1999). Decision trees and decision rules. Machine Learning 37(1), pp. 53–66.
- Google (2024). Gemma 2 - 27B. [https://ai.google.dev/gemma](https://ai.google.dev/gemma) (accessed 2025-04-13)
- T. Hasan, A. Bhattacharjee, K. Samin, M. Hasan, M. Basak, M. S. Rahman, and R. Shahriyar (2020). Not low-resource anymore: aligner ensembling, batch filtering, and new datasets for Bengali-English machine translation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 2612–2623. [Link](https://aclanthology.org/2020.emnlp-main.207/)
- C. He and D. Hu (2025). Social media analytics for disaster response: classification and geospatial visualization framework. Applied Sciences 15(8), pp. 4330. [Link](https://www.mdpi.com/2076-3417/15/8/4330)
- Y. Hu, G. Mai, C. Cundy, K. Choi, N. Lao, W. Liu, G. Lakhanpal, R. Z. Zhou, and K. Joseph (2023). Geo-knowledge-guided GPT models improve the extraction of location descriptions from disaster-related social media messages. arXiv preprint arXiv:2310.09340.
- C. Huang and G. He (2024). Text clustering as classification with LLMs. arXiv preprint arXiv:2410.00927.
- A. Joshi, C. Nagarjun, and R. Srinivas (2017). The DRASB, disaster response and surveillance bot. In 2017 Second International Conference on Electrical, Computer and Communication Technologies (ICECCT), pp. 1–8.
- M. T. Khandaker, M. M. Islam, and F. Karim (2022). Combating COVID-19 rumors in Bengali: a low-resource language dataset and analysis. In Proceedings of the LREC 2022 Workshop on Emergency and Pandemic Situations.
- Khoros (2024). Social media demographics guide. [Link](https://khoros.com/resources/social-media-demographics-guide) (accessed 2024-08-27)
- J. Kolluri, V. K. Kotte, M. Phridviraj, and S. Razia (2020). Reducing overfitting problem in machine learning using novel L1/4 regularization method. In 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI), pp. 934–938.
- V. Kruglyk, D. Bukreiev, P. Chornyi, E. Kupchak, and A. Sender (2020). Discord platform as an online learning environment for emergencies. Ukrainian Journal of Educational Studies and Information Technology 8(2), pp. 13–28.
- A. D. Le (2022). Disaster tweets classification using BERT-based language model. [arXiv:2202.00795](https://arxiv.org/abs/2202.00795)
- C. Li, J. Zhang, H. Wang, and X. Yang (2023). BGE: a new family of multilingual and general-purpose embeddings. arXiv preprint arXiv:2301.11191.
- B. R. Lindsay (2011). Social media and disasters: current uses, future options, and policy considerations. Congressional Research Service, Washington, DC.
- C. D. Manning and H. Schütze (1999). Foundations of Statistical Natural Language Processing. MIT Press.
- F. Marozzo (2025). Multi-stakeholder disaster insights from social media using large language models. arXiv preprint [arXiv:2504.00046](https://arxiv.org/abs/2504.00046)
- P. Mathur, M. Ayyar, S. Chopra, S. Shahid, L. Mehnaz, and R. R. Shah (2020). Identification of emergency blood donation request on Twitter. In Proceedings of the Conference, Netaji Subhas Institute of Technology, IIIT-Delhi, MSIT-Delhi, DTU-Delhi.
- P. Mathur, M. Ayyar, S. Chopra, S. Shahid, L. Mehnaz, and R. Shah (2018). Identification of emergency blood donation request on Twitter. In Proceedings of the 2018 EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop & Shared Task, Brussels, Belgium, pp. 27–31. [Link](https://aclanthology.org/W18-5907/)
- A. McCallum and K. Nigam (1998). A comparison of event models for naive Bayes text classification. In AAAI-98 Workshop on Learning for Text Categorization.
- A. Mehmood, M. T. Zamir, M. A. Ayub, N. Ahmad, and K. Ahmad (2024). A named entity recognition and topic modeling-based solution for locating and better assessment of natural disasters in social media. arXiv preprint arXiv:2405.00903.
- T. Mikolov, K. Chen, G. Corrado, and J. Dean (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
- OpenAI (2024). GPT-4o mini. [https://openai.com/index/gpt-4o](https://openai.com/index/gpt-4o) (accessed 2025-04-13)
- M. E. Peters et al. (2019). Tune: a toolkit for learning to train and evaluate natural language understanding models. arXiv preprint arXiv:1905.03843.
- C. J. Powers, A. Devaraj, K. Ashqeen, A. Dontula, A. Joshi, J. Shenoy, and D. Murthy (2023). Using artificial intelligence to identify emergency messages on social media during a natural disaster: a deep learning approach. International Journal of Information Management Data Insights 3(1), pp. 100164. [Link](https://www.sciencedirect.com/science/article/pii/S2667096823000113)
- N. Rahman, N. Sultana, and M. K. Azad (2023). BNLPBench: a benchmark for evaluating Bengali natural language processing. Journal of Computational Linguistics and Applications.
- S. M. Rauch and K. Schanz (2013). Advancing racism with Facebook: frequency and purpose of Facebook use and the acceptance of prejudiced and egalitarian messages. Computers in Human Behavior 29(3), pp. 610–615.
- N. Reimers and I. Gurevych (2019). Sentence-BERT: sentence embeddings using siamese BERT-networks. arXiv preprint arXiv:1908.10084.
- L. Reynolds and K. McDonell (2021). Prompt programming for large language models: beyond the few-shot paradigm. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1–7.
- N. Roy, T. Hossain, and N. I. Alam (2022). BanglaLark: a lightweight transformer for low-resource Bangla classification. In Proceedings of the International Conference on Asian Language Processing (IALP).
- A. Saha, N. Rahman, and Md. H. Chowdhury (2025). BanglaDisaster: a low-resource dataset for cyclone and flood event classification in Bangla. In Proceedings of the 2025 Conference on Language Resources and Evaluation (LREC).
- G. Salton (1988). Term-weighting approaches in automatic text retrieval. Information Processing & Management 24(5), pp. 513–523.
- V. Sanh, L. Debut, J. Chaumond, and T. Wolf (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.
- S. Santhanam, T. Hecking, A. Schreiber, and S. Wagner (2022). Bots in software engineering: a systematic mapping study. PeerJ Computer Science 8, pp. e866.
- K. Shah, H. Patel, D. Sanghvi, and M. Shah (2020). A comparative analysis of logistic regression, random forest and KNN models for the text classification. Augmented Human Research 5(1), pp. 12.
- A. Shetty et al. (2024). Disaster informatics with multimodal deep learning: a middle fusion approach for social media analysis. IEEE Transactions on Multimedia 26, pp. 1234–1245.
- A. Shukhman and E. Shukhman (2022). Applying machine learning algorithms to automatically classify emergency messages. In Advances in Artificial Systems for Medicine and Education V, Cham, pp. 152–160.
- Z. Sun, H. Yu, X. Song, R. Liu, Y. Yang, and D. Zhou (2020). MobileBERT: a compact task-agnostic BERT for resource-limited devices. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 2158–2170. [Link](https://aclanthology.org/2020.acl-main.195/)
- B. A. R. Team (2022). Bangla speech and text corpora under the African AI4D program. https://ai4d.ai
- J. Wan, S. Song, W. Yu, Y. Liu, W. Cheng, F. Huang, X. Bai, C. Yao, and Z. Yang (2024). OmniParser: a unified framework for text spotting, key information extraction and table recognition. [arXiv:2403.19128](https://arxiv.org/abs/2403.19128)
- L. Wang, H. Zhang, Z. Yu, H. Wang, and X. Zeng (2022). E5: unified embedding learning with fast pre-trained encoder-decoder transformers. arXiv preprint arXiv:2202.05261.
- W. Wang, A. Sanyal, M. Kewalramani, P. He, and H. Li (2020). MiniLM: deep self-attention distillation for task-agnostic compression of pre-trained transformers. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- A. Whiting and D. Williams (2013). Why people use social media: a uses and gratifications approach. Qualitative Market Research: An International Journal 16(4), pp. 362–369.
- B. Xu, S. Huang, M. Du, H. Wang, H. Song, C. Sha, and Y. Xiao (2022). Different data, different modalities! Reinforced data splitting for effective multimodal information extraction from social media posts. In Proceedings of the 29th International Conference on Computational Linguistics, pp. 1855–1864.
- R. Yacouby and D. Axman (2020). Probabilistic extension of precision, recall, and F1 score for more thorough evaluation of classification models. In Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems, pp. 79–91.
- Z. Yin et al. (2024). CrisisSense-LLM: multi-label classification of disaster-related social media posts using instruction-tuned large language models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- A. R. Yinka and N. N. Queendarline (2018). Telegram as a social media tool for teaching and learning in tertiary institutions. International Journal of Multidisciplinary Research and Development 5(7), pp. 95–98.
- V. Zambon (2020). Understanding and managing digital burnout. Medical News Today. [Link](https://www.medicalnewstoday.com/articles/digital-burnout)
Figure 7: This figure illustrates the overall workflow of CBRS. After filtering and parsing through DLF, a notification is sent to potential donors. Additionally, it outlines an integrated strategy for seamless donor engagement.

## Appendix A Prompts
The detailed architecture of the system is shown in Figure [7](https://arxiv.org/html/2604.16665#A0.F7). The prompt used for parsing free-form text messages into structured JSON objects is given in Figure [8](https://arxiv.org/html/2604.16665#A1.F8). The prompt used to curate adversarial negative samples is given in Figure [9](https://arxiv.org/html/2604.16665#A1.F9).
Few-shot Prompt for Blood Donation Request Parsing

SYSTEM: You will be provided with a message, typically sent by an individual or organization, which may pertain to a request for blood donation. Your task is to determine whether the message is a blood donation request, and if yes, then to extract the necessary information.

Instructions:
- If the message is not a blood donation request, respond with: `{"is_blood_donation_request": false}`. No other fields are required.
- If it is a request, extract relevant information into a well-structured JSON object strictly conforming to the schema.
- Set fields to `""` if not stated explicitly in the message.

Schema:
- blood_group: one of [A+, A-, B+, B-, O+, O-, AB+, AB-] or `""`
- bags_needed: string (e.g., `"3"` or `"3-4"`)
- patient: {name, gender [M/F/""], age_group [child/teenager/young/adult/""]}
- condition: comma-separated medical conditions or status
- location, hospital_name: as stated
- location_markers: list of city/region tokens
- probable_day: one of [DD/MM, DD/MM/YYYY, today, tomorrow, n days later]
- probable_time: one of [HH:MM, before HH:MM, after HH:MM, HH:MM-HH:MM, in n hours] (24-hr format)
- contacts: list of {name, contact_numbers […], relation_with_patient}
- compensation: {transportation: [Y/N/""], allowance: [Y/N/""]}

Examples:
- <positive_example>
- <negative_example>

Final Query:
Text Message: {user_text}
Instruction: Output only the valid JSON response. No explanations, greetings, or hallucinations.

Figure 8: Few-shot prompt for blood donation request parsing.

Few-shot Prompt Template for Adversarial Negative Sample Generation

SYSTEM: You are tasked with generating adversarial examples for a text classification model designed to identify blood donation-seeking messages. The goal is to create realistic, diverse, and tricky negative examples that are not actual blood donation requests, but use vocabulary commonly associated with such requests.

Vocabulary:
- Bengali Words: <bengali_words>
- English Words: <english_words>

Anchor Examples: Use the provided <negative_examples> as inspiration. You must create similarly styled yet novel adversarial examples that closely mimic the linguistic pattern of real blood donation messages.

Output Format: Generate {num_examples} new examples in JSON format. Each example must be a JSON object with the following keys:
- "en": The message in English
- "bn": The equivalent Bengali translation
- "tbn": The transliterated Bengali text in Latin script

Return the output as a single JSON array of these objects. All strings must be enclosed in double quotes and follow correct JSON syntax.

Guidelines:
- Realistic and Diverse: Make the messages resemble real-world posts (e.g., social media, chat, awareness campaigns) but ensure they are not genuine blood donation requests.
- Tricky: Use several keywords from the provided lists in each example to make it deceptively similar to an actual request.
- Not Genuine Requests: The core content must not represent a legitimate need for blood donation.

Reminders:
- Strictly follow the JSON schema.
- Do not include any greetings, comments, or explanations in the output.
- All examples must be distinct, novel, and realistic.

Final Query: Generate <num_examples> adversarial examples in the specified JSON format.

Figure 9: Few-shot prompt for generating adversarial negative samples for blood donation message classification.
## Appendix B: Detailed Workflow of CBRS
#### System Integration Flow
The bots are initially integrated into social groups with the explicit consent of both users and administrators. The bot serves two purposes. Firstly, it encourages group members to register as donors by providing a direct link to the registration inbox. Secondly, it continuously polls the group for messages and looks for ones that seek blood donation.
#### Donor Enrollment Process
When users engage with our bot via direct messaging to register as donors, they are redirected to a centralized registration web application. This interface systematically collects a data set D = {blood_group, current_location, last_donation_date}. We use browser geo-location to get accurate latitude and longitude, with explicit user consent before data collection. Upon submission, the data set D is stored with the corresponding chat platform ID, allowing efficient user notifications for future blood donation requests. The interface allows users to update their information at any time. This ensures accurate tracking of their last donation date for better record maintenance. The donor enrollment is a single-point design choice made to streamline input across the multiple existing platform interfaces.
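As a minimal sketch, the stored donor record can be modeled as a document keyed by the chat-platform ID; the field names, class, and in-memory store below are illustrative assumptions, not the authors' actual schema.

```python
from dataclasses import dataclass, asdict

@dataclass
class DonorRecord:
    """Illustrative donor document: D = {blood_group, current_location, last_donation_date}."""
    platform_id: str          # chat-platform user ID (e.g., Telegram or Discord account)
    blood_group: str          # e.g., "O+"
    current_location: tuple   # (latitude, longitude) from browser geo-location
    last_donation_date: str   # ISO date, e.g., "2024-10-01"

# In-memory stand-in for the MongoDB collection described in the paper.
donors: dict[str, dict] = {}

def upsert_donor(rec: DonorRecord) -> None:
    """Insert or update a donor keyed by platform ID, so users can revise their info anytime."""
    donors[rec.platform_id] = asdict(rec)
```

Keying on the platform ID is what lets later notifications reach the donor directly in the same chat interface.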
#### User Interface
The CBRS interface mainly comprises two components: the chatbot interface and a single point of donor information intake. Since we plan to employ our bots as members of already running chat groups, the chat interface is essentially the same as the interfaces of those corresponding chat platforms. For our initial design, we selected two prominent chat platforms, Telegram and Discord. Both of these platforms feature engaging conversational interfaces which we utilize for our purpose. Unlike several other platforms, Telegram and Discord share a particular feature, namely, the use of slash user commands in the chat interface. Since the interactions with our chatbot are limited in possible options, we opt to design convenient user commands rather than parsing natural language messages from users on the fly, as shown in Table [6](https://arxiv.org/html/2604.16665#A2.T6). The currently available user commands in our chat interface are as follows:
Table 6: Bot commands and their purposes

| Command | Purpose |
| --- | --- |
| /start | Initialize interaction with the bot |
| /help | Display a user guide |
| /show_my_info | Show the registered user details |
| /update_my_info | Update user information |
| /register_as_donor | Register as a blood donor |
| /goodbye | End interaction with the bot |
To facilitate the input of donor information, we designed a single\-page web application featuring a form that receives the blood group, last donation date, and GPS location from the browser\. To eliminate the need for reiterating a donor’s chat platform identity, we generate a unique URL for the donor based on their user account on the chat platform\. By visiting this unique URL, the donor can update their information directly from the chat interface at any time\.
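One way to realize such a per-donor URL, sketched here under the assumption of an HMAC-signed token (the paper does not specify the mechanism; the secret key and domain are placeholders), is:

```python
import hashlib
import hmac

SECRET = b"server-side-secret"  # hypothetical signing key kept on the registration server

def registration_url(platform: str, user_id: str) -> str:
    """Derive a stable, unguessable URL for a donor from their chat-platform identity."""
    token = hmac.new(SECRET, f"{platform}:{user_id}".encode(), hashlib.sha256).hexdigest()[:16]
    return f"https://example.org/register/{token}"  # placeholder domain
```

Because the token is a deterministic function of the platform identity, revisiting the same URL always edits the same record, with no need for the donor to re-enter who they are.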
#### Context\-Aware Notification Strategy
Efficient donor notification introduces non-trivial challenges. Firstly, over- or under-notification can impair both user experience and system efficiency. To mitigate this, we adopt an iterative, stage-wise notification strategy, where donors are queried sequentially. Upon receiving a positive response, further alerts are suppressed, and the seeker is immediately informed. The stage depth is dynamically governed by the urgency level inferred via the parsing model. Secondly, post-hoc message edits, particularly those indicating successful blood acquisition, necessitate retroactive updates. We maintain a notification ledger for each request; upon detecting such edits, prior recipients are promptly notified of the resolution.
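The stage-wise strategy above can be sketched as a loop that alerts one batch of candidate donors at a time and stops at the first positive reply; the batch size and the `notify` callback are illustrative assumptions, not the paper's exact parameters:

```python
def notify_in_stages(donors, notify, max_stages, batch_size=3):
    """Notify donors batch by batch; suppress further alerts once someone accepts.

    `donors` is an ordered candidate list (e.g., nearest first);
    `notify(d)` sends an alert and returns True on a positive response.
    `max_stages` would be derived from the parsed urgency level."""
    for stage in range(max_stages):
        batch = donors[stage * batch_size:(stage + 1) * batch_size]
        if not batch:
            break  # candidate pool exhausted
        for d in batch:
            if notify(d):
                return d  # seeker is informed; remaining donors are never alerted
    return None  # no donor accepted within the allowed stage depth
```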
#### Implementation Details
The chatbots are implemented with standard libraries released and maintained by the corresponding chat platforms. For instance, to design the chatbot for Telegram, we use the python-telegram-bot library in Python, and the discord.js library for the Discord bot. These libraries help us take appropriate actions based on slash commands and user messages. In the case of slash commands, we perform string matching and execute the corresponding methods. For any non-command text, we first call the filtering API with the text to determine whether it is actually seeking blood donation or not. If yes, we further call our parsing API to parse the text into a JSON format. We first perform training on a curated dataset and then carry out inference. We train the model for 1000 epochs using a learning rate of 1.0. We use trigrams (wordNgrams=3) to capture better context from word sequences. Subword length is configured with minn=3 and maxn=6 to handle out-of-vocabulary words. The parsing API is implemented with [Langchain](https://www.langchain.com/). As the LLM, we use GPT-4o-mini with few-shot prompting. The unified donor information intake application is built with [React](https://react.dev/), and these pieces of information are stored in [MongoDB](https://www.mongodb.com/products/platform/atlas-database) under appropriate models.
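Putting the handler logic together, a minimal, platform-agnostic sketch of the routing described above (the command table follows Table 6; `filter_api` and `parse_api` are stand-ins for the two CBRS services, not their real signatures):

```python
def make_router(commands, filter_api, parse_api):
    """Route a chat message: slash commands by exact string match; other text
    goes through the cheap filtering layer and, only if positive, the parsing layer."""
    def handle(text):
        if text.startswith("/"):
            action = commands.get(text.split()[0])
            return action() if action else "unknown command"
        if filter_api(text):           # Layer 1: lightweight classifier
            return parse_api(text)     # Layer 2: LLM parsing to structured JSON
        return None                    # ignore non-request chatter
    return handle
```

The same `handle` function can be plugged into both a python-telegram-bot message handler and a discord.js event listener, which is what keeps the framework multi-platform.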
Figure 10: Demographic distribution by group type.
Figure 11: Demographic distribution by group size.
Figure 12: Number of blood-donation-related messages in different groups and blood groups of users.
Figure 13: Demographic distribution by gender.
Figure 14: Demographic distribution by age.
Figure 15: Demographic distribution by education.
Figure 16: Demographic distribution by occupation.
## Appendix C: User Study
To assess the effectiveness of multi-platform bots for timely message filtering and notifications, we conducted a study on various indicators, including response time, user interface, command usability, and satisfaction levels. Participants shared their experiences regarding delays in receiving blood donation requests. We organized the questions into two parts: one for those who frequently share these messages and another for blood donors. Different demographics were included to ensure a balanced study.
### C.1 Conditions
We conducted a between\-subject study with two conditions for a predefined period — the baseline social media group without a bot and one integrated with CBRS\. Each condition featured a consistent set of questions designed to gather insights on user experience, response time, engagement levels, and challenges\. The baseline system relied on manual messaging and user coordination; in contrast, the CBRS integration introduced blood donation message filtering, automated responses, real\-time donor matching, and geo\-location\-based notifications\.
### C.2 Participants
We recruited members from 20 active groups on Telegram and 10 active groups on Discord, all based in Bangladesh. Figure [10](https://arxiv.org/html/2604.16665#A2.F10) and Figure [11](https://arxiv.org/html/2604.16665#A2.F11) show the demographic distribution across the groups. The groups varied in size, ranging from 50 to 800 members, and were created for different purposes, including health awareness (35%), community service (25%), educational resources (10%), business networking (5%), local events (10%), technology discussions (5%), and student support (10%). Furthermore, the volume of messages within these groups ranged from 30 to 110 per day, of which 1-20 were blood donation requests. A total of 114 participants, including 38 potential donors, joined the survey on pre- and post-integration of CBRS. Figures [12](https://arxiv.org/html/2604.16665#A2.F12), [13](https://arxiv.org/html/2604.16665#A2.F13), [14](https://arxiv.org/html/2604.16665#A2.F14), [15](https://arxiv.org/html/2604.16665#A2.F15), and [16](https://arxiv.org/html/2604.16665#A2.F16) show the demographic breakdown of a diverse group of participants, aged 18 to 57. Most were between 18-25 (34%) and 26-33 (33%). A majority (80%, 91/114) had at least a college or bachelor's degree. The participants came from various professions: 10% in business, 25% NGO workers, 10% doctors, 9% engineers, 7% in security forces, and 30% students.
### C.3 Procedures
At the start of the study, we selected diverse groups based on different purposes and age compositions. Group admins signed consent forms on behalf of their groups. We recorded the average daily messages and blood-related messages per group. Each participant signed a consent form and completed a pre-study questionnaire to gather demographic information including gender, age, occupation, education level, prior experience with social media groups, and experience with other BDSs regarding blood donation initiatives. They also shared their initial expectations of CBRS.
The study was conducted over three days, from October 23 to 26, 2024. We integrated bots into the groups. A total of 108 individuals registered as donors from different locations. Their last donation dates and blood groups were stored during registration with the bots. Among the donors, 30% were O+ and 30% were B+, while only 1% were O- or AB-, indicating a lower proportion of negative blood types, as shown in Figure [12](https://arxiv.org/html/2604.16665#A2.F12). Participants received no training on the bot, so as to assess the intuitiveness of the user interface. We then began collecting their feedback. Participants were asked about their satisfaction levels in areas such as prior problems, satisfaction with slash-command prompts, user interface, overall functionality, comparisons with other apps, challenges in using the bots, and suggested improvements. Responses were gathered using a five-point Likert scale to evaluate their experiences. Our contributor survey had two types: users who made requests and donors who were notified through the bots and donated in the last three days. Users answered 11 questions, while donors answered 7 questions, given in Appendix [D](https://arxiv.org/html/2604.16665#A4).
## Appendix D: Survey Questionnaire
We surveyed 114 participants, including 38 potential donors, and gathered valuable insights on their satisfaction levels, along with open feedback regarding challenges and suggestions for improvement. The survey questions are given below:
### For Users:
1. Do you request blood donations on social media (e.g., Telegram, Discord, etc.)? (Almost always, Often, Sometimes, Seldom, Never)
2. Did you usually receive timely responses to your blood donation requests before using BNet, prior to October 23, 2024? (Almost always, Often, Sometimes, Seldom, Never)
3. How satisfied are you with the timely response of BNet in identifying potential donors between October 23 and October 26, 2024, after integrating BNet into groups? (Very satisfied, Satisfied, Neither, Dissatisfied, Very dissatisfied)
4. After getting a response from BNet, have you successfully connected with a blood donor through BNet? (Almost always, Often, Sometimes, Seldom, Never)
5. How easy do you find using BNet through slash command prompts? (Extremely easy, Very easy, Moderately easy, Slightly easy, Not at all)
6. How intuitive is the user interface of BNet? (Extremely intuitive, Very intuitive, Moderately intuitive, Slightly intuitive, Not at all)
7. How would you rate the overall functionality of BNet? (Excellent, Above Average, Average, Below Average, Very Poor)
8. At most how many blood-donation-seeking messages do you feel comfortable receiving from BNet per month? (1-5, 6-10, 11-15, 16-20, 21+)
9. Do you find BNet more effective than existing blood donation apps or methods you have used before? (Much better, Somewhat better, Stayed the same, Somewhat worse, Much worse, Not applicable - I have never used any app before)
10. What challenges do you face in connecting with blood donors? How can these be overcome? (Open-ended response)
11. What improvements would you suggest to make BNet better for requesters? (Open-ended response)
### For Donors:
1. How many times have you donated blood in the past year? (Never, 1 time, 2 times, 3 times, 4 or more)
2. Do you have trouble finding blood donation requests among a large volume of messages in social media groups? (Almost always, Often, Sometimes, Seldom, Never)
3. How convenient is BNet in notifying you about blood donation requests in social media groups? (Extremely convenient, Very convenient, Moderately convenient, Slightly convenient, Not at all)
4. How would you rate the overall functionality of BNet? (Excellent, Above Average, Average, Below Average, Very Poor)
5. Do you find BNet more effective than existing blood donation apps or methods you've used before? (Much better, Somewhat better, Stayed the same, Somewhat worse, Much worse, Not applicable)
6. What challenges do you face in connecting with blood requesters? How can these be overcome? (Open-ended response)
7. What improvements would you suggest to make BNet better for donors? (Open-ended response)
## Appendix E: Data Analysis
To address the gaps in existing BDSs, we pose the following research questions in this work:
- RQ1: How can a multi-platform bot be designed to seamlessly integrate with OSNs to accelerate donor response and broaden the donor network?
- RQ2: How can a cost-efficient framework be developed to precisely filter blood donation messages from extensive message streams to minimize operational costs?
- RQ3: How can a bot serve diverse demographic groups for blood donation and ensure that users perceive its integration as convenient across social media groups?
To assess our research questions, we formulated three key hypotheses:

- H1: Auto-filtering of blood donation messages and geo-location-based notifications of CBRS will accelerate the speed of donor response.
- H2: The dual-layered filtering architecture of CBRS will cost-effectively filter and parse blood donation messages from extensive social media streams.
- H3: CBRS, as a multi-platform bot, will serve diverse demographic groups equally and improve convenience across OSNs.

We first analyzed group messages, blood donation requests per day, and group demographics. For RQ1, we proposed hypothesis H1. To validate H1, we tracked donor response times using timestamps at each stage: message dispatch, bot execution, notification delivery, and response received. For RQ2, we introduced H2, evaluated the dual-layered filtering accuracy with precision, recall, and F1-score, and assessed the cost-efficiency of this approach. To answer RQ3, we proposed H3, using both quantitative and qualitative analyses of usage logs, pre- and post-study surveys, and feedback to assess response quality and satisfaction. Post-study Likert-scale feedback on slash-command prompts, UI design, and user satisfaction, alongside insights from open-ended responses, helped highlight improvements and challenges. We demonstrated network growth through multi-platform integration. We also applied Pearson's correlation and Spearman's rank correlation to examine associations between indicators of the satisfaction index. A summary of metrics and measures is provided in Table [7](https://arxiv.org/html/2604.16665#A5.T7). Additionally, we reviewed user demographics, compared current blood donation apps, and identified areas for improvement. To minimize message overflow and enhance satisfaction, we explored optimal request frequency and emphasized security for future implementation.
### E.1 Metrics and Measurements
To evaluate all hypotheses, we define the metrics listed in Table [7](https://arxiv.org/html/2604.16665#A5.T7).
Table 7: Performance metrics for the evaluation of CBRS.

| Hypothesis | Metric | Explanation | Metric System |
| --- | --- | --- | --- |
| H1 | Timely Response | Measurement of the elapsed time between message arrival, request parsing, donor notification, and donor response | Timestamping |
| H2 | Filtering Accuracy | Identifying and parsing blood donation requests from a large pool of messages | Precision, Recall, F1-score |
| H2 | Cost Efficiency | The financial implications of filtering and parsing blood donation requests from a large pool of messages | Pricing Model |
| H3 | Command Usability | Perception of ease of use of the slash command prompt | Likert scale response |
| H3 | Intuitiveness | Perception of user-friendly interface | Likert scale response |
| H3 | Satisfaction Index | Overall user satisfaction in functionality and performance of the bot | Likert scale response |
#### Timely Response:
We define Timely Response as the measurement of the time taken from when a message arrives to when it is parsed, a donor is notified and a response is received\. We assess it in two ways\. First, we use timestamps to track the time from the arrival of the message to the response in each stage\. The second method involves measuring satisfaction with the timely response of CBRS in identifying potential donors between October 23 and October 26, 2024, after its integration into groups\. Participants respond on a scale from 1 to 5: 5 indicates "Very satisfied," 4 means "Satisfied," 3 means "Neither," 2 means "Dissatisfied," and 1 represents "Very dissatisfied\."
#### Filtering Accuracy:
We define Filtering Accuracy as the ability to identify and extract blood donation requests from a vast array of messages. This assessment incorporates key performance indicators, precision, recall, and F1-score, to ensure robust evaluation Yacouby and Axman ([2020](https://arxiv.org/html/2604.16665#bib.bib53)).
#### Cost Efficiency:
We measure Cost Efficiency as the financial impact of filtering and processing blood donation requests from a large volume of messages. This involves a pricing model that evaluates various message volumes across different groups and analyzes expenses based on existing cost structures (see the [OpenAI API pricing](https://openai.com/api/pricing/)). We then compare these costs to the expenses associated solely with parsing blood-related messages.
#### Command Usability:
We define Command Usability as the ease with which users can utilize slash commands\. To assess this, we posed the question: "How easy do you find using CBRS through slash\-command prompts \(e\.g\., /start, /show\_my\_info, etc\.\)?" Responses are rated on a scale from 1 to 5, where 5 signifies "Extremely easy," 4 indicates "Very easy," 3 denotes "Moderately easy," 2 represents "Slightly easy," and 1 means "Not at all easy\."
#### Intuitiveness:
We measure Intuitiveness as the perception of the user interface in relation to usability, color scheme, layout, and overall aesthetic appeal\. To evaluate this, we ask, "How intuitive do you find the user interface of CBRS?" Respondents rate their experience on a scale from 1 to 5, where 5 represents "Extremely intuitive," 4 indicates "Very intuitive," 3 signifies "Moderately intuitive," 2 denotes "Slightly intuitive," and 1 means "Not at all intuitive\."
#### Satisfaction Index:
We ask both donors and requesters to evaluate the overall functionality of CBRS with the question, "How would you rate the overall functionality of CBRS?" Responses are rated on a scale from 1 to 5, where 5 represents "Excellent," 4 indicates "Above Average," 3 signifies "Average," 2 reflects "Below Average," and 1 denotes "Very Poor\."
Additionally, we gather user feedback regarding challenges and potential improvements\. To assess performance, we inquire, "Do you find CBRS more effective than the blood donation apps or methods you have previously used?" This comparison provides valuable insights into CBRS’s effectiveness in enhancing the blood donation experience\.
## Appendix F: Findings
In this section, we present our key findings regarding response time, operational cost efficiency, and user convenience in detail\. Our analyses reflect the significant reduction in the parsing and retrieval time after deploying CBRS\. The dual\-layered filtering architecture helps CBRS maintain adequate accuracy while reducing the parsing cost\. Our survey results also indicate that CBRS can be helpful for both donors and recipients across diverse demographic groups\.
#### User Evaluation
We conduct a user assessment to evaluate CBRS. User demographics and procedures appear in Appendix [C](https://arxiv.org/html/2604.16665#A3). We recruit members from 20 active Telegram groups and 10 active Discord groups, all based in Bangladesh. User experience-related questions appear in Appendix [D](https://arxiv.org/html/2604.16665#A4). We consider six metrics: Timely Response, Filtering Accuracy, Cost Efficiency, Command Usability, Intuitiveness, and Satisfaction Index, with descriptions provided in Appendix [E](https://arxiv.org/html/2604.16665#A5). We inquire about the overall functionality of CBRS. Notably, 44% of respondents find command usability to be "very easy," and another 44% describe the user interface as "very intuitive." Additionally, 61% rate the overall functionality as "above average," with 28% considering it "excellent." On social media, 21% of donors report "always" having trouble finding blood donation requests amid a high volume of messages. Another 32% experience this "often," and 32% encounter it "sometimes." After receiving notifications through CBRS, 39% of donors find the process "very convenient." Figure [17](https://arxiv.org/html/2604.16665#A6.F17) shows the results from the survey. We also conduct a Spearman's rank correlation to assess the relationship between user satisfaction and CBRS functionality metrics, as shown in Table [8](https://arxiv.org/html/2604.16665#A6.T8). Command Usability and Intuitiveness show strong positive correlations of 0.52 and 0.54, respectively, with the latter being statistically significant, highlighting the importance of interface design. Timely Notification shows a moderate correlation, while Timely Response shows no significant association.
Figure 17: Results of the user study conducted through survey questionnaires.

Table 8: Spearman correlation coefficients and p-values

| Metric | Spearman Correlation | p-value |
| --- | --- | --- |
| Timely Response | 0.06 | 0.80 |
| Command Usability | 0.52 | 0.02 |
| Intuitiveness | 0.54 | 0.01 |
| Timely Notification | 0.43 | 0.08 |
### F.1 H1 Results: Auto-filtering of blood donation messages and geo-location notifications speeds up donor response by reducing parsing and retrieval time
We track four specific timestamps from message arrival to donor response. First, we log the time the message arrives in the group. Next, we record when the parsed blood donation request is stored in our database; the difference between these two timestamps indicates the time taken to parse the message. Third, we log when a notification is sent to the first matching donor; the time between the second and third timestamps represents the retrieval and matching process. Finally, we capture the first affirmative response from a donor, with the time between the third and fourth timestamps indicating donor response time. As shown in Table [9](https://arxiv.org/html/2604.16665#A6.T9), we observe notable differences at each stage, from message arrival to donor response. Parsing time averaged 4 seconds with a low variability of 0.45 seconds. Retrieval time averaged 5 seconds with a standard deviation of 3 seconds. The most significant finding was response time, averaging 81 minutes with a high variability of 110 minutes.
Table 9: Performance of time tracking in each stage of CBRS, from arrival to response

| Task | Average Time | Standard Deviation |
| --- | --- | --- |
| Parsing Time | 4 s | 0.45 s |
| Retrieval Time | 5 s | 3 s |
| Response Time | 81 min | 110 min |
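The four logged timestamps translate directly into the three stage durations tracked above; a minimal sketch, with illustrative timestamp formatting and field names:

```python
from datetime import datetime

def stage_durations(arrival, parsed, notified, responded):
    """Compute per-stage elapsed times (in seconds) from the four logged timestamps:
    message arrival, parsed request stored, first donor notified, first positive reply."""
    fmt = "%Y-%m-%d %H:%M:%S"
    t = [datetime.strptime(x, fmt) for x in (arrival, parsed, notified, responded)]
    return {
        "parsing_s":   (t[1] - t[0]).total_seconds(),  # arrival -> stored in DB
        "retrieval_s": (t[2] - t[1]).total_seconds(),  # stored -> donor matched & notified
        "response_s":  (t[3] - t[2]).total_seconds(),  # notified -> affirmative reply
    }
```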
### F.2 H2 Results: The dual-layered filtering architecture of CBRS efficiently filters and parses blood donation messages from large social media streams, delivering high accuracy at a lower cost
Layer 1 of the system was evaluated on the test set, achieving an overall accuracy of 98.7%. In the classification of messages, non-blood-related messages are denoted as 0 and blood-related messages as 1. As shown in Table [10](https://arxiv.org/html/2604.16665#A6.T10), for class 0, the model attained a precision of 99% and a recall of 99%, resulting in an F1-score of 0.99. For class 1, precision remained at 99%, while recall was 98%, also yielding an F1-score of 0.99. Overall, the macro and weighted averages for precision, recall, and F1-score are all 0.99. We also experiment with Logistic Regression using TF-IDF vectorization Shah et al. ([2020](https://arxiv.org/html/2604.16665#bib.bib32)). This approach transforms text messages into numerical features by capturing the frequency of unigrams and bigrams. We use L2 regularization to prevent overfitting Kolluri et al. ([2020](https://arxiv.org/html/2604.16665#bib.bib33)). We compare this with the DLF model. DLF proves to be the better approach due to its superior handling of bilingual and mixed-language texts.
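The logistic-regression baseline can be sketched with scikit-learn (an assumption; the paper does not name the library), using unigram+bigram TF-IDF features and L2 regularization. The toy messages and labels below are illustrative stand-ins for the bilingual training data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the bilingual training messages (label 1 = blood request).
texts = ["urgent need O+ blood at Dhaka Medical", "A- blood needed tomorrow, 2 bags",
         "selling a used phone, contact me", "class notes shared in the group",
         "B+ rokto lagbe emergency", "football match this friday"]
labels = [1, 1, 0, 0, 1, 0]

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),                # unigram + bigram frequencies
    LogisticRegression(penalty="l2", max_iter=1000),    # L2 regularization against overfitting
)
clf.fit(texts, labels)
```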
Table 10: Classification report of the Layer 1 framework

| Class | Precision | Recall | F1-Score | Support |
| --- | --- | --- | --- | --- |
| 0 | 0.99 | 0.99 | 0.99 | 249 |
| 1 | 0.99 | 0.98 | 0.99 | 276 |
| Macro Avg | 0.99 | 0.99 | 0.99 | 525 |
| Weighted Avg | 0.99 | 0.99 | 0.99 | 525 |
We, furthermore, analyzed the cost efficiency of single-layered filtering (using only GPT-4o-mini) compared to dual-layered filtering (using the CBRS architecture), as shown in Table [11](https://arxiv.org/html/2604.16665#A6.T11). We first recorded the daily message volume from our observed groups. Next, we logged the number of blood donation requests identified by CBRS. Using GPT-4o-mini at a rate of $0.0003 per message, direct processing costs would be $0.0045, $0.0165, and $0.0285 per day for average message counts of 15, 55, and 95, respectively. In the initial layer, CBRS filters messages with 98.7% accuracy, isolating blood donation requests with average counts of 1, 3, and 5 per day. These filtered messages then proceed to the second layer, where GPT-4o-mini performs validation and parsing at a cost of $0.0003, $0.0009, and $0.0015, respectively. Overall, this dual-layered architecture reduces costs by approximately 33.33% to 47.37%, depending on message volume.
Table 11: Cost analysis of dual-layered filtering

| Range of Messages | Average Messages | Blood Messages | Average Cost | Average Cost of Blood Messages |
| --- | --- | --- | --- | --- |
| 0-30 | 15 | 1 | $0.0045 | $0.0003 |
| 40-70 | 55 | 3 | $0.0165 | $0.0009 |
| 80-110 | 95 | 5 | $0.0285 | $0.0015 |
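The per-day figures in Table 11 follow from a simple pricing model; a sketch assuming a flat rate of $0.0003 per message sent to the LLM (real token-based pricing would vary with message length):

```python
RATE = 0.0003  # assumed cost (USD) per message processed by the LLM

def daily_costs(total_msgs: int, blood_msgs: int):
    """Cost of sending every message to the LLM vs. only Layer-1-filtered ones."""
    single_layer = total_msgs * RATE  # no pre-filtering: every message hits the LLM
    dual_layer = blood_msgs * RATE    # only filtered blood requests reach the LLM
    return single_layer, dual_layer

# Reproducing Table 11's three volume bands:
for total, blood in [(15, 1), (55, 3), (95, 5)]:
    print(total, blood, daily_costs(total, blood))
```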
### F.3 H3 Results: CBRS will serve diverse demographic groups equally and improve convenience across OSNs through timely notifications, timely responses, command usability, and intuitiveness
Our survey shows diverse demographics concerning gender, age, education, and occupation, as shown in Figures [13](https://arxiv.org/html/2604.16665#A2.F13), [14](https://arxiv.org/html/2604.16665#A2.F14), [15](https://arxiv.org/html/2604.16665#A2.F15), and [16](https://arxiv.org/html/2604.16665#A2.F16). Among the participants, 65% identified as male and 35% as female. All age groups were represented, with individuals aged 18-33 showing the highest interest in blood donation. Male participants were more inclined to donate blood than female participants. Among the professions, students constituted 30% of respondents, while NGO workers made up 25%, ranking second. Notably, 80% of the participants had a college or undergraduate education. Among these groups, 11% "almost always" make donation requests, 39% "seldom" request, and 22% "sometimes" request. However, only 11% reported "always" receiving timely responses. After integrating CBRS, 67% of users were "satisfied" with the timely responses from CBRS.
We first examined the correlations among four metrics (Timely Response, Command Usability, Intuitiveness, and Satisfaction Index) using Pearson's correlation analysis to explore their interrelationships, as shown in Figure [18](https://arxiv.org/html/2604.16665#A6.F18). Notably, Command Usability and Intuitiveness show a high correlation coefficient of 0.60. Additionally, there is a moderate positive correlation of 0.54 between Intuitiveness and the Satisfaction Index. Command Usability and the Satisfaction Index exhibit a positive correlation of 0.50. However, Timely Response did not significantly correlate with the other metrics, particularly with the Satisfaction Index (-0.01) and Command Usability (-0.05).
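For reference, Pearson's r between two Likert-scored feedback metrics can be computed directly; a minimal pure-Python sketch (the input sequences would be per-participant ratings):

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))  # un-normalized covariance
    sx = sqrt(sum((a - mx) ** 2 for a in x))              # un-normalized std of x
    sy = sqrt(sum((b - my) ** 2 for b in y))              # un-normalized std of y
    return cov / (sx * sy)
```

Spearman's rank correlation, also used above, is the same quantity computed on the ranks of the ratings rather than the raw values.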
However, only 17% of users reported "almost always" receiving blood, while 44% did so "sometimes", and 11% never connected even after receiving responses from donors through CBRS. When asked about the challenges they faced while connecting, one donor, P33, highlighted, “I encountered communication and transport issues even if I responded to donate”. P54 expressed, “My family did not allow to donate blood to individuals I did not know”. Additionally, P60 remarked, “The location of donation requests was unclear”. We inquired about existing blood donation apps or methods that participants had used previously. Notably, 50% stated that CBRS is "much better", while 39% described it as "somewhat better". When asked for suggestions for improvement, P51 remarked, “It would be beneficial to incorporate CBRS into other social media platforms for wider accessibility”. Another participant, P42, suggested, “A dedicated dashboard displaying donation requests would be helpful for users”.
Figure 18: Pearson correlation heatmap of user feedback metrics.
## Appendix G: Discussion
In this study, we designed and developed a multi\-platform bot to engage users and efficiently screen large volumes of messages\. To keep filtering cost\-effective, we implemented a dual\-layered filtering architecture\. Our evaluation with 114 users showed improved response times and engagement and provided more effective support to existing BDSs\. Automated filtering and notifications enabled faster responses, while multi\-platform integration created a versatile donor network\. Convenient slash commands and an intuitive interface made it easy for participants to use\. This section discusses the implications of our findings and provides design recommendations\.
### G.1 Creating a fast-response system architecture for a multi-platform bot
There exists a delicate balance between success and failure in urgency management Joshi et al. ([2017](https://arxiv.org/html/2604.16665#bib.bib40)). Timely and precise execution during emergencies significantly enhances the likelihood of successful outcomes and mitigates potential risks Joshi et al. ([2017](https://arxiv.org/html/2604.16665#bib.bib40)). An auto-notification system is necessary for system administrators in this regard Aziz et al. ([2010](https://arxiv.org/html/2604.16665#bib.bib41)). It provides real-time updates on system status and enables prompt responses to issues by maintaining optimal operational efficiency Aziz et al. ([2010](https://arxiv.org/html/2604.16665#bib.bib41)).
Our findings demonstrate that auto-filtering of blood donation messages and geo-location notifications significantly accelerate donor responses. The average parsing time is just 4 seconds with minimal variability. This rapid message filtering enables us to sift through large pools of social media messages and quickly identify relevant donation requests. The retrieval time averages 5 seconds, streamlining the matching process. These data are reported in Table [9](https://arxiv.org/html/2604.16665#A6.T9). This ensures potential donors are connected to appropriate requests without delay. Auto-notifications use the Haversine distance to alert nearby donors based on their geographical location. The importance of these features cannot be overstated. This targeted approach minimizes the time and effort needed to locate suitable donors. Response time is a crucial factor we observe. While it averages 81 minutes and shows high variability, it can be significantly influenced by the efficiency of the previous steps. Quick parsing and retrieval times help minimize overall delays and facilitate faster connections between donors and recipients.
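The Haversine computation used to rank nearby donors is standard; a minimal sketch:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))  # mean Earth radius ~= 6371 km
```

Sorting registered donors by this distance from the parsed hospital location yields the ordered candidate list for stage-wise notification.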
Before implementing CBRS, we asked users whether they received timely responses to their blood donation requests; many reported that they did not (Figure [17](https://arxiv.org/html/2604.16665#A6.F17)). After using CBRS, however, users expressed satisfaction with the promptness of the responses. When we inquired about difficulties in locating blood donation requests among a large pool of messages, our study showed that most struggled to find them within a broad range of messages. In contrast, a significant majority found CBRS effective in notifying them about donation requests within social media groups. Our analysis also revealed that even though donors responded, many were unable to complete the donation. When we asked open-ended questions on this issue, we noted that most donors cited unclear donation location addresses as a primary challenge. We also observed that some donors expressed concerns about donating to unfamiliar recipients, which underscores a potential need for member authentication. We plan to broaden our research to tackle these concerns, and we invite HCI researchers to collaborate on finding alternative solutions to these challenges.
### G.2 Designing cost-optimized dual-layered filtering for free multi-platform use
80% of people use social media to interact with friends, family, spouses, co-workers, old acquaintances, and new friends (Whiting and Williams, [2013](https://arxiv.org/html/2604.16665#bib.bib36)). Additionally, 76% turn to these platforms to pass the time, often during idle moments or when seeking entertainment (Whiting and Williams, [2013](https://arxiv.org/html/2604.16665#bib.bib36)). Importantly, social media has assumed an increasingly vital role in emergencies, ranking as the fourth most popular source of emergency information (Lindsay, [2011](https://arxiv.org/html/2604.16665#bib.bib37)). Recent state-of-the-art Large Language Models (LLMs), such as the GPT series, have shown exceptional reasoning capabilities across various tasks, including message filtering and parsing (Huang and He, [2024](https://arxiv.org/html/2604.16665#bib.bib38)). However, continuously deploying LLMs on large message pools incurs significant operational costs. As illustrated in Figure [11](https://arxiv.org/html/2604.16665#A2.F11), group message volume varies with group size, underscoring the need for primary filtering to manage this substantial influx of messages. Ensuring high accuracy during filtering is paramount. To balance cost-effectiveness and accuracy, we implemented a dual-layered structure, achieving high precision, recall, and F1 scores. This approach enhances the efficiency of message processing while minimizing resource expenditure. Our dataset was meticulously curated to provide a balanced representation of positive and negative messages across languages, addressing both class imbalance and linguistic diversity over a wide range of topics beyond emergency contexts.
The versatility of this dataset is crucial for training models that generalize effectively in real-world scenarios. Notably, our calculations in Table [11](https://arxiv.org/html/2604.16665#A6.T11) indicate that primary filtering can reduce bot operation costs by up to 47%, as only likely blood donation requests advance to GPT-4o-mini for further processing. This significant cost saving underscores the importance of efficient filtering mechanisms in optimizing the use of LLMs in social media applications. Integrating a dual-layered filtering approach can greatly improve the management of large message volumes on social media platforms, particularly in emergency contexts. This framework not only ensures timely and accurate responses but also demonstrates the potential for significant cost reduction, paving the way for more efficient use of advanced language processing technologies.
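The dual-layered gating described above can be sketched as follows. This is an illustrative skeleton, not the paper's implementation: the keyword heuristic stands in for the actual layer-1 classifier, `llm_classify` is a placeholder for the paid second-layer LLM call, and the savings figure depends entirely on the message mix.

```python
from typing import Callable, List, Tuple

# Layer 1: a cheap local pass. A keyword heuristic stands in here for the
# lightweight classifier; any fast binary model could be plugged in instead.
BLOOD_KEYWORDS = {"blood", "donate", "donor", "rokto", "rokter", "o-", "ab-"}

def cheap_filter(message: str) -> bool:
    tokens = message.lower().split()
    return any(k in tokens for k in BLOOD_KEYWORDS)

def dual_layer_filter(messages: List[str],
                      llm_classify: Callable[[str], bool]) -> Tuple[List[str], float]:
    """Send only layer-1 positives to the (paid) LLM.

    Returns the accepted requests and the fraction of LLM calls avoided,
    which is where the cost reduction comes from."""
    accepted, llm_calls = [], 0
    for msg in messages:
        if not cheap_filter(msg):   # layer 1 rejects: no LLM cost incurred
            continue
        llm_calls += 1              # layer 2: expensive LLM confirmation
        if llm_classify(msg):
            accepted.append(msg)
    savings = 1 - llm_calls / max(len(messages), 1)
    return accepted, savings
```

With a realistic stream where most group chatter is unrelated to blood donation, the fraction of skipped LLM calls, and hence the cost saving, grows accordingly.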
### G.3 Developing a multi-platform solution for diverse demographics
The Association for the Advancement of Blood & Biotherapies (AABB) reports that the average blood donor is typically college-educated and aged 30–50 years (Zambon, [2020](https://arxiv.org/html/2604.16665#bib.bib42)). Younger adults, particularly those aged 18–25, are increasingly likely to donate blood (America's Blood Centers, [2024](https://arxiv.org/html/2604.16665#bib.bib43)). While males have historically donated more frequently than females (Zambon, [2020](https://arxiv.org/html/2604.16665#bib.bib42)), this gap is narrowing as more females become regular donors. Additionally, white individuals tend to donate at higher rates than Black, Hispanic, and Asian populations (America's Blood Centers, [2024](https://arxiv.org/html/2604.16665#bib.bib43)). Donors from higher socioeconomic backgrounds are more likely to donate, often due to better access to healthcare facilities and donation centers (Zambon, [2020](https://arxiv.org/html/2604.16665#bib.bib42)). Table [12](https://arxiv.org/html/2604.16665#A7.T12) shows that the age group most likely to donate blood is also among the most active on social media platforms. This demographic overlap is significant; males are slightly more active on social media (53.4%) than females (46.6%). The similarity between blood donor and social media user demographics indicates that social media can be a powerful tool for identifying potential donors across demographic groups, and integrating bots on these platforms can effectively trace and engage potential donors from all demographic categories.
Our results show that a multi-platform approach greatly broadens the donor network by engaging diverse demographics across popular platforms. Each platform offers unique strengths. Telegram is well suited to exchanging messages, sharing media and files, and supporting private or group calls (Yinka and Queendarline, [2018](https://arxiv.org/html/2604.16665#bib.bib44)). Facebook focuses on connecting communities and is effective for creating and maintaining support groups that foster awareness and keep people updated on ongoing donation needs (Rauch and Schanz, [2013](https://arxiv.org/html/2604.16665#bib.bib46)). Discord, initially popular for gaming, allows real-time text, voice, and video communication in community-centered "servers", which helps reach younger, tech-savvy users (Kruglyk et al., [2020](https://arxiv.org/html/2604.16665#bib.bib45)). Our survey highlighted that not all blood types are equally available; rare types like O- and AB- are often harder to find. Limiting the donor search to a single platform would risk missing donors who frequently use other social spaces. By adopting a multi-platform strategy, we increase the probability of reaching donors with diverse blood types and availability. Our survey results also confirmed that a multi-platform approach increases the donor pool.
When asked for feedback on areas for improvement, participants suggested extending CBRS to other social media channels such as WhatsApp and Facebook\. This aligns with our future research plans to integrate more platforms and ensure wider coverage\.
Table 12: Social media users by different age groups (Khoros, [2024](https://arxiv.org/html/2604.16665#bib.bib54))

| Age Group | Age Range | Social Media Users |
| --- | --- | --- |
| Gen Z | 11–26 | 56.4M |
| Gen X | 43–58 | 51.8M |
| Baby Boomers | 59–77 | 36.9M |
### G.4 Exploring slash command prompts and user interface design for multi-platform bots
The user interface of bots is often referred to as the "universal UI" due to its flexibility and ease of use across multiple platforms (Santhanam et al., [2022](https://arxiv.org/html/2604.16665#bib.bib39)). Integrating command prompt mechanisms into these systems has greatly enhanced their utility, facilitating quicker task completion and reducing the need for extensive documentation (Santhanam et al., [2022](https://arxiv.org/html/2604.16665#bib.bib39)). In our findings, we explored how these design choices influenced overall user satisfaction within CBRS. We asked users about their perception of command usability and the bot's user interface, shown in Figure [17](https://arxiv.org/html/2604.16665#A6.F17). Most reported satisfaction with the performance of CBRS. Our analysis demonstrated that command usability and interface intuitiveness play a pivotal role in fostering a positive user experience. Pearson's correlation analysis revealed strong relationships between command usability, intuitiveness, and the satisfaction index, indicating that an intuitive, accessible interface is crucial for user engagement. Spearman's rank correlation confirmed these insights, showing a consistently positive relationship between command usability and user satisfaction: as command usability improved, satisfaction increased, and intuitive design elements in the interface had a statistically significant impact. This analysis shows that a quality user interface and easy command access are key to a smooth user experience on CBRS.
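Correlation analyses of this kind are straightforward to reproduce. The Likert-style scores below are invented for illustration (not the study's data), and the `pearson`/`spearman` helpers are plain-Python stand-ins for standard routines such as SciPy's `pearsonr` and `spearmanr`.

```python
from statistics import mean

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def ranks(v):
    """Ranks (1-based), with tied values assigned their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(v):
        j = i
        while j + 1 < len(v) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Illustrative 1-5 survey scores for ten hypothetical respondents.
command_usability = [4, 5, 3, 4, 5, 2, 4, 5, 3, 4]
satisfaction      = [4, 5, 3, 4, 4, 2, 5, 5, 3, 4]
```

On this toy sample both coefficients come out strongly positive, matching the qualitative pattern the survey analysis reports.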
In our design phase, we selected Telegram and Discord for their engaging conversational interfaces and shared support for slash commands, and explored these platforms to enhance interaction efficiency. By opting for a structured command interface rather than real-time natural language parsing, we aimed to reduce miscommunication and increase response speed. This design choice made the bot more accessible and user-friendly, ultimately resulting in higher satisfaction levels. To further streamline the experience, we developed a single-page web application that simplifies donor data entry, capturing essential information such as blood group, last donation date, and GPS location directly from the user's browser. Combined with unique URLs linked to users' chat platform identities, this allowed donors to update their details effortlessly without needing to re-identify themselves. These decisions facilitated seamless information management and had a positive impact on satisfaction, as users could easily update their information from the chat interface; features such as simplified data entry and unique URLs enhanced usability and reduced friction. A multi-platform command prompt and user interface further improved interactions by ensuring consistent access and intuitive navigation across platforms. When we solicited open-ended feedback on potential improvements to CBRS, users highlighted the need for a dashboard displaying donation requests, reflecting a strong desire for more organized and accessible information. We plan to delve deeper into this feedback to refine and elevate the user experience.
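A structured slash-command interface of the sort described can be handled by a small parsing routine. The `/request` command name and the `key=value` argument shape below are hypothetical illustrations, not the actual CBRS command set.

```python
import re

# Matches key=value pairs, where the value is either a quoted string
# (possibly containing spaces) or a single bare token.
ARG_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')

def parse_slash_command(text: str):
    """Parse '/name key=value ...' into (command, {key: value}).

    Returns None for messages that are not slash commands, so ordinary
    chat passes through untouched."""
    if not text.startswith("/"):
        return None
    name, _, rest = text[1:].partition(" ")
    args = {key: (quoted if quoted else bare)
            for key, quoted, bare in ARG_RE.findall(rest)}
    return name, args
```

For example, `/request blood_group=O- bags=2 hospital="BARDEM Hospital"` yields the command name `request` and a field dictionary, with no ambiguity for the bot to resolve, which is the advantage of structured commands over free-form parsing.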
## Appendix H: Examples of Misclassified and Misparsed Samples
Table[13](https://arxiv.org/html/2604.16665#A8.T13)presents representative failure cases from the classification pipeline, illustrating false positives and false negatives produced by different classifiers\. Tables[14](https://arxiv.org/html/2604.16665#A8.T14),[15](https://arxiv.org/html/2604.16665#A8.T15), and[16](https://arxiv.org/html/2604.16665#A8.T16)present recurring parsing errors including hallucinated fields, date normalization failures, and untranslated output across multiple models and languages\.
Table 13: Representative misclassified examples across different classifiers.

| Message | Ground Truth | Predicted | Classifier |
| --- | --- | --- | --- |
| Its your blood that can save another life. Dear friends, the aim of this Group is to help ourselves through Facebook when we are in need. Anybody from anywhere can post his/her blood urgency here and i wish we ourselves will come up with whatever we have… After all, WE LOVE OUR FRIENDS & FAMILY WITH EVERYTHING. | false | true | Word2Vec + SVM |
| Apnar blood group ki? Amar blood group A+. Apni ki blood dite ichchhuk? Prochur blood-er request ashteche, location ta janaben. | false | true | ParaMiniLM + Logistic Regression |
| Amar rokter group B+. Chokbazar Medical chara je kono sthane rokto donate korte raji achi. Number: 018XXXXXXX (<NAME>). Karo proyojon hole amar ID ebong number take mention kore din. | false | true | MiniLM12 + Random Forest |
| Keo ki (B-) B negative blood donate korte parba. Monday ratre operation. | true | false | MiniLM12 + Random Forest |

Table 14: Representative misparsed example: hallucinated hospital/location and missing condition field.

Input message: "Dhaka-r Shahbage" ekjon operation-er rogir jonno aj (14-06-21) shondha 7 tar moddhe 2 bag "O negative" [O-ve] rokter proyojon. [14th Floor, Ward#12] Jogajog: BARDEM Hospital, phone: 018XXXXXXX (rogir attio).

| Field | Ground truth | Model output |
| --- | --- | --- |
| blood_group | O- | O- |
| bags_needed | 2 | 2 |
| patient.name | --- | --- |
| patient.gender | --- | --- |
| patient.age_group | --- | --- |
| condition | operation | --- |
| location | Dhaka-r Shahbage | Kidney Foundation & Research Institute, Dhaka |
| location_markers | Dhaka-r Shahbage | Dhaka |
| hospital_name | BARDEM Hospital | Kidney Foundation & Research Institute |
| probable_day | 14/06/2021 | 14/06/21 |
| probable_time | before 19:00 | before 19:00 |
| contacts[0].name | --- | --- |
| contacts[0].numbers | 018XXXXXXX | 018XXXXXXX |
| contacts[0].relation | rogir attio | --- |
| compensation.transportation | --- | --- |
| compensation.allowance | --- | --- |

Model / setting: Meta-Llama-3.1-8B-Instruct, few-shot, Bengali.
Error type: Hallucinated hospital/location; missing condition; date normalization error.
Comment: The model correctly extracts blood group, bag count, time, and contact number, but hallucinates an entirely different hospital and location (Kidney Foundation & Research Institute, Dhaka instead of BARDEM Hospital, Shahbag). It also fails to extract the condition (operation) and omits the donor relation field. The date is unnormalized (14/06/21 vs. 14/06/2021).

Table 15: Representative misparsed example: hallucinated probable time and date normalization error.

Input message: #Delhi 2 Units AB-ve #Blood #urgent #need at AIIMS Hospital #Jun_21 Pls Call 981XXXXXXX or 724XXXXXXX @TajinderBagga @nistula @upma23 #BloodMatters

| Field | Ground truth | Model output |
| --- | --- | --- |
| blood_group | AB- | AB- |
| bags_needed | 2 | 2 |
| patient.name | --- | --- |
| patient.gender | --- | --- |
| patient.age_group | --- | --- |
| condition | --- | --- |
| location | AIIMS Hospital | AIIMS Hospital |
| location_markers | Delhi | AIIMS Hospital |
| hospital_name | AIIMS Hospital | AIIMS Hospital |
| probable_day | 21/06 | Jun_21 |
| probable_time | --- | before 24:00 |
| contacts[0].name | --- | --- |
| contacts[0].numbers | 981XXXXXXX, 724XXXXXXX | 981XXXXXXX, 724XXXXXXX |
| contacts[0].relation | --- | --- |
| compensation.transportation | --- | --- |
| compensation.allowance | --- | --- |

Model / setting: Meta-Llama-3.2-3B-Instruct, few-shot, English.
Error type: Hallucinated probable time; date normalization error; incorrect location marker.
Comment: The model correctly extracts blood group, bag count, hospital name, and contact numbers, but hallucinates a probable time (before 24:00) that has no basis in the input. It also fails to normalize the date (Jun_21 instead of 21/06) and incorrectly uses the hospital name as the location marker instead of the city (Delhi).

Table 16: Representative misparsed example: incorrect compensation fields, hallucinated time, and untranslated/unnormalized output fields.

Input message: Assalamu Alaikum. Esho kori rokto dan, hashbe rogi, bachbe pran. Ajkei dorkar, pliz je dite ichchhuk doya kore ektu jogajog korben. Rogir shomossha: khubi mumurshu obosthay medical-e vorthi ache (boyoshko). Rokter group: O-. Rokter poriman: onirnito (ekadhik lagbe). Rokto daner tarikh: joto druto shombhob, 30-10-2022. Rokto daner sthan: Rajshahi Medical College (alochona kore je kono jagha theke neya jabe). Jogajog: 017XXXXXXX.

| Field | Ground truth | Model output |
| --- | --- | --- |
| blood_group | O- | O- |
| bags_needed | --- | onirnito (ekadhik lagbe) |
| patient.name | --- | --- |
| patient.gender | --- | M |
| patient.age_group | adult | adult |
| condition | very critical, admitted in medical | khubi mumurshu obosthay medical-e vorthi ache |
| location | Rajshahi Medical College | Rajshahi Medical College |
| location_markers | Rajshahi | Rajshahi |
| hospital_name | Rajshahi Medical College | Rajshahi Medical College |
| probable_day | 30/10/2022 | 30/10/2022 |
| probable_time | --- | in as soon as possible |
| contacts[0].name | --- | --- |
| contacts[0].numbers | 017XXXXXXX | 017XXXXXXX |
| contacts[0].relation | --- | --- |
| compensation.transportation | --- | N |
| compensation.allowance | --- | N |

Model / setting: Claude-3-Haiku, zero-shot, Bengali.
Error type: Incorrect compensation fields; hallucinated probable time; untranslated/unnormalized condition and bags_needed; hallucinated patient gender.
Comment: The model correctly extracts blood group, hospital name, location, and date, but makes several errors: it fills compensation fields with "N" instead of leaving them blank (compensation was not mentioned in the input); it hallucinates a probable time ("in as soon as possible") and a patient gender (M) not stated in the input; and it copies the raw Bengali text for condition and bags_needed rather than normalizing them to English canonical forms.