Tag
This paper presents a sentiment analysis and spam detection system for Arabic tweets using the MARBERT model, trained on a dataset of 24,513 tweets to improve customer service for Saudi Telecom Company.
This paper presents a methodology for digitizing the Al-Mawrid Arabic-English dictionary using ISO LMF and TEI Lex-0 standards, achieving high parsing accuracy and precision, and addressing gaps in Arabic lexical infrastructure.
This paper presents an overview of the QIAS 2026 shared task on Islamic inheritance reasoning, evaluating LLMs on multi-step legal and numerical reasoning using the MAWARITH benchmark.
This paper presents MentalMARBERT, a domain-adapted Arabic language model for detecting mental health disorders from social media text. The framework uses domain-adaptive pre-training and a two-stage fine-tuning approach, achieving 0.877 accuracy and 0.861 macro-F1 on a newly constructed Arabic mental health dataset of 50,670 tweets.
This paper introduces Cohesion-6K, a dataset of 6,000 Arabic Facebook posts about the Israeli Occupation of Palestine, annotated manually and with ChatGPT assistance across categories ranging from conflict to cohesion. Analysis shows that conflict-oriented posts receive 2-4x more engagement than resolution-oriented ones.
This paper presents the Arabic Women and Society Corpus, a ten-year collection of over 250,000 Arabic Facebook posts related to women's empowerment and social wellbeing, with engagement metrics for analyzing gender discourse and sentiment.
A comprehensive overview of twenty years of Arabic NLP research, discussing lessons, failures, and open problems in the field.
This paper presents a framework for Arabic financial sentiment analysis using LLMs, tailored for the Saudi market, integrating news and social media data to capture investor sentiment.
Researchers release SAHM, the first Arabic financial benchmark with 14,380 expert-verified instances covering Shari’ah-compliant reasoning, showing large performance gaps for 20 evaluated LLMs.
QIMMA is a new quality-first Arabic LLM leaderboard introduced by TII UAE that validates benchmarks before evaluation to ensure accurate performance measurement. It addresses systematic quality issues in existing Arabic NLP benchmarks through a rigorous multi-stage validation pipeline.
This paper presents Qatar University's multi-stage QLoRA fine-tuning approach on Qwen3-4B for Arabic Islamic inheritance reasoning. Domain adaptation on Islamic fatwa records is followed by task-specific training on 12,000 structured inheritance cases, achieving a 90% MIR-E score and matching commercial systems such as Gemini-2.5-flash with minimal computational resources.
This paper introduces the first parallel Arabic cultural QA benchmark spanning Modern Standard Arabic and multiple dialects, converting multiple-choice questions to open-ended formats and evaluating LLMs with chain-of-thought reasoning to address gaps in culturally grounded and dialect-specific knowledge.