Spam and Sentiment Detection in Arabic Tweets Using MARBERT Model
Summary
This paper presents a sentiment analysis and spam detection system for Arabic tweets using the MARBERT model, trained on a dataset of 24,513 tweets to improve customer service for Saudi Telecom Company.
# Spam and Sentiment Detection in Arabic Tweets Using MARBERT Model

Source: [https://arxiv.org/abs/2606.25495](https://arxiv.org/abs/2606.25495) [View PDF](https://arxiv.org/pdf/2606.25495)

> Abstract: Saudi Telecom Company (STC) is among the most popular companies in Saudi Arabia, with many customers. Yet there is still considerable room for improvement in user satisfaction. Social media is the most robust platform for gauging user satisfaction and identifying users' sentiments and criticisms, and Twitter is among the most popular social media platforms in this regard. STC customers prefer to post their feedback on Twitter because the STC customer service account responds quickly there. One way to meet customer demands and improve customer service is to use a Sentiment Analysis tool. Sentiment Analysis on Twitter is widely used because of the large number of tweets and the diversity of opinions. Likewise, deep learning is the best existing Sentiment Analysis method, and it offers diverse models. The Bidirectional Encoder Representations from Transformers (BERT) model is one of the deep learning models that have achieved excellent results in Sentiment Analysis for Natural Language Processing (NLP). NLP has mainly been investigated for the English language; for Arabic, there is a significant gap to be filled. This study trained the proposed model using MARBERT and measured its performance using F1-score, precision, and recall. We trained the model on an Arabic dataset of 24,513 tweets, including 1,437 positive, 13,828 negative, 5,694 neutral, 1,221 sarcasm, and 2,297 indeterminate tweets. The main goal is to analyze the tweets and extract their sentiment in order to improve STC customer service. The proposed scheme is promising in terms of accuracy compared with existing techniques in the literature.
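
The abstract names F1-score, precision, and recall as its metrics and lists a heavily imbalanced label distribution. The sketch below is not the authors' evaluation code. It is a minimal, standard-library illustration of how those three metrics are computed per class, and of why they are more informative than accuracy on data this skewed. The helper names (`per_class_metrics`, `macro_f1`) and the majority-class baseline are assumptions introduced for illustration only. Note that the five class counts listed in the abstract sum to 24,477, which is 36 fewer than the stated total of 24,513; the sketch uses the listed counts as given.

```python
# Class counts as listed in the abstract (they sum to 24,477, not 24,513).
CLASS_COUNTS = {
    "positive": 1437,
    "negative": 13828,
    "neutral": 5694,
    "sarcasm": 1221,
    "indeterminate": 2297,
}


def per_class_metrics(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)}, computed one-vs-rest."""
    metrics = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Define each metric as 0.0 when its denominator is zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def macro_f1(metrics):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, _, f1 in metrics.values()) / len(metrics)


labels = list(CLASS_COUNTS)
total = sum(CLASS_COUNTS.values())

# Hypothetical baseline (not from the paper): always predict the largest class.
y_true = [label for label, count in CLASS_COUNTS.items() for _ in range(count)]
majority = max(CLASS_COUNTS, key=CLASS_COUNTS.get)
y_pred = [majority] * total

baseline = per_class_metrics(y_true, y_pred, labels)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total

print(f"majority class: {majority}")
print(f"baseline accuracy: {accuracy:.3f}")
print(f"baseline macro-F1: {macro_f1(baseline):.3f}")
for label in labels:
    p, r, f = baseline[label]
    print(f"{label:>13}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Under these assumptions, a classifier that labels every tweet as negative is already right on about 56% of the listed tweets, yet it scores zero precision, recall, and F1 on the other four classes. This is one reason per-class metrics are a reasonable choice for a dataset with this distribution.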
## Submission history

From: Abrar Alotaibi [[view email](https://arxiv.org/show-email/23359762/2606.25495)]

**[v1]** Wed, 24 Jun 2026 07:22:39 UTC (1,058 KB)
Similar Articles
MentalMARBERT: Domain-Adaptive Pre-training and Two-Stage Fine-Tuning for Arabic Mental Health Disorders Detection
This paper presents MentalMARBERT, a domain-adapted Arabic language model for detecting mental health disorders from social media text. The framework uses domain-adaptive pre-training and a two-stage fine-tuning approach, achieving 0.877 accuracy and 0.861 macro-F1 on a newly constructed Arabic mental health dataset of 50,670 tweets.
LLM-Based Financial Sentiment Analysis in Arabic: Evidence from Saudi Markets
This paper presents a framework for Arabic financial sentiment analysis using LLMs, tailored for the Saudi market, integrating news and social media data to capture investor sentiment.
Automated Scoring of Arabic Text Using Large Language Models: A Literature Review
A literature review examining LLM-based approaches for automatic scoring of Arabic text, covering short answer grading and essay scoring, with a proposed taxonomy and comparative analysis.
Linear Semantic Segmentation for Low-Resource Spoken Dialects
This paper introduces a benchmark for semantic segmentation in low-resource dialectal Arabic and proposes a model that improves performance on conversational speech compared to standard baselines.
An End-to-End Hybrid Framework for Rumour Detection in Low-Resources Algerian Dialect
This paper presents an end-to-end hybrid framework for rumour detection in low-resource Algerian dialect social media content, achieving an F1-score of 0.84 by combining transformer embeddings with a classical classifier.