Audience Engagement with Arabic Women's Social Empowerment and Wellbeing: A Decadal Corpus

arXiv cs.CL Papers

Summary

This paper presents the Arabic Women and Society Corpus, a ten-year collection of over 250,000 Arabic Facebook posts related to women's empowerment and social wellbeing, with engagement metrics for analyzing gender discourse and sentiment.


# Audience Engagement with Arabic Women's Social Empowerment and Wellbeing: A Decadal Corpus
Source: [https://arxiv.org/abs/2605.22204](https://arxiv.org/abs/2605.22204)
[View PDF](https://arxiv.org/pdf/2605.22204)

> Abstract: This paper presents the Arabic Women and Society Corpus, a ten-year collection of 252,487 public Arabic Facebook posts related to women's empowerment and social wellbeing. The corpus was collected from 51,660 pages across 77 countries between 2013 and 2024 and records more than 267 million user interactions. Each post includes engagement metrics such as shares, comments, and emotional reactions, providing a unique view of audience sentiment and social attention. The data were processed using an automated pipeline with language identification, normalization, and metadata cleaning to ensure reliability and reproducibility. The corpus enables large-scale analysis of gender discourse, social reform, and emotional engagement across Arabic dialects. It supports research in Arabic natural language processing, computational social science, and digital communication studies. The dataset and accompanying documentation will be released on request for research use.
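
The abstract names three processing steps (language identification, normalization, and metadata cleaning) without giving implementation details. The following is a minimal sketch of what such steps might look like, not the authors' pipeline. Every name here (`normalize_arabic`, `arabic_ratio`, `keep_post`, the `text` field, and the 0.5 threshold) is a hypothetical chosen for illustration, and the script-ratio check is a crude stand-in for a real language identifier.

```python
import re
import unicodedata

# Hypothetical illustration only: the abstract does not describe the actual pipeline.
TATWEEL = "\u0640"  # Arabic elongation character
# Map hamza/madda alef variants to bare alef (a common, lossy normalization choice).
ALEF_VARIANTS = str.maketrans({"\u0623": "\u0627", "\u0625": "\u0627", "\u0622": "\u0627"})


def normalize_arabic(text: str) -> str:
    """Strip diacritics and tatweel, unify alef variants, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    # Drop nonspacing marks (category Mn), which includes the Arabic harakat.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    text = text.replace(TATWEEL, "")
    text = text.translate(ALEF_VARIANTS)
    return re.sub(r"\s+", " ", text).strip()


def arabic_ratio(text: str) -> float:
    """Fraction of alphabetic characters that fall in the basic Arabic block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum("\u0600" <= ch <= "\u06FF" for ch in letters) / len(letters)


def keep_post(post: dict, min_ratio: float = 0.5) -> bool:
    """Keep a post whose text is mostly Arabic script (threshold is an assumption)."""
    return arabic_ratio(post.get("text", "")) >= min_ratio
```

A script-ratio filter like this cannot separate Arabic from other languages written in Arabic script (Persian or Urdu, for example), so a production pipeline would need a trained language identifier at that step.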

## Submission history

From: Wajdi Zaghouani \[[view email](https://arxiv.org/show-email/4fdede53/2605.22204)\]
**\[v1\]** Thu, 21 May 2026 09:10:09 UTC (427 KB)
