Automatic Part-of-Speech Tagging of Arabic-English Dictionary Senses through WordNet
Summary
This paper proposes a resource-light algorithm to automatically assign part-of-speech tags to senses in the Al-Mawrid Arabic-English bilingual dictionary by transferring tags from English WordNet after disambiguation, achieving high accuracy with minimal cost.
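The tag-transfer step can be illustrated with a minimal sketch. It is not the paper's implementation: the disambiguation procedure is not specified here, so a simple majority vote over a sense's English translation equivalents stands in for it. The names `TOY_POS`, `candidate_tags`, and `transfer_pos` are hypothetical, and the small lookup table is an illustrative stand-in for a real WordNet query.

```python
from collections import Counter

# Toy stand-in for a WordNet lookup: English lemma -> POS tags it can take.
# Illustrative entries only; a real system would query Princeton WordNet.
TOY_POS = {
    "book": {"n", "v"},
    "volume": {"n"},
    "write": {"v"},
    "compose": {"v"},
    "run": {"n", "v"},
}


def candidate_tags(te, lookup=TOY_POS):
    """Return the set of POS tags the lookup lists for one translation equivalent."""
    return lookup.get(te.lower(), set())


def transfer_pos(translation_equivalents, lookup=TOY_POS):
    """Pick a POS tag for a dictionary sense by voting over its translation equivalents.

    Assumption: majority voting replaces the paper's disambiguation step,
    which is not described in the summary. Returns the winning tag, or None
    when no translation equivalent is found in the lookup or the top tags tie.
    """
    votes = Counter()
    for te in translation_equivalents:
        votes.update(candidate_tags(te, lookup))  # one vote per possible tag
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # ambiguous: leave the sense untagged for manual review
    return ranked[0][0]


if __name__ == "__main__":
    print(transfer_pos(["book", "volume"]))   # "book" alone is ambiguous; "volume" settles it
    print(transfer_pos(["write", "compose"]))
    print(transfer_pos(["run"]))              # tie between noun and verb
```

The point of the sketch is that a sense with several translation equivalents often disambiguates itself: an ambiguous word such as "book" is resolved by an unambiguous neighbour such as "volume". Senses that remain tied are returned as `None` rather than guessed.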
# Automatic Part-of-Speech Tagging of Arabic-English Dictionary Senses through WordNet

Source: [https://arxiv.org/abs/2606.24359](https://arxiv.org/abs/2606.24359) [View PDF](https://arxiv.org/pdf/2606.24359)

> Abstract: This paper proposes an algorithm for part-of-speech (POS) tagging of the senses of a bilingual dictionary. The algorithm is applied to the Al-Mawrid Arabic-English dictionary. Tagging is accomplished by transferring the POS tags of the English translation equivalents (TEs) to the dictionary senses after a disambiguation process. The English POS tags are acquired from the Princeton WordNet. POS tagging of bilingual dictionary senses is a prerequisite for linking a bilingual dictionary to WordNet and/or standardizing that dictionary into the WordNet-LMF format, where the synset (set of synonyms), not the word, is the basic building block. The reported accuracy is high and the cost is low. Building NLP/HLT tools requires linguistic experts, large investments, and a long time. A statistical approach needs large annotated corpora, and a rule-based approach needs a large lexicon that contains rich linguistic and world knowledge. This motivates so-called resource-light approaches to developing natural language processing (NLP) tools for resource-poor languages.

## Submission history

From: Diaa Fayed

**[v1]** Tue, 23 Jun 2026 09:49:26 UTC (629 KB)
Similar Articles
Towards Structuring an Arabic-English Machine-Readable Dictionary Using Parsing Expression Grammars
This paper presents a method to structure the Arabic-English Al-Mawrid dictionary using parsing expression grammars, converting entries into hierarchical structures for NLP applications.
Analyzing and Encoding the Al-Mawrid Arabic-English Dictionary with the ISO Language Markup Framework and TEI Lex-0
This paper presents a methodology for digitizing the Al-Mawrid Arabic-English dictionary using ISO LMF and TEI Lex-0 standards, achieving high parsing accuracy and precision, and addressing gaps in Arabic lexical infrastructure.
Automated Scoring of Arabic Text Using Large Language Models: A Literature Review
A literature review examining LLM-based approaches for automatic scoring of Arabic text, covering short answer grading and essay scoring, with a proposed taxonomy and comparative analysis.
MentalMARBERT: Domain-Adaptive Pre-training and Two-Stage Fine-Tuning for Arabic Mental Health Disorders Detection
This paper presents MentalMARBERT, a domain-adapted Arabic language model for detecting mental health disorders from social media text. The framework uses domain-adaptive pre-training and a two-stage fine-tuning approach, achieving 0.877 accuracy and 0.861 macro-F1 on a newly constructed Arabic mental health dataset of 50,670 tweets.
Linear Semantic Segmentation for Low-Resource Spoken Dialects
This paper introduces a benchmark for semantic segmentation in low-resource dialectal Arabic and proposes a model that improves performance on conversational speech compared to standard baselines.