natural-language-processing

#natural-language-processing

The Future of NLP may not be at NLP Conferences: Scholarly Migration Patterns in Natural Language Processing

arXiv cs.CL ↗ · 11h ago Cached

A study analyzing 142K NLP papers from 2010–2026 finds that both established and new NLP authors are increasingly publishing in general ML venues like NeurIPS and ICLR rather than core NLP conferences like ACL, with a significant citation premium favoring ML venues.

BamiBERT: A New BERT-based Language Model for Vietnamese

arXiv cs.CL ↗ · 11h ago Cached

BamiBERT is a new BERT-based pre-trained language model for Vietnamese that addresses limitations of PhoBERT, supporting longer context and operating without word segmentation, achieving state-of-the-art results on multiple Vietnamese benchmarks.

RusFinChain: A Russian Benchmark for Verifiable Chain-of-Thought Reasoning in Finance with Fuzzy-Aligned Evaluation

arXiv cs.CL ↗ · 11h ago Cached

Introduces RusFinChain, the first Russian-language symbolic benchmark for verifiable chain-of-thought reasoning in finance, spanning 17 domains with 5,280 parameterized examples and enhanced evaluation metrics including fuzzy numeric alignment.

Svarna: An Open Corpus Workbench for Modern Greek

arXiv cs.CL ↗ · yesterday Cached

Svarna is an open-source web-based corpus workbench for Modern Greek, integrating multiple databases with over 507 million words and providing various linguistic analysis tools, released under MIT license.

The Course of News Events: A Comparison of Bottom-Up and Top-Down Approaches for Collecting Text-Based Data about Disasters

arXiv cs.CL ↗ · yesterday Cached

This paper compares top-down and bottom-up approaches for collecting text-based data about disasters from news articles, using German news about landslides as a case study.

Faithful by Definition: Emotion Analysis via Natural Semantic Metalanguage Explications

arXiv cs.CL ↗ · yesterday Cached

This paper proposes an emotion analysis interface that uses Natural Semantic Metalanguage (NSM) to generate faithful, interpretable explanations for emotion classifications, trading a slight loss in accuracy for verifiability.

Hate Speech Detection in Turkish and Arabic Languages: A Comprehensive Study

arXiv cs.CL ↗ · yesterday Cached

Introduces a comprehensive hate speech dataset for Turkish and Arabic, and develops state-of-the-art BERT-based models for hate speech analysis including classification, intensity prediction, target identification, and span detection.

LabGuard: Grounding Natural-Language Laboratory Rules into Runtime Guards for Embodied Laboratory Agents

arXiv cs.AI ↗ · 2d ago Cached

LabGuard introduces a framework that translates natural-language laboratory safety rules into executable runtime monitors for embodied agents, reducing unsafe events from 39.5% to 23.8% while maintaining task success.

Beyond Compilation: Evaluating Faithful Natural-Language-to-Lean Statement Formalization

arXiv cs.AI ↗ · 2d ago Cached

This paper presents a benchmark and evaluation protocol for faithful natural-language-to-Lean statement formalization, revealing a 29-point gap between compile-pass and consensus-faithfulness, and decomposing the effects of expert drafting, context search, and elaboration feedback.

Exploring the relationship between team institutional composition and novelty in academic papers based on fine-grained knowledge entities

arXiv cs.CL ↗ · 2d ago Cached

This study examines how team institutional composition (academic, industrial, or mixed) affects the novelty of academic papers in NLP, using fine-grained knowledge entities like methods and datasets to measure novelty.

Linguistic Distancing on Social Media: Indicators of Emotion Regulation Across Age Groups

arXiv cs.CL ↗ · 2d ago Cached

This paper investigates linguistic distancing as an indicator of emotion regulation across age groups using social media text, finding that linguistic distancing increases with age, consistent with improved well-being in older adults.

Beyond Clean Text: Evaluating Encoder and Decoder Robustness for Bangla Event Detection in Noisy Text

arXiv cs.CL ↗ · 2d ago Cached

This paper introduces a Bangla event detection benchmark with noisy text (ASR errors, orthographic corruption) and evaluates encoder-only and decoder-only models, finding decoder models more robust to noise.

A Hybrid Framework for Song Lyric Annotation Based on Human-LLM Alignment

arXiv cs.CL ↗ · 3d ago Cached

This paper introduces a hybrid framework for sentence-level emotion annotation of song lyrics that optimizes human and LLM collaboration by predicting misalignment, addressing subjectivity and scalability challenges in lyric emotion recognition.

BERTomelo: Your Portuguese Encoder Best Friend

arXiv cs.CL ↗ · 3d ago Cached

This paper introduces BERTomelo, a next-generation monolingual encoder pre-trained for Portuguese using the ModernBERT architecture, achieving superior performance on downstream tasks like STS and NER compared to previous Portuguese and multilingual models.

SEATauBench: Adapting Tool-Agent-User Evaluation Into Low-Resource Southeast Asian Languages

arXiv cs.CL ↗ · 3d ago Cached

Introduces SEATauBench, the first agent-focused evaluation framework for Southeast Asian languages, adapting τ²-Bench to Mandarin, Vietnamese, Thai, Indonesian, and Filipino, and reveals a significant capability gap when moving from English to localized settings.

Ontology-Guided Evidence Path Inference for Multi-hop Knowledge Graph Question Answering

arXiv cs.AI ↗ · 4d ago Cached

Proposes OPI, an ontology-guided framework for multi-hop knowledge graph question answering that leverages a relation-centric ontology graph for bidirectional retrieval and iterative refinement, achieving state-of-the-art results on multiple benchmarks.

Learning Complementary Action Modeling from Automotive Maintenance Instructions

arXiv cs.CL ↗ · 4d ago Cached

This paper introduces Complementary Action Modeling (CAM), a task that identifies or generates procedural counterparts of automotive maintenance instructions by modifying the action phrase while preserving context. Using a German automotive dataset, the authors examine candidate matching and controlled Seq2Seq generation to model these complementary instructions.

Term-Centric Hierarchy Induction from Heterogeneous Corpora

arXiv cs.CL ↗ · 2026-06-26 Cached

Proposes a term-centric framework for inducing hierarchical taxonomies from heterogeneous text sources, enabling cross-source alignment and interpretable hierarchies. Experiments on a multi-source benchmark demonstrate improved coherence and quality over text- and summary-based baselines.

Evaluation Pitfalls and Challenges in Multimedia Event Extraction

arXiv cs.CL ↗ · 2026-06-26 Cached

This paper presents a systematic analysis of evaluation pitfalls in multimedia event extraction, identifying issues such as inconsistent data processing, inconsistent task assumptions, and overly relaxed evaluation settings that can lead to overestimated performance.

Beyond Logical Forms: LLM-Extracted Patterns for Fallacy Classification

arXiv cs.CL ↗ · 2026-06-26 Cached

This paper proposes a framework for fallacy classification that uses LLMs to extract patterns from fallacious examples and their explanations, achieving statistically significant improvements over zero-shot baselines and demonstrating cross-dataset generalization.
