Towards a Linguistic Evaluation of Narratives: A Quantitative Stylistic Framework

arXiv cs.CL Papers

Summary

A preprint proposes a 33-feature quantitative linguistic framework that distinguishes professionally edited from self-published books and outperforms existing story-level evaluation metrics.


# Towards a Linguistic Evaluation of Narratives: A Quantitative Stylistic Framework
Source: [https://arxiv.org/abs/2604.19261](https://arxiv.org/abs/2604.19261)
[View PDF](https://arxiv.org/pdf/2604.19261)

> Abstract: The evaluation of narrative quality remains a complex challenge, as it involves subjective factors such as plot, character development, and emotional impact. This work proposes a quantitative approach to narrative assessment by focusing on the linguistic dimension as a primary indicator of quality. The paper presents a methodology for the automatic evaluation of narrative based on the extraction of a comprehensive set of 33 quantitative linguistic features categorized into lexical, syntactic, and semantic groups. To test the model, an experiment was conducted on a specialized corpus of 23 books, including canonical masterpieces and self-published works. Through a similarity matrix, the system successfully clustered the narratives, distinguishing almost perfectly between professionally edited and self-published texts. Furthermore, the methodology was validated against a human-annotated dataset; it significantly outperforms traditional story-level evaluation metrics, demonstrating the effectiveness of quantitative linguistic features in assessing narrative quality.
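The pipeline the abstract describes (per-text feature extraction, then a pairwise similarity matrix over feature vectors) can be sketched in miniature. The three features below (type-token ratio, mean word length, mean sentence length) are illustrative stand-ins chosen here, not the paper's actual 33-feature set, and cosine similarity is one plausible choice for building the matrix; the abstract does not specify the measure used.

```python
import math
import re

def extract_features(text):
    """Compute a tiny illustrative feature vector.
    The paper extracts 33 lexical, syntactic, and semantic features;
    these three common stylometric measures are stand-ins."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    type_token_ratio = len(set(words)) / len(words)          # lexical diversity
    mean_word_len = sum(len(w) for w in words) / len(words)  # lexical complexity
    mean_sent_len = len(words) / len(sentences)              # crude syntactic proxy
    return [type_token_ratio, mean_word_len, mean_sent_len]

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def similarity_matrix(texts):
    """Pairwise similarity matrix over per-text feature vectors."""
    feats = [extract_features(t) for t in texts]
    return [[cosine(u, v) for v in feats] for u in feats]

# Two toy passages standing in for the 23-book corpus.
corpus = [
    "The old man sat by the sea. He watched the waves and remembered.",
    "It was amazing. It was so amazing. Really really amazing stuff.",
]
sim = similarity_matrix(corpus)
```

In practice the features would be standardized (e.g. z-scored across the corpus) before computing similarities, since raw scales differ widely; the resulting matrix can then be fed to any standard clustering method to separate the two groups of texts.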

## Submission history

From: Alessandro Maisto [[view email](https://arxiv.org/show-email/2b9a5b8d/2604.19261)] **[v1]** Tue, 21 Apr 2026 09:21:40 UTC (827 KB)

Similar Articles

Reward Modeling for Scientific Writing Evaluation

arXiv cs.CL

This paper proposes SciRM, a family of cost-efficient, open-source reward models tailored to evaluating scientific writing, trained with a two-stage framework that optimizes evaluation preferences and reasoning capabilities. The models generalize across diverse scientific writing tasks without task-specific retraining, addressing the limitations of existing LLM-based judges on domain-specific evaluation criteria.

SwanNLP at SemEval-2026 Task 5: An LLM-based Framework for Plausibility Scoring in Narrative Word Sense Disambiguation

arXiv cs.CL

SwanNLP presents an LLM-based framework for plausibility scoring in narrative word sense disambiguation at SemEval-2026 Task 5, using structured reasoning and dynamic few-shot prompting to predict human-perceived plausibility of word senses in short stories. The work demonstrates that commercial large-parameter LLMs with few-shot prompting and model ensembling effectively replicate human judgment patterns in realistic narrative contexts.