Tag
Researchers from Kennesaw State University investigate cross-prompt generalization in detecting AI-generated fake news using interpretable linguistic features (lexical diversity, readability, emotion). A random forest classifier trained on one prompting strategy and tested on another achieves AUC values of 0.988–1.000, suggesting these features capture stable, generalizable properties of AI-generated text.
A large-scale empirical study analyzes 284 linguistic features across 27 LLMs and 10 text domains to assess which features reliably detect AI-generated text. The study finds that lexical richness measures are the most robust cross-domain and cross-model signals, while many other proposed indicators are strongly context-dependent.
A tweet announcing a tentative dating of Sanskrit literature at Dharmamitra.org based on linguistic features, with a link to the platform, which uses AI (including the Gemini API) to support scholarly study and translation of ancient texts.
This paper investigates how alignment training objectives reshape linguistic features in large language models, finding that instruction-tuned systems collapse language entropy significantly more than scale would suggest, and that entropy regularization can mitigate this collapse.