information-density

Measuring information density in web pages from an LLM agent's perspective [R]

Reddit r/MachineLearning · 2026-05-08

This paper presents empirical measurements of information density in web pages from the perspective of LLM agents, using a curated benchmark of 100 URLs across five categories. It finds that structural extraction reduces token count by an average of 71.5% while preserving answer quality, and reveals an undocumented compression layer in Claude Code.

A Mechanism and Optimization Study on the Impact of Information Density on User-Generated Content Named Entity Recognition

arXiv cs.CL · 2026-04-22

This arXiv preprint identifies low information density as the root cause of NER performance collapse on noisy user-generated content and introduces the Window-Aware Optimization Module (WOM), which improves F1 by up to 4.5% on WNUT2017.

Revisiting the Uniform Information Density Hypothesis in LLM Reasoning

arXiv cs.CL · 2026-04-20

This paper revisits the Uniform Information Density (UID) hypothesis in the context of LLM reasoning, introducing an entropy-based framework to quantify information flow uniformity. Across seven reasoning benchmarks, the authors find that high-quality reasoning exhibits local uniformity in step transitions but global non-uniformity in trajectory structure, suggesting LLM reasoning differs fundamentally from human communication patterns.
