Adam's Law: Textual Frequency Law on Large Language Models

Papers with Code Trending Tools

Summary

This article introduces AdamOpt, an open-source tool based on 'Adam's Law' that optimizes prompts by replacing low-frequency words with high-frequency synonyms to reduce perplexity. It highlights the tool's bilingual support, offline capability, and practical performance improvements in text generation.

While textual frequency has been validated as relevant to human cognition in reading speed, its relationship to Large Language Models (LLMs) is seldom studied. We propose a novel research direction centered on textual data frequency, which is, to the best of our knowledge, an understudied topic. Our framework is composed of three units. First, this paper proposes the Textual Frequency Law (TFL), which states that frequent textual data should be preferred for LLMs in both prompting and fine-tuning. Since the training data of many LLMs is closed-source, we propose using online resources to estimate sentence-level frequency. We then utilize an input paraphraser to rewrite the input into a more frequent textual expression. Next, we propose Textual Frequency Distillation (TFD), which queries LLMs to perform story completion by further extending the sentences in the datasets; the resulting corpora are used to adjust the initial estimation. Finally, we propose Curriculum Textual Frequency Training (CTFT), which fine-tunes LLMs in increasing order of sentence-level frequency. Experiments are conducted on our curated Textual Frequency Paired Dataset (TFPD), covering math reasoning, machine translation, commonsense reasoning, and agentic tool calling. Results show the effectiveness of our framework.
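The abstract does not spell out how sentence-level frequency is computed or how the CTFT ordering is applied, so the following is only a minimal sketch under stated assumptions: sentence-level frequency is approximated as the geometric mean of per-word corpus frequencies from the `wordfreq` package (an assumed proxy, not the paper's estimator), `ctft_order` is a hypothetical helper name, and the demo strings are illustrative.

```python
import math
from wordfreq import word_frequency  # corpus unigram frequencies in [0, 1]

def sentence_frequency(sentence: str, lang: str = "en") -> float:
    """Assumed proxy for the paper's sentence-level frequency:
    geometric mean of per-word corpus frequencies."""
    words = sentence.lower().split()
    if not words:
        return 0.0
    log_sum = 0.0
    for w in words:
        f = word_frequency(w.strip(".,;:!?"), lang)
        log_sum += math.log(f if f > 0 else 1e-12)  # floor for unseen words
    return math.exp(log_sum / len(words))

def ctft_order(examples: list[str], lang: str = "en") -> list[str]:
    """CTFT ordering as described in the abstract: fine-tuning
    examples sorted in increasing order of sentence-level frequency."""
    return sorted(examples, key=lambda s: sentence_frequency(s, lang))

if __name__ == "__main__":
    demo = ["the celestial firmament displays optical causation",
            "the sky shows why light looks blue"]
    for s in ctft_order(demo):
        print(f"{sentence_frequency(s):.3e}  {s}")
```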

Paper page - Adam’s Law: Textual Frequency Law on Large Language Models

Source: https://huggingface.co/papers/2604.02176

Great paper! The insight that “high-frequency text → lower perplexity → better LLM performance” is beautifully simple yet powerful.
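For context, the perplexity mentioned here is the standard definition below (a textbook formula, not one quoted from the paper); the implication is that token sequences the model assigns higher probability, typically higher-frequency text, score lower:

```latex
% Standard perplexity of a model p over tokens x_1, ..., x_N:
\mathrm{PPL}(x_{1:N}) = \exp\!\Big(-\frac{1}{N}\sum_{i=1}^{N}\log p(x_i \mid x_{<i})\Big)
```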

I was so inspired by this work that I built AdamOpt — an open-source tool that turns Adam’s Law into a practical, one-command optimization pipeline:

What it does:

- `adamopt optimize "your prompt"` → automatically replaces low-frequency bottleneck words/phrases with higher-frequency synonyms (see the sketch after this list)
- Three modes: conservative (word-level, ≥99% fidelity), balanced (word+phrase), aggressive (full rewrite)
- Automatically locks entities, numbers, logic keywords, and constraints, so semantic meaning is never broken
- Chinese & English bilingual, works offline, zero LLM API cost
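Here is a hedged sketch of that replace-with-locking step, not AdamOpt's actual implementation: the `SYNONYMS` table and `LOCK_PATTERN` are illustrative assumptions, and only the conservative word-level mode is modeled.

```python
import re
from wordfreq import word_frequency

# Illustrative lock rule: keep numbers and capitalized entities untouched.
LOCK_PATTERN = re.compile(r"\d+|[A-Z][a-z]*")
# Illustrative candidate table; the tool's real synonym source is not shown here.
SYNONYMS = {"comprehend": ["understand", "grasp"],
            "methodology": ["method", "way"],
            "azure": ["blue"]}

def optimize(prompt: str, lang: str = "en") -> str:
    out = []
    for tok in prompt.split():
        bare = tok.strip(".,;:!?")
        if LOCK_PATTERN.fullmatch(bare) or bare not in SYNONYMS:
            out.append(tok)  # locked, or no candidate: keep as-is
            continue
        # Keep whichever candidate (including the original) is most frequent.
        best = max([bare] + SYNONYMS[bare],
                   key=lambda w: word_frequency(w, lang))
        out.append(tok.replace(bare, best))
    return " ".join(out)

print(optimize("In order to comprehend the methodology"))
# -> "In order to understand the way" (word-level only; the aggressive
#    mode would also drop "In order", per the results below)
```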

Real results from the tool:

- “optical causation...azure...celestial firmament” → “light cause...blue...sky” (sfreq +2735%)
- “详尽阐述” (“elaborate exhaustively”) → “详细讲” (“explain in detail”) (sfreq +48.8%, with “务必” (“be sure to”) and “3点” (“3 points”) auto-locked)
- “In order to comprehend the methodology” → “to understand the way” (sfreq +2150%)

85 tests passing, MIT licensed. Modules 1-2 are done; Modules 3-5 (semantic verification, batch SFT data processing with CTFT sorting, API & Web Demo) are open for contribution.
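The sfreq gains quoted above are before/after ratios of sentence-level frequency. Using the geometric-mean proxy sketched earlier (an assumed estimator, so the computed percentage will differ from the tool's reported +2150%), the third example can be reproduced roughly as:

```python
# Depends on sentence_frequency() from the earlier sketch.
before = sentence_frequency("In order to comprehend the methodology")
after = sentence_frequency("to understand the way")
print(f"sfreq gain: {100 * (after / before - 1):+.0f}%")
```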

Repo: https://github.com/happyii/AdamOpt

If you’re working with prompts or fine-tuning data, give it a try. PRs, issues, and stars are all welcome — let’s make prompt optimization a solved problem. 🚀

Similar Articles

Scaling Laws for Neural Language Models

OpenAI Blog

Foundational empirical study demonstrating power-law scaling relationships between language model performance and model size, dataset size, and compute budget, with implications for optimal training allocation and sample efficiency.
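For reference, the core relationship can be written as a power law; the fitted constants below are as reported by Kaplan et al. (2020), quoted here rather than derived:

```latex
% Loss vs. non-embedding parameter count N, from Kaplan et al. (2020):
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N},
\qquad \alpha_N \approx 0.076,\quad N_c \approx 8.8 \times 10^{13}
```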

Rethinking the Necessity of Adaptive Retrieval-Augmented Generation through the Lens of Adaptive Listwise Ranking

arXiv cs.CL

This paper proposes AdaRankLLM, an adaptive retrieval framework that challenges the necessity of adaptive RAG by using listwise ranking to dynamically filter retrieved passages. The work shows that adaptive retrieval serves as a noise filter for weaker models while acting as a cost-efficiency optimizer for stronger models, with extensive experiments across multiple datasets and LLMs.