entropy-regularization

AI Sovereignty as National Learning Capacity: A Human-Centered Learning Mechanics Viewpoint on France, the United States, and China

arXiv cs.AI · 2026-06-02

This viewpoint paper proposes interpreting national AI development as a learning system using Human-Centered Learning Mechanics, arguing that AI sovereignty depends on a country's ability to regulate its information dynamics. It provides a mathematical model and policy implications for France, reframing AI policy as governance of a non-equilibrium learning system.

From Context Shift to Stylistic Collapse: Why Training Objectives Matter More Than Scale

arXiv cs.CL · 2026-05-29

This paper investigates how alignment training objectives reshape linguistic features in large language models. It finds that instruction-tuned systems collapse language entropy significantly more than scale would suggest, and that entropy regularization can mitigate this collapse.

Refined Analysis of Entropy-Regularized Actor-Critic

arXiv cs.LG · 2026-05-26

This paper provides a refined theoretical analysis of actor-critic methods with entropy regularization. It shows that an exact critic acts as a strong variance reducer and enables sample complexity comparable to deterministic policy gradient, and that these benefits are preserved with a sufficiently accurate learned critic.

Human-Centered Learning Mechanics: A Dynamical Framework for Entropy-Regulated Representation Learning

arXiv cs.LG · 2026-05-25

This paper proposes Human-Centered Learning Mechanics (HCLM), a dynamical and information-theoretic framework for studying open and controlled learning systems. It formalizes entropy regularization through effective information force, derives convergence and generalization results, and provides a conditional interpretation of scaling-law behavior.

Revisiting Entropy Regularization: Adaptive Coefficient Unlocks Its Potential for LLM Reinforcement Learning

arXiv cs.CL · 2026-04-20

This paper proposes Adaptive Entropy Regularization (AER), a framework that dynamically balances exploration and exploitation in LLM reinforcement learning by addressing policy entropy collapse through difficulty-aware coefficient allocation and initial-anchored target entropy. Experiments on mathematical reasoning benchmarks demonstrate consistent improvements in both accuracy and exploration capability.

Equivalence between policy gradients and soft Q-learning

OpenAI Blog · 2017-04-21

OpenAI researchers demonstrate a precise mathematical equivalence between soft (entropy-regularized) Q-learning and policy gradient methods in reinforcement learning, providing theoretical insight into why Q-learning works despite inaccurate value estimates. They validate this equivalence empirically on the Atari benchmark and show a Q-learning method can closely match A3C's learning dynamics.