annealing

MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization

arXiv cs.AI · 2026-05-20

MOCHA is a multi-objective optimization method for LLM agent skills. It combines Chebyshev scalarization with exponential annealing to handle hard platform constraints and discover Pareto-optimal skill variants, and it reports significant improvements over existing optimizers.
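
For readers unfamiliar with the two terms in this summary, here is a minimal sketch of weighted Chebyshev scalarization and an exponential annealing schedule in their generic textbook forms. It is not MOCHA's implementation: the summary does not say how the paper combines the two, and the function names, the minimization convention, and the `ideal` reference point are assumptions made for illustration.

```python
from typing import Sequence


def chebyshev_scalarize(
    objectives: Sequence[float],
    weights: Sequence[float],
    ideal: Sequence[float],
) -> float:
    """Weighted Chebyshev scalarization: max_i w_i * |f_i - z_i|.

    Assumes every objective is minimized and `ideal` is a reference
    point z (hypothetical names, not taken from the paper).
    """
    if not (len(objectives) == len(weights) == len(ideal)):
        raise ValueError("objectives, weights and ideal must have equal length")
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))


def exponential_temperature(t0: float, decay: float, step: int) -> float:
    """Generic exponential annealing schedule: T_k = t0 * decay**k."""
    return t0 * decay ** step


# Two objectives, equal weights, ideal point at the origin.
score = chebyshev_scalarize([0.2, 0.5], [0.5, 0.5], [0.0, 0.0])
temperature = exponential_temperature(1.0, 0.5, 3)
```

Because the scalarized score is driven by the worst weighted deviation, varying the weights can reach points on non-convex parts of a Pareto front, which a plain weighted sum cannot.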

Hölder Policy Optimisation

Hugging Face Daily Papers · 2026-05-12

HölderPO introduces a generalized policy optimization framework that uses the Hölder mean for token-level probability aggregation in GRPO, with a dynamic annealing schedule to balance gradient concentration and variance. The method achieves state-of-the-art results on mathematical benchmarks (54.9% average, 7.2% relative gain over GRPO) and a 93.8% success rate on ALFWorld.
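
As background for the aggregation mentioned in this summary, the sketch below computes the generic Hölder (power) mean of a list of positive values. It illustrates only the mathematical definition: how HölderPO plugs the mean into GRPO and how its annealing schedule varies the exponent are not described here, so the function name and the example probabilities are assumptions.

```python
import math
from typing import Sequence


def holder_mean(values: Sequence[float], p: float) -> float:
    """Hölder (power) mean: ((1/n) * sum(x_i**p)) ** (1/p).

    The p = 0 case is defined by its limit, the geometric mean.
    Values must be strictly positive (e.g. token probabilities).
    """
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be a non-empty sequence of positive numbers")
    n = len(values)
    if p == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)


# Hypothetical token probabilities, for illustration only.
token_probs = [0.25, 1.0]
arithmetic = holder_mean(token_probs, 1)   # p = 1: arithmetic mean
geometric = holder_mean(token_probs, 0)    # p = 0: geometric mean
harmonic = holder_mean(token_probs, -1)    # p = -1: harmonic mean
```

The mean is non-decreasing in `p`: smaller exponents weight low values more heavily, larger exponents weight high values more heavily.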
