Tag
KForge is a cross-platform framework that uses two collaborating LLM-based agents to automatically generate and optimize high-performance compute kernels for diverse AI accelerators, achieving significant speedups on NVIDIA B200 and Intel Arc B580 hardware.
This paper frames LLM-generated reward shaping for sparse structured RL as a debugging problem, identifying failure modes such as reward flooding and semantic misunderstanding. The authors propose diagnostic-driven iterative refinement, which yields large success-rate improvements over one-shot generation (e.g., DoorKey-8×8 rises from 2.3% to 97.6%).
This paper introduces the Iterative Refinement Neural Operator (IRNO), which augments pretrained neural operators with a learned refinement module applied via fixed-point iteration to mitigate spectral bias. IRNO progressively corrects high-frequency errors, achieving up to 56% improvement on turbulent flow and showing stable extrapolation beyond the trained iteration count.
Research Math Agents (RMA) is an agentic framework for automated reasoning on research-level mathematical problems, achieving state-of-the-art results on the First Proof benchmark by solving 8 out of 10 problems, outperforming strong baselines like GPT-5.2R and Aletheia.
This paper introduces a critique-and-routing controller for multi-agent LLM systems that formulates coordination as a sequential decision problem. It uses policy gradients to optimize the controller for iterative refinement, outperforming baselines while reducing reliance on top-tier models.
This paper introduces Attractor Models, which use fixed-point solving and implicit differentiation for efficient iterative refinement, achieving superior language modeling and reasoning performance with reduced computational costs compared to traditional transformers.
This paper proposes an epistemic state graph representation and an order-gap termination criterion for recursive reasoning systems, addressing how to manage evolving reasoning states and when to stop iteration.
The paper introduces WiCER, an iterative algorithm for compiling domain knowledge into LLM Wiki systems to minimize information loss and catastrophic failure rates during knowledge distillation. It demonstrates that this approach improves upon full-context KV cache inference by preserving critical facts better than blind compilation methods.
GPT-Image-2 can now review its own generated outputs and iteratively refine them until it judges them correct, though this process can take around 11 minutes per image.
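
Several of the entries above (IRNO, Attractor Models) describe refinement as fixed-point iteration: a correction is applied repeatedly to the current estimate until the updates become negligible. The sketch below illustrates only that general pattern on a scalar toy problem. It is not taken from any of the listed papers, and the names `refine`, `correction`, `tol`, and `max_iters` are hypothetical, chosen for illustration.

```python
import math
from typing import Callable, Tuple


def refine(
    x0: float,
    correction: Callable[[float], float],
    tol: float = 1e-10,
    max_iters: int = 1000,
) -> Tuple[float, int]:
    """Generic fixed-point refinement loop (illustrative, hypothetical API).

    Repeatedly applies x <- x + correction(x) and stops once the size of
    the update drops below `tol` or `max_iters` is reached.
    Returns the final estimate and the number of iterations used.
    """
    x = x0
    for k in range(1, max_iters + 1):
        delta = correction(x)  # proposed correction to the current estimate
        x = x + delta          # apply the correction
        if abs(delta) < tol:   # update is negligible: treat as converged
            return x, k
    return x, max_iters


# Toy problem (an assumption for illustration): find x with x = cos(x).
# The correction is the residual cos(x) - x, so each step sets x to cos(x).
x_star, iters = refine(1.0, lambda x: math.cos(x) - x)
print(x_star, iters)
```

In a learned setting the `correction` callable would stand in for a trained refinement module, and the stopping rule would typically be a norm of the update over a tensor rather than an absolute value of a scalar.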