code-reasoning

#code-reasoning

RASFT: Rollout-Adaptive Supervised Fine-Tuning for Reasoning

arXiv cs.LG · 5d ago

RASFT is a supervised fine-tuning framework for large language models that adapts expert supervision to the model's own reasoning capabilities. It outperforms standard SFT and reinforcement learning methods on mathematical and code reasoning benchmarks.

#code-reasoning

LongAttnComp: Cross-Family Context Compression for Long-Context Reasoning

Hugging Face Daily Papers · 2026-05-31

LongAttnComp adapts AttnComp to long-context reasoning by fine-tuning lightweight cross-attention layers and introducing token-level chunking, a top-p algorithm, positional reordering, and a query parser. It achieves strong performance on long-context tasks such as code debugging and transfers across multiple model families.

#code-reasoning

Improving LLM Code Reasoning via Semantic Equivalence Self-Play with Formal Verification

arXiv cs.CL · 2026-04-21

Researchers from the University of Edinburgh propose a self-play framework that uses Liquid Haskell for formal verification to train LLMs on semantic equivalence reasoning. They release the OpInstruct-HSx dataset (28k programs) and report a 13.3 pp accuracy gain on EquiBench.
