automated-experimentation


Self-Harness: Harnesses That Improve Themselves

Hacker News Top · 2d ago

Self-Harness introduces a paradigm in which LLM-based agents iteratively improve their own operating harness: they mine model-specific weaknesses, propose harness modifications, and validate those modifications through regression testing. It reports substantial performance gains on Terminal-Bench-2.0 across multiple base models.


Auto Research with Specialist Agents Develops Effective and Non-Trivial Training Recipes

Hugging Face Daily Papers · 2026-05-07

This paper introduces an auto-research framework in which specialist agents iteratively refine training recipes through an empirical loop of code execution and feedback. By leveraging lineage feedback, the system improves performance on tasks such as Parameter Golf and NanoChat without human intervention.
