@shawntenam: GEPA (http://github.com/gepa-ai/gepa) bumped Haiku 4.5 from 65% to 85% pass rate by auto-optimizing my prompt instructi…

X AI KOLs Timeline 04/20/26, 07:28 AM Tools

Summary

GEPA is an open-source tool that automatically optimizes prompt instructions using execution traces and scores, raising Claude Haiku 4.5's pass rate from 65% to 85% without requiring a model swap.

GEPA (http://github.com/gepa-ai/gepa) bumped Haiku 4.5 from 65% to 85% pass rate by auto-optimizing my prompt instructions with execution traces and scores. It's like targeted prompt tuning for CLAUDE.md files. No model swap needed. Builders, run it on yours.
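
The idea in the post (score a prompt, look at the traces of what failed, rewrite, keep the winner) can be sketched as a small loop. This is a rough illustration only: it is not GEPA's API or its actual search strategy, and every name here (`Trace`, `evaluate`, `optimize`, `toy_run`, `toy_propose`) is hypothetical. The toy "model" passes a task when the prompt mentions the task's keyword; in a real setup `run` would call the model and `propose` would ask an LLM to rewrite the instructions from the failed traces.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Trace:
    """One execution record: which task ran and whether it passed."""
    task: str
    passed: bool


def evaluate(prompt: str, tasks: List[str],
             run: Callable[[str, str], Trace]) -> Tuple[float, List[Trace]]:
    """Run every task with the prompt; return the pass rate and the traces."""
    traces = [run(prompt, task) for task in tasks]
    return sum(t.passed for t in traces) / len(traces), traces


def optimize(prompt: str, tasks: List[str],
             run: Callable[[str, str], Trace],
             propose: Callable[[str, List[Trace]], str],
             rounds: int = 5) -> Tuple[str, float]:
    """Greedy loop: rewrite from failed traces, keep a candidate only if it scores higher."""
    best_prompt = prompt
    best_score, traces = evaluate(best_prompt, tasks, run)
    for _ in range(rounds):
        failures = [t for t in traces if not t.passed]
        candidate = propose(best_prompt, failures)
        score, candidate_traces = evaluate(candidate, tasks, run)
        if score > best_score:
            best_prompt, best_score, traces = candidate, score, candidate_traces
    return best_prompt, best_score


# Toy stand-ins (assumptions for illustration, not a real model or optimizer).
def toy_run(prompt: str, task: str) -> Trace:
    # The "model" succeeds only if the prompt mentions the task keyword.
    return Trace(task=task, passed=task in prompt)


def toy_propose(prompt: str, failures: List[Trace]) -> str:
    # Address the first failure by appending one instruction.
    if not failures:
        return prompt
    return f"{prompt} Always handle {failures[0].task}."


tasks = ["json", "units", "cite"]
best_prompt, best_score = optimize("Answer the question.", tasks, toy_run, toy_propose)
print(best_score, best_prompt)
```

The model stays fixed throughout; only the instruction text changes, which is the "no model swap" point the post makes.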

Similar Articles

@kapicode: I've been using Claude as the "human" prompting @opencode to rebuild reference projects, evaluating four LLMs on the sa…

X AI KOLs Following

An evaluation of four LLMs (including Qwen, MiniMax, and GLM), using Claude as the prompter for the Opencode agent tool, finds that a smaller local model (Qwen 27B on a 3090) outperforms a larger pruned model in coding quality and reliability.

@learnwithella: Self-improving Claude Code skills are f*cking ridiculous One loop → 10 test runs, scored against an eval, prompt rewrit…

X AI KOLs Timeline

Claude Code can auto-iterate prompts by running evals, rewriting, and keeping winners, boosting a hook-writer skill from 32/50 to 47/50 overnight.

@dhh: I've been driving GPT5.5 on low reasoning for the last week+ and it's very good, very efficient. Haven't been tempted t…

X AI KOLs Following

DHH praises the performance and efficiency of GPT-5.5 on low reasoning settings, noting it surpasses Opus and Kimi.

@HenryL_AI: Big update: @gepa_ai has now been officially integrated into A-Evolve (by community member)! We added GEPA as a new plu…

X AI KOLs Timeline

A community member integrated the GEPA evolution algorithm into A-Evolve as a plug-and-play component, letting any agent use GEPA with zero setup.

Dynamically allocating compute budget to hard set of problems and evolving the sections with Qwen-35B-A3B gets you near GPT-5.4-xHigh on HLE

Reddit r/LocalLLaMA

A method that dynamically allocates compute budget to hard problems using Qwen-35B-A3B achieves performance near GPT-5.4-xHigh on the HLE benchmark.