llm-programming

#llm-programming

@0xLogicrw: Former OpenAI post-training core member Jiayi Weng proposed a new reinforcement learning paradigm called "Heuristic Learning" in his personal capacity and open-sourced all experimental code. He used Codex (GPT-5.4) to repeatedly play the Atari game Breakout, but GPT-5.4 was never retrained...

X AI KOLs Timeline ↗ · 2026-05-08

Former OpenAI researcher Jiayi Weng proposed "Heuristic Learning", a paradigm in which a large language model generates and iteratively revises Python code to solve reinforcement learning tasks. Because the learned knowledge is stored in interpretable code rather than in neural network parameters, the approach is presented as avoiding catastrophic forgetting. It is reported to perform well on Atari and MuJoCo benchmarks, and the code has been open-sourced.
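To make the generate-and-revise idea concrete, here is a minimal sketch of such a loop. It is an illustration under stated assumptions, not Weng's released code. The environment is a toy bouncing-ball task rather than Atari or MuJoCo, and `propose_revision` is a hypothetical stand-in that returns canned source strings where a real setup would call an LLM. All function names are invented for this example.

```python
# Toy stand-in for an RL task: a ball bounces between 0 and 10 on a line and
# the policy earns 1 reward per step when the paddle sits under the ball.
# This is NOT Atari or MuJoCo; it only makes the loop runnable.
def run_episode(policy, steps=50):
    ball, direction, paddle, reward = 5, 1, 0, 0
    for _ in range(steps):
        ball += direction
        if ball in (0, 10):                 # bounce off either wall
            direction = -direction
        action = policy(ball, paddle)       # expected to be -1, 0, or +1
        paddle = max(0, min(10, paddle + action))
        reward += int(paddle == ball)
    return reward


# Hypothetical candidate programs. In the described paradigm an LLM would
# write and revise these; here they are hard-coded for illustration.
CANDIDATES = [
    "def policy(ball, paddle):\n    return 0\n",
    "def policy(ball, paddle):\n    return 1\n",
    "def policy(ball, paddle):\n    return (ball > paddle) - (ball < paddle)\n",
]


def propose_revision(iteration, best_source, best_score):
    # Assumption: a real implementation would prompt the model with the
    # current best code and its score. This stub ignores both.
    return CANDIDATES[iteration % len(CANDIDATES)]


def compile_policy(source):
    # Turn generated source text into a callable policy function.
    namespace = {}
    exec(source, namespace)
    return namespace["policy"]


def heuristic_learning_loop(iterations=3):
    # Keep the best-scoring program; the "knowledge" is the source text itself.
    best_source, best_score = None, float("-inf")
    for i in range(iterations):
        source = propose_revision(i, best_source, best_score)
        score = run_episode(compile_policy(source))
        if score > best_score:
            best_source, best_score = source, score
    return best_source, best_score


best_source, best_score = heuristic_learning_loop()
print(best_source)
print(best_score)
```

The retained artifact is readable Python source, so an earlier policy can be inspected, kept, or reused directly. No model weights are updated anywhere in the loop.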


#llm-programming

@DSPyOSS: indeed it's all just signatures (specs), modules ("harnesses", "inference scaling"), and optimizers (learning algorithm…

X AI KOLs Following ↗ · 2026-04-20

A post reflecting on the DSPy framework's architecture built around signatures, modules, and optimizers, and noting its continued growth since 2022.

