heuristic-learning


@0xLogicrw: Former OpenAI post-training core member Jiayi Weng proposed a new reinforcement learning paradigm called "Heuristic Learning" in his personal capacity and open-sourced all experimental code. He used Codex (GPT-5.4) to repeatedly play the Atari game Breakout, but GPT-5.4 was never retrained...

X AI KOLs Timeline · yesterday

Former OpenAI researcher Jiayi Weng has proposed a paradigm he calls "Heuristic Learning", in which a large language model generates and iteratively revises Python code to solve reinforcement learning tasks. Because the learned knowledge is stored as interpretable code rather than in neural network parameters, the approach is said to avoid catastrophic forgetting. The post reports strong results on Atari and MuJoCo benchmarks, and the code has been open-sourced.
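
The generate-evaluate-revise loop described above can be illustrated with a small sketch. This is not the author's open-sourced code: the toy catch task, the `propose` function (a stand-in for a call to a language model), and the fixed `CANDIDATES` list are all hypothetical, chosen only so the example runs without any external dependency. The point it tries to show is that the "learned" artifact is a piece of Python source, and that a revision is kept only when it scores better.

```python
import random

def run_episode(act, seed, steps=50):
    """Toy 1-D catch task (an assumption, not Atari): +1 per step the paddle is within 1 of the ball."""
    rng = random.Random(seed)
    ball, paddle, total = rng.randint(0, 20), 10, 0
    for _ in range(steps):
        ball = min(20, max(0, ball + rng.choice([-1, 0, 1])))
        move = act({"ball": ball, "paddle": paddle})
        move = max(-1, min(1, int(move)))  # clamp the action to -1, 0, or +1
        paddle = min(20, max(0, paddle + move))
        total += 1 if abs(ball - paddle) <= 1 else 0
    return total

def load_policy(source):
    """Turn policy source code into a callable; the source must define act(obs)."""
    namespace = {}
    exec(source, namespace)
    return namespace["act"]

def evaluate(source, seeds=range(5)):
    """Average return of a policy over a fixed set of seeded episodes."""
    act = load_policy(source)
    return sum(run_episode(act, s) for s in seeds) / len(seeds)

# Hypothetical revisions a model might produce, hard-coded for illustration.
CANDIDATES = [
    "def act(obs):\n    return 0\n",
    "def act(obs):\n    return 1 if obs['ball'] > obs['paddle'] else -1\n",
    "def act(obs):\n    d = obs['ball'] - obs['paddle']\n"
    "    return 0 if d == 0 else (1 if d > 0 else -1)\n",
]

def propose(current_source, current_score, round_index):
    """Stand-in for an LLM call that would read the current code and its score."""
    return CANDIDATES[round_index % len(CANDIDATES)]

def heuristic_search(rounds=3):
    """Keep the best-scoring policy source seen so far."""
    best_source = "def act(obs):\n    return 0\n"
    best_score = evaluate(best_source)
    for i in range(rounds):
        candidate = propose(best_source, best_score, i)
        score = evaluate(candidate)
        if score > best_score:  # accept a revision only if it improves the score
            best_source, best_score = candidate, score
    return best_source, best_score

best_source, best_score = heuristic_search()
print(best_source)
print("average return:", best_score)
```

In a real setup, `propose` would send the current source and its evaluation results to a model and receive revised code back, and the model itself would not be retrained; here the accepted policy simply remains readable Python that can be inspected or edited by hand.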
