learning-rate-decay

I created an LLM post-training method called RPS. Preliminary results show that it improved Qwen3-8b's program synthesis reliability. [R]

Reddit r/MachineLearning · 2026-05-21

RPS is a two-stage LLM post-training method, inspired by neuroscience, that combines curriculum learning with learning rate decay. Preliminary results show improved program synthesis reliability on Qwen3-8b compared with equal-learning-rate training.
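
The summary gives no implementation details, so the following is only a generic sketch of what "learning rate decay across two training stages" could look like next to an equal-learning-rate baseline. It is not the RPS method itself. The function names, the base rate of 2e-5, the decay factor of 0.5, and the step counts are all hypothetical values chosen for illustration.

```python
def stage_learning_rate(stage: int, base_lr: float = 2e-5, decay: float = 0.5) -> float:
    """Learning rate for a 0-indexed training stage.

    Assumption: the rate shrinks geometrically from one stage to the next.
    The post does not say which decay rule RPS actually uses.
    """
    if stage < 0:
        raise ValueError("stage must be non-negative")
    return base_lr * decay ** stage


def staged_schedule(steps_per_stage, base_lr: float = 2e-5, decay: float = 0.5):
    """Return one learning rate per optimizer step, stage by stage."""
    schedule = []
    for stage, n_steps in enumerate(steps_per_stage):
        lr = stage_learning_rate(stage, base_lr, decay)
        schedule.extend([lr] * n_steps)
    return schedule


# Two stages (for example, an easier curriculum stage followed by a harder one).
decayed = staged_schedule([3, 2], base_lr=2e-5, decay=0.5)

# Baseline: decay=1.0 keeps the learning rate equal in both stages.
equal = staged_schedule([3, 2], base_lr=2e-5, decay=1.0)

print(decayed)
print(equal)
```

In a real training loop each value would be passed to the optimizer at the matching step. The only difference between the two schedules here is whether the second stage runs at a lower rate than the first.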
