OpenPipe's ART framework responds to Karpathy's critique of reward functions in RL with RULER, which lets a reward be defined in natural language and evaluated by an LLM instead of being engineered by hand.
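To make the idea concrete, here is a minimal sketch of a reward defined in natural language and scored by an LLM judge. It is not ART's or RULER's actual API: the names `build_judge_prompt`, `llm_rewards`, and `stub_judge`, the prompt wording, the JSON reply format, and the 0 to 1 score range are all assumptions made for illustration. The judge is passed in as a plain callable so the example runs without a model; `stub_judge` is a hypothetical keyword matcher standing in for a real LLM call.

```python
import json
from typing import Callable, List


def build_judge_prompt(rubric: str, responses: List[str]) -> str:
    """Format the natural-language rubric and candidates into one judge prompt."""
    numbered = "\n".join(f"[{i}] {r}" for i, r in enumerate(responses))
    return (
        "Score each candidate response from 0 to 1 against the rubric.\n"
        f"Rubric: {rubric}\n"
        f"Candidates:\n{numbered}\n"
        'Reply with JSON only: {"scores": [<float>, ...]}'
    )


def llm_rewards(
    rubric: str, responses: List[str], judge: Callable[[str], str]
) -> List[float]:
    """Return one reward per response, as assigned by the judge.

    `judge` is any function mapping a prompt string to the judge's text reply
    (assumed here to be JSON with a "scores" list).
    """
    reply = judge(build_judge_prompt(rubric, responses))
    scores = json.loads(reply)["scores"]
    if len(scores) != len(responses):
        raise ValueError("judge returned a different number of scores than responses")
    # Clamp to [0, 1] so a misbehaving judge cannot produce out-of-range rewards.
    return [min(1.0, max(0.0, float(s))) for s in scores]


def stub_judge(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: favors candidates that mention a refund."""
    candidates = [line for line in prompt.splitlines() if line.startswith("[")]
    scores = [1.0 if "refund" in line.lower() else 0.2 for line in candidates]
    return json.dumps({"scores": scores})


if __name__ == "__main__":
    rubric = "The reply should offer the customer a refund."
    responses = ["We will issue a refund today.", "Please try again later."]
    print(llm_rewards(rubric, responses, stub_judge))
```

In this sketch the only task-specific input is the rubric string; swapping `stub_judge` for a function that calls a real model would leave the rest unchanged, which is the sense in which a natural-language reward replaces a hand-written reward function.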