human-feedback

Tag · Cards List
#human-feedback

@oshaikh13: very cool idea @OpenAI I’m really excited about this research preview- learning from how people interact with their com…

X AI KOLs Following · 2026-04-20

An OpenAI research preview explores learning from how people interact with their computers beyond chat, accompanied by a new arXiv paper on the topic.


WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback

arXiv cs.CL · 2026-04-20 Cached

WildFeedback is a novel framework that leverages in-situ user feedback from actual LLM conversations to automatically create preference datasets for aligning language models with human preferences, addressing scalability and bias issues in traditional annotation-based alignment methods.


AI-written critiques help humans notice flaws

OpenAI Blog · 2022-06-13 Cached

OpenAI trained language models to write critiques of text summaries, helping human evaluators spot flaws more effectively — a step toward scalable oversight of AI systems on difficult tasks. The work explores how AI-assisted feedback can improve human evaluation quality as a proof of concept for alignment research.


Summarizing books with human feedback

OpenAI Blog · 2021-09-23 Cached

OpenAI presents a scalable alignment technique using hierarchical summarization of entire books with human feedback, demonstrating how models can be trained to act in accordance with human intentions on complex, difficult-to-evaluate tasks.


Learning to summarize with human feedback

OpenAI Blog · 2020-09-04 Cached

OpenAI demonstrates a technique for improving language model summarization by training a reward model on human preferences and fine-tuning models with reinforcement learning, achieving significant quality improvements that generalize across datasets. This work advances model alignment through human feedback at scale, with applications beyond summarization.
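The reward-modeling recipe these posts describe can be sketched with a toy model: fit a scorer on pairwise human preferences using a Bradley-Terry (logistic) loss, so the preferred sample in each pair gets the higher reward. The linear scorer, the feature vectors standing in for summaries, and the plain gradient-descent loop below are illustrative assumptions, not OpenAI's implementation, which trains a neural reward model and then fine-tunes the policy against it with reinforcement learning.

```python
import numpy as np

def preference_loss(w, chosen, rejected):
    """Bradley-Terry negative log-likelihood for a linear reward model.

    Each row of `chosen`/`rejected` is a feature vector (a stand-in for a
    summary); reward is r(x) = w . x, and the loss is -log sigmoid(r(chosen)
    - r(rejected)) averaged over pairs."""
    margin = chosen @ w - rejected @ w
    return float(np.mean(np.log1p(np.exp(-margin))))

def train_reward_model(pairs, dim, lr=0.1, steps=200):
    """Fit w by gradient descent on pairwise comparisons.

    pairs: list of (chosen_features, rejected_features) numpy vectors."""
    chosen = np.stack([c for c, _ in pairs])
    rejected = np.stack([r for _, r in pairs])
    w = np.zeros(dim)
    for _ in range(steps):
        margin = chosen @ w - rejected @ w
        # gradient of -log sigmoid(margin) w.r.t. w is
        # -sigmoid(-margin) * (chosen - rejected)
        coeff = -1.0 / (1.0 + np.exp(margin))
        grad = (coeff[:, None] * (chosen - rejected)).mean(axis=0)
        w -= lr * grad
    return w
```

Trained on synthetic pairs where the preferred item has a larger first feature, the learned weight on that feature comes out positive and the loss falls below the untrained log(2) baseline; in the full pipeline, this learned reward then serves as the optimization target for RL fine-tuning.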


Fine-tuning GPT-2 from human preferences

OpenAI Blog · 2019-09-19 Cached

OpenAI demonstrates fine-tuning GPT-2 (774M parameters) with human preference feedback on text continuation and summarization: stylistic continuation tasks required about 5,000 labels and summarization about 60,000, and the resulting models were preferred by humans 86–88% of the time, though they also learned to exploit labelers' heuristics rather than genuinely improving quality.


Learning complex goals with iterated amplification

OpenAI Blog · 2018-10-22 Cached

OpenAI presents iterated amplification, a method for training AI systems on complex tasks by recursively decomposing them into smaller subtasks that humans can judge and solve, building up training signals from scratch through iterative composition.


Gathering human feedback

OpenAI Blog · 2017-08-03 Cached

OpenAI releases RL-Teacher, an open-source tool for training AI systems through human feedback instead of hand-crafted reward functions, with applications to safe AI development and complex reinforcement learning problems.


Learning from human preferences

OpenAI Blog · 2017-06-13 Cached

OpenAI presents a method for training AI agents from human preference feedback: the agent infers a reward function from human comparisons of behavior trajectories and optimizes it with reinforcement learning. The approach is highly sample-efficient, requiring fewer than 1,000 bits of human feedback (each binary comparison conveys at most one bit) to train a simulated agent to perform a backflip.
