Summarizing books with human feedback

OpenAI Blog Papers

Summary

OpenAI presents a scalable alignment technique using hierarchical summarization of entire books with human feedback, demonstrating how models can be trained to act in accordance with human intentions on complex, difficult-to-evaluate tasks.

Scaling human oversight of AI systems for tasks that are difficult to evaluate.

Cached at: 04/20/26, 02:55 PM

# Summarizing books with human feedback

Source: [https://openai.com/index/summarizing-books/](https://openai.com/index/summarizing-books/)

To safely deploy powerful, general-purpose artificial intelligence in the future, we need to ensure that machine learning models act in accordance with human intentions. This challenge has become known as the *alignment problem*.

A scalable solution to the alignment problem needs to work on tasks where model outputs are difficult or time-consuming for humans to evaluate. To test scalable alignment techniques, we trained a model to summarize entire books, as shown in the following samples.[A](https://openai.com/index/summarizing-books/#citation-bottom-A) Our model works by first summarizing small sections of a book, then summarizing those summaries into a higher-level summary, and so on.
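The recursive procedure described above (summarize small sections, then summarize the summaries) can be illustrated with a minimal sketch. This is not the model or code from the article. The names `chunk`, `summarize_recursively`, and `first_words` are hypothetical, sections are split by a fixed character count purely for simplicity, and `first_words` is a toy stand-in for what would in practice be a learned summarization model.

```python
from typing import Callable, List


def chunk(text: str, size: int) -> List[str]:
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def first_words(passage: str, n: int = 5) -> str:
    """Toy stand-in for a summarizer: keep only the first n words.

    Assumption: a real system would call a trained model here.
    """
    return " ".join(passage.split()[:n])


def summarize_recursively(
    text: str,
    summarize: Callable[[str], str],
    chunk_size: int = 2000,
    max_depth: int = 20,
) -> str:
    """Summarize sections, then summarize the joined summaries, and so on."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    level = text
    for _ in range(max_depth):
        # Once the current level fits in one section, a single call finishes.
        if len(level) <= chunk_size:
            return summarize(level)
        # Otherwise summarize each section and move up one level.
        level = " ".join(summarize(part) for part in chunk(level, chunk_size))
    # Guard against a summarizer that never shortens its input.
    raise RuntimeError("summaries did not shrink within max_depth levels")


if __name__ == "__main__":
    book = "word " * 1000
    print(summarize_recursively(book, first_words, chunk_size=100))
```

Because each level only ever sees one section at a time, a person checking the output of any single `summarize` call needs to read just that section (or the summaries beneath it), not the whole book. That property is what makes the task decomposition relevant to evaluating hard-to-check outputs.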

Similar Articles

Learning to summarize with human feedback

OpenAI Blog

OpenAI demonstrates a technique for improving language model summarization by training a reward model on human preferences and fine-tuning models with reinforcement learning, achieving significant quality improvements that generalize across datasets. This work advances model alignment through human feedback at scale, with applications beyond summarization.

AI-written critiques help humans notice flaws

OpenAI Blog

OpenAI trained language models to write critiques of text summaries, helping human evaluators spot flaws more effectively — a step toward scalable oversight of AI systems on difficult tasks. The work explores how AI-assisted feedback can improve human evaluation quality as a proof of concept for alignment research.

Our approach to alignment research

OpenAI Blog

OpenAI outlines its alignment research approach, highlighting reinforcement learning from human feedback (RLHF) as its primary technique for aligning deployed language models like InstructGPT. The post reports that human raters prefer InstructGPT over a pretrained model 100x larger, even though the fine-tuning uses comparatively little compute. It also acknowledges current limitations and proposes a long-term strategy of using AI systems to accelerate alignment research beyond what humans can achieve alone.

Towards Human-Level Book-Writing Capability

arXiv cs.AI

This paper introduces a dataset and training framework that transforms human-authored novels into multi-resolution planning scaffolds, enabling long-context language models to generate book-scale fiction with more human-like prose and narrative dynamics.