@jacobli99: Studying gives us a second curve: expertise as a function of study compute. You could consider its weighted area a noti…

X AI KOLs Following 06/17/26, 05:13 PM Papers

continual-learning machine-studying expertise study-compute domain-adaptation research-problem

Summary

Introduces the concept of 'Machine Studying' as a problem of developing expertise from a corpus of documents, distinct from continual learning.


Original Article


Cached at: 06/18/26, 06:10 PM

Continual learning is widely discussed right now, but mostly as improving on the job or avoiding catastrophic forgetting. But it has a different, difficult, and already urgent form:

Given nothing but a corpus of documents, how should AI systems develop expertise in a new, unfamiliar domain? We call this problem Machine Studying.

Humans face this problem of learning new domains constantly, and one of our default answers is studying.

Before an exam, even an open-book one, we read the textbook or the literature, think out loud, quiz ourselves, and write our own notes. Much of the expertise comes from the active effort of reading and thinking itself.

In contrast, current agents mostly rely on inference compute to understand the corpus while they’re working. We’d love for AI systems to be able to study from nothing but a corpus of documents that exist naturally on some subject, and to do so as efficiently as a person who learns a programming library largely by reading the documentation and a few tutorials, with only a little bit of practice.

To compare procedures for machine studying, we start by defining expertise.

The corpus is always available at test time anyway, so a sufficiently intelligent non-expert agent could in principle always study during the exam.

What distinguishes an expert from a smart novice is a shift of the entire quality/cost curve: higher accuracy at the same budget, or the same accuracy at a smaller budget. We call the (appropriately weighted) area under this curve “expertise”, and the goal of studying is to raise expertise.
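As a rough illustration of this definition, here is a minimal sketch that scores an agent by the weighted area under its accuracy-versus-budget curve. The thread does not say how the area is weighted or measured, so everything here is an assumption: the function name `expertise`, the log-scaled budget axis, the uniform default weight, and the toy numbers are all hypothetical.

```python
import math
from typing import Callable, Sequence


def expertise(
    budgets: Sequence[float],
    accuracies: Sequence[float],
    weight: Callable[[float], float] = lambda budget: 1.0,
) -> float:
    """Weighted, normalized area under an accuracy-vs-budget curve.

    Assumptions (not from the source): budgets are positive and distinct,
    the budget axis is log-scaled, and each segment contributes its
    average weight times its average accuracy times its log-width.
    """
    if len(budgets) != len(accuracies) or len(budgets) < 2:
        raise ValueError("need at least two (budget, accuracy) points")
    if min(budgets) <= 0 or len(set(budgets)) != len(budgets):
        raise ValueError("budgets must be positive and distinct")

    points = sorted(zip(budgets, accuracies))
    area = 0.0
    total_weight = 0.0
    for (b0, a0), (b1, a1) in zip(points, points[1:]):
        width = math.log(b1) - math.log(b0)
        seg_weight = 0.5 * (weight(b0) + weight(b1))
        area += seg_weight * width * 0.5 * (a0 + a1)
        total_weight += seg_weight * width
    if total_weight <= 0:
        raise ValueError("weight must be positive somewhere on the curve")
    # Dividing by the total weight keeps the score on the accuracy scale.
    return area / total_weight


# Toy curves (made-up numbers): test-time budget vs. exam accuracy.
budgets = [1_000, 10_000, 100_000]
novice = [0.2, 0.4, 0.6]
expert = [0.5, 0.7, 0.8]

print(round(expertise(budgets, novice), 3))
print(round(expertise(budgets, expert), 3))
```

With these toy numbers the expert curve lies above the novice curve at every budget, so it gets the larger score. Passing a different `weight` function, for example one that emphasizes small budgets, changes how much cheap accuracy counts.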

Studying gives us a second curve: expertise as a function of study compute. You could consider its weighted area a notion of “intelligence”. An intelligent agent, for our purposes, is one that can acquire expertise in totally new domains really efficiently. (And by this token, it’s not obvious that even the most knowledgeable of current agents are very smart!)
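One way to write the two curves down, as a sketch only: the symbols below are our own notation, not the authors'. Let $q(c; s)$ be exam quality at test-time budget $c$ for an agent that has spent study compute $s$, and let $w$ and $v$ be unspecified non-negative weighting functions.

```latex
E(s) = \int w(c)\, q(c; s)\, \mathrm{d}c,
\qquad
I = \int v(s)\, E(s)\, \mathrm{d}s
```

Here $E(s)$ is the weighted area under the quality/cost curve (expertise) after studying with compute $s$, and $I$ is the weighted area under the expertise-versus-study-compute curve, which the thread suggests reading as a notion of “intelligence”.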

We instantiate this in StudyBench, a benchmark we’re building so that we and others can begin to ask questions about agents’ ability to study. StudyBench consists of three tasks, each built on a corpus that defines a domain of expertise, and each paired with a hidden exam. Below is an example coding question and rubric from our DSPy exam.

So how should an agent actually study? We test the simplest version of three natural bets out there: (1) self-supervised objectives, e.g. continual pre-training over the corpus; (2) synthesizing your own training data, e.g. turning the corpus into Q&A pairs to fine-tune on; and (3) amortized context management, e.g. having the agent write itself a cheatsheet.

Our preliminary finding is that none of these reliably turns exposure to the corpus into as much expertise as we’d like yet. Neither memorization of the contents of the corpus nor the ability to retrieve things stands in as a substitute for deeper expertise.

Read the blog for the details on all three paradigms, and what we think it’ll take to do better!


Similar Articles

@jacobli99: To compare procedures for machine studying, we start by defining expertise. The corpus is always available at test time…

@lateinteraction: putting the link here for those that want to jump right into the long form: https://jacobxli.com/blog/2026/machine-stud…

@jacobli99: Continual learning is widely discussed right now, but mostly as improving on the job or avoiding catastrophic forgettin…

@DSPyOSS: a crisper operationalization of continual learning that matches problems that are inaccurately treated as "RAG" or "RL"…

@jacobli99: If we are ever to build machines that can operate in new domains like experts, either we must reduce each domain to a s…
