@VincentLogic: If Ilya Is Right, the Three Strongest Consensuses in AI Over the Past Few Years Might All Be Wrong: Scaling Is No Longer the Universal Answer. High Benchmark Scores Don't Equal True Intelligence. RL Might Even Be Making Models 'Dumber'. This Interview, Called 'the Last Interview Before Ilya Disappeared'...

X AI KOLs Timeline 07/02/26, 08:54 AM News

scaling benchmark reinforcement-learning pre-training intelligence research-era

Summary

Ilya Sutskever suggested in an in-depth interview that the three core consensuses of the AI industry over the past few years could all be mistaken: Scaling is no longer a silver bullet, high benchmark scores do not equate to real intelligence, and RL may even be making models 'dumber'. He believes the dividends from pre-training and RL are nearly exhausted, that AI has re-entered an era of research, and that true superintelligence should have the strong learning ability of a gifted teenager rather than being a static repository of knowledge.

If Ilya's judgment is correct, then the three most solid consensuses in the AI industry over the past few years might all be wrong: Scaling is no longer the universal answer. High benchmark scores do not equal true intelligence. RL might even be making models 'dumber'.

In this conversation, dubbed 'the last interview before Ilya disappeared', he compared current large models to competitive programming contestants: they can solve very difficult problems, but when faced with real projects, they fix one bug and create two more, in a repeating cycle. The problem may not be that the models aren't large enough, but that the reward mechanism drives them to pursue the correct answer excessively, so that they gradually lose common sense, intuition, and learning ability.

His more radical judgment is that the dividend from pre-training has peaked and the RL dividend is also nearly exhausted: AI has already moved from the era of Scaling back to the 'era of research'. True superintelligence will not be a finished product that is downloaded and knows everything. Instead, it will be more like a fifteen-year-old genius: possessing extremely strong learning ability and then continuously growing in the real world.

If this direction holds, the key to the next round of AI competition will no longer be who piles up more data and GPUs, but who first cracks how humans can learn a new thing just by seeing a few examples.

Ilya rarely wastes words, yet this time he talked for over forty minutes. It's worth watching the whole thing. Which part most challenges your understanding?

Original Article

Cached at: 07/02/26, 02:24 PM

If Ilya’s judgment is correct, then the three most solid consensuses in the AI industry over the past few years may all be wrong:

Scaling is no longer the universal answer.
High benchmark scores do not equal real intelligence.
RL may even be making models dumber.

In this conversation — described as “the last interview before Ilya disappeared” — he likened current large models to competitive programming contestants: they can solve very hard problems, but when faced with a real-world project, they fix one bug and create two new ones, in an endless loop.

The problem may not be that the models are too small, but that the reward mechanism drives them to over‑pursue correct answers, gradually losing common sense, intuition, and learning ability.

His more radical conclusion:

The pre‑training dividend has peaked, the RL dividend is nearly exhausted, and AI has already regressed from the Scaling era back into a “research era.”

True superintelligence will not be a finished product that you download and already knows everything. Instead, it will be more like a fifteen‑year‑old genius: possessing extremely strong learning ability, and then continuously growing in the real world.

If this direction holds, the next round of AI competition will no longer be about who piles up more data and GPUs, but about who first cracks why humans can learn something new after seeing just a few examples.

Ilya rarely wastes words, yet this time he talked for more than forty minutes.

It’s worth watching the whole thing. Which part overturns your understanding the most?

Similar Articles

@ba_niu80557: https://x.com/ba_niu80557/status/2068751230667755859

X AI KOLs Timeline

The article explores how increasingly powerful AI models eliminate those whose skills can be encoded into prompts, emphasizing that the truly irreplaceable value lies in tacit knowledge, physical-world operations, and interpersonal trust. Through the example of a friend transitioning from a consultant to a hardware integrator, the author illustrates how proactively yielding to AI-replaceable tasks while deepening expertise in areas beyond AI's reach is key to surviving and thriving in the technological wave.

@runes_leo: At Sequoia Ascent on 4/30, Karpathy compressed this year’s most valuable explanation of AI into three core arguments. You’ll see AI differently after reading this. 1. AI Isn’t Just “Faster,” It’s a New Paradigm For the past two years, the narrative has been that AI speeds things up. Karpathy says this is a misunderstanding...

X AI KOLs Timeline

This article summarizes Karpathy’s core points at the Sequoia Ascent conference, highlighting that AI is a paradigm shift restructuring workflows rather than merely an acceleration tool. It introduces the concept of a "jagged edge" for model capabilities based on verifiability and economic viability, and predicts that future software will evolve into an agent-native architecture where LLMs serve as the logic layer and traditional code functions as sensors and actuators.

@jakevin7: Let me make a prediction: The next phase of the AI era will become "Infra is all you need". AI-generated code is already very powerful, but it's still far from adequate in terms of usability and stability. Recently, OpenAI's subscription system had a huge bug, and the membership system completely broke down. The system…

X AI KOLs Following

The author predicts that the next phase of the AI era will shift from model capabilities to infrastructure capabilities, emphasizing infra abilities such as reproducibility, observability, recoverability, and security isolation, believing that stably carrying AI behavior will be the key to competition.

@vista8: https://x.com/vista8/status/2072191315916538039

X AI KOLs Timeline

Starting with the story of Galois group theory, the article delves into the boundaries of AI's capabilities in mathematics, distinguishing between two types of progress: "connecting lightning" (cross-domain connections) and "building mountains" (creating new frameworks). It analyzes the limitations of the RLVR training method and introduces the concept of "grindability" to explain AI's rapid advancements in mathematics and coding.

@Phoenixyin13: Finished reading a long post today by OpenAI researcher Noam Brown — a reality severely underestimated by the industry. The true ceiling of LLM capabilities is far higher than what any current benchmark shows. The reason: too little test-time compute. And as models...

X AI KOLs Timeline

Highlights OpenAI researcher Noam Brown's argument: the true ceiling of LLM capabilities is far higher than current benchmarks show, due to insufficient test-time compute, and stronger models benefit more from additional computation. This poses a serious challenge for AI safety evaluation, as many dangerous capabilities may only emerge under long time and high compute budgets.
