@jakevin7: An interesting thing. The DeepSeek V4 technical report conducted a comprehensive evaluation of all major LLMs, concluding that Gemini 3.1 Pro has the strongest world knowledge among all models. Not GPT, not Claude, but Gemini. But when people use Gemini...

X AI KOLs Following News

Summary

According to the DeepSeek V4 technical report's evaluation of mainstream LLMs, Gemini 3.1 Pro is considered to have the strongest world knowledge, but users generally find it hard to use because the model does not proactively use search tools.

An interesting thing. The DeepSeek V4 technical report conducted a comprehensive evaluation of all major LLMs, concluding that Gemini 3.1 Pro has the strongest world knowledge among all models. Not GPT, not Claude, but Gemini. But the general feeling people have when using Gemini is: Is this thing any good? The problem isn't with the model itself, but that it's extremely reluctant to take action. If you ask it about the latest news, it has search tools, but it just doesn't use them proactively. Most of the time, you have to explicitly tell it to go search before it does. It's like a well-read person who, when asked what's been happening recently, shrugs and says: I haven't read today's newspaper. A model with the strongest world knowledge that doesn't bother to use its tools — that's the real reason Gemini feels awkward to use.

Cached at: 06/08/26, 05:14 AM

There’s an interesting observation.

The DeepSeek V4 technical report ran a comprehensive evaluation of all major models and concluded that Gemini 3.1 Pro has the strongest world knowledge of them all.

Not GPT, not Claude — it’s Gemini.

Yet the common reaction from people who actually use Gemini is: is this thing any good?

The issue isn’t the model itself; it’s that Gemini is extremely reluctant to take action.

If you ask it about the latest news, it has a search tool, but it just won’t use it proactively. Often you have to explicitly say “go search for it” before it bothers to look. It’s like asking a well-read person what’s been happening lately, only to have them shrug: “I haven’t read today’s newspaper.”

A model with the strongest world knowledge of any evaluated, yet one that won’t bother to use its tools: that’s the real reason Gemini feels so awkward to use.

Similar Articles

@wquguru: https://x.com/wquguru/status/2057852569054278045

X AI KOLs Timeline

A source code analysis and multi-model test of the pi-goal tool finds that DeepSeek V4 Pro is 31x cheaper than Gemini 3.5 Flash and produces higher-quality results on long-horizon tasks, and that a higher thinking mode actually increases hallucination.

@RookieRicardoR: Domestic models break through again, matching top models like Claude 4.6 and Gemini 3.1 Pro. Just tested Qwen3.7-Max, sharing some real thoughts. Last night I topped up as soon as the API went live and chose three tasks (see video) to test Qwen3.7-Max's frontend capabilities…

X AI KOLs Timeline

The user tested Qwen3.7-Max and believes it matches top models like Claude 4.6 and Gemini 3.1 Pro in frontend, computing power, and Agent capabilities. Its reasoning ability has improved significantly, and with a monthly iteration cadence it has become a first-tier domestic model.

Open source battle: GLM vs Kimi vs MiMo vs DeepSeek

Reddit r/LocalLLaMA

This article tests four open-source Chinese AI models (Zhipu GLM 5.1, Moonshot Kimi K2.6, Xiaomi MiMo 2.5 Pro, and DeepSeek V4 Pro) on programming tasks. It finds that GLM leads on most tasks but not all; each model has its own strengths and weaknesses.