world-knowledge

#world-knowledge

@jakevin7: An interesting thing. The DeepSeek V4 technical report conducted a comprehensive evaluation of all major LLMs, concluding that Gemini 3.1 Pro has the strongest world knowledge among all models. Not GPT, not Claude, but Gemini. But when people use Gemini...

X AI KOLs Following ↗ · yesterday Cached

According to the DeepSeek V4 technical report's evaluation of mainstream LLMs, Gemini 3.1 Pro has the strongest world knowledge, but users generally find it hard to use because the model does not proactively use search tools.

#world-knowledge

@dair_ai: How far are we from agents that can self-generate world knowledge? The work proposes an outcome-based reward that measu…

X AI KOLs Following ↗ · 2026-04-22 Cached

A new paper introduces an outcome-based reward that quantifies how much self-generated world knowledge boosts task success, enabling agents to improve without external guidance at inference time.

#world-knowledge

Training LLM Agents for Spontaneous, Reward-Free Self-Evolution via World Knowledge Exploration

Hugging Face Daily Papers ↗ · 2026-04-20 Cached

This paper proposes a method to train LLM agents with intrinsic meta-evolution capabilities, enabling spontaneous self-improvement without external rewards at inference time. Applied to Qwen3-30B and Seed-OSS-36B, the approach yields a 20% performance boost on web navigation benchmarks, with a 14B model outperforming Gemini-2.5-Flash.
