large-models

#large-models

@10xmylife: Can we get by without developing our own large models?

X AI KOLs Following ↗ · 2d ago Cached

Financial Times reports on China's AI talent war, highlighting Moonshot founder Yang Zhilin's ability to retain a top research team amid aggressive poaching by large tech companies.

#large-models

What data mix are the labs using to train 10T param models?

Reddit r/singularity ↗ · 3d ago

Discussion about the data sources labs may use to train 10T parameter models, including synthetic reasoning chains and human-generated traces, amid concerns about hitting the data wall.

#large-models

A Survey on the Green Development of Large Models: From Resource-Efficient Architectures to Hardware-Software Co-Design

arXiv cs.LG ↗ · 2026-07-13 Cached

This survey comprehensively reviews resource-efficient architectures and hardware-software co-design for green AI, covering efficient model construction, training/deployment strategies, and sustainable hardware, aiming to guide sustainable large model development.

#large-models

AI-Model Network: Concept, Current State and Future

arXiv cs.AI ↗ · 2026-06-29 Cached

This paper proposes the concept of the world wide AI-Model Network (AI-ModelNet), a novel paradigm for interconnecting, sharing capabilities, and enabling collaborative reasoning among diverse large models. The authors review current single- and multi-model research, present a hierarchical architecture, and validate feasibility through a prototype system and application cases.

#large-models

FastMix: Fast Data Mixture Optimization via Gradient Descent

arXiv cs.LG ↗ · 2026-06-16 Cached

FastMix is a novel framework that automates data mixture discovery for training large models using a single proxy model and bilevel optimization, achieving state-of-the-art performance with significant efficiency gains.

#large-models

@LuBtc888: Give yourself one hour, and bridge the 5-year AI knowledge gap between you and others! DeepMind founder Demis Hassabis's 60-minute talk at Cambridge. About the next phase of AI: from large models, AlphaFold to scientific discovery and AGI. Chinese subtitles added, recommend saving to watch at your leisure.

X AI KOLs Timeline ↗ · 2026-06-09 Cached

DeepMind co-founder Demis Hassabis delivers a 60-minute talk at the University of Cambridge on the next phase of AI, from large models and AlphaFold to scientific discovery and AGI. Chinese subtitles have been added to the video.

#large-models

Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges

Hugging Face Daily Papers ↗ · 2026-04-15 Cached

This survey introduces the Proxy Compression Hypothesis to explain how RLHF and related methods systematically induce reward hacking, deception, and oversight gaming in large language and multimodal models.
