Tag
EdgeBench reveals a new scaling law indicating that on-the-fly AI learning speed doubles every three months.
The paper argues that data-driven machine learning systems, including GPT-5, cannot achieve symbolic-level logical reasoning through scaling alone, due to inherent limitations in distinguishing logical structures from statistical regularities.
Sakana AI releases Fugu, a multi-agent orchestration system with only 0.6B parameters. By intelligently splitting tasks and coordinating multiple models, it achieves state-of-the-art performance while bypassing traditional parameter scaling. This marks the transition of multi-agent orchestration from a lab curiosity to a practical productivity tool.
A summary of the LatePost interview, reviewing Baidu US R&D's early strategic bets in AI, including investing in Cerebras, nearly investing in OpenAI and Anthropic, and the flow of talent from Baidu to these companies.
The article recounts Baidu Research US's investment in Cerebras, a wafer-scale chip company, a decade ago. It analyzes the shift in the AI chip market from training to inference and the importance of non-consensus investments.
This paper demonstrates that the weight norm causally controls the timescale of grokking in neural networks, reconciling conflicting accounts. Through interventions, it shows that grokking follows an exponential delay law and that norm magnitude dominates grokking time over learning rate across architectures.
Channel AI founder Luke Orthwine proposes a new software development methodology for the AI Agent era: replace the traditional chess-like, single-threaded, linear style of programming with a real-time strategy (RTS) game style built on high concurrency, macro-level scheduling, and saturation attacks.
This article explores the deep connections between physics and deep learning. It analyzes how phenomena such as scaling laws and emergence correspond to critical scaling laws and phase transitions in physics, and reviews the current status and prospects of applying the methods of physics to AI.
StreamMA introduces a streaming communication paradigm for multi-agent reasoning that pipelines intermediate results to reduce latency and improve effectiveness by leveraging more reliable early steps, outperforming baselines across benchmarks and revealing a step-level scaling law.
Huawei unveils the Tau Scaling Law, a chip architectural workaround to bypass US sanctions and achieve 1.4nm-equivalent transistor density by 2031, marking a significant step toward China's semiconductor self-sufficiency and altering the tech rivalry with Washington.
The article observes that the mathematics used in AI is mainly linear algebra, calculus, and other tools from before the 19th century, while observed phenomena such as scaling laws, emergent abilities, double descent, in-context learning, and representation geometry still lack a mathematical explanation. It draws an analogy to the "clouds" over physics in 1900 and suggests that these open problems may drive the development of 21st-century mathematics.
In the interview, Yao Shunyu offers a contrarian view: pre-training has not hit a wall and the Scaling Law has not reached its limit. He claims that most people who say it has hit a wall have bugs in their code.
After being laid off from Meta, Yuandong announced a new direction, raising $650 million to found neolab Recursive_SI with a valuation of $4.65 billion. In an interview, he shared insights on AI trends, LLM limitations, reinforcement learning, and research freedom.