This paper presents RRB-Trees (Relaxed Radix Balanced Trees), a data structure for efficient immutable vectors that supports logarithmic-time concatenation and slicing.
This blog post introduces Magnitude-Direction (MD) Decoupling, a method that separates neural network weight matrices into direction and magnitude components, each optimized with its own learning rate. Experiments show improved performance with both the Adam and Muon optimizers, automatic learning rate transfer across model widths, and scaling benefits in large Mixture-of-Experts models.
EPFL researchers developed Synthegy, an AI framework that uses large language models to guide chemical retrosynthesis and reaction mechanism analysis through natural language instructions, significantly improving strategic planning for chemists.