Nonlinear computation in deep linear networks
Summary
OpenAI research explores how nonlinear computation can emerge in deep linear networks, presenting theoretical and empirical analysis with code examples using TensorFlow.
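The core observation behind the article is that floating-point arithmetic is itself only approximately linear, so a network built entirely from linear layers can still compute nonlinear functions. A minimal sketch of this idea (in plain Python rather than the article's TensorFlow, and using float64 rather than any value from the post) shows how a nonzero input is absorbed near the underflow boundary:

```python
# Floating-point addition is not exactly linear: a nonzero value can be
# absorbed entirely when added to a much larger one, so even an identity
# map computed in floating point is, strictly speaking, nonlinear.

tiny = 5e-324          # smallest positive subnormal float64
assert tiny > 0.0      # a genuine nonzero value

# Mathematically (tiny + 1) - 1 == tiny, but in float64 the tiny value
# is absorbed by the addition and the result is exactly 0.
result = (tiny + 1.0) - 1.0
print(result)          # 0.0 -- linearity fails near the underflow boundary
```

The same absorption effect exists in float32 at a larger threshold (around 1e-45), which is the regime the article's experiments exploit.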
Similar Articles
Understanding neural networks through sparse circuits
OpenAI researchers present methods for training sparse neural networks that are easier to interpret by forcing most weights to zero, enabling the discovery of small, disentangled circuits that can explain model behavior while maintaining performance. This work aims to advance mechanistic interpretability as a complement to post-hoc analysis of dense networks and support AI safety goals.
Techniques for training large neural networks
OpenAI presents comprehensive techniques for training large neural networks across distributed GPU clusters, covering data parallelism, pipeline parallelism, tensor parallelism, and mixture-of-experts approaches to overcome engineering and scalability challenges.
AI and compute
OpenAI releases an analysis demonstrating that the compute used in the largest AI training runs has grown exponentially with a 3.4-month doubling time since 2012, representing a 300,000x increase and vastly outpacing Moore's Law. The analysis suggests this trend will likely continue and calls for increased academic AI research funding to address rising computational costs.
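The two figures in the summary can be checked against each other with a quick back-of-the-envelope calculation: a 300,000x increase at a 3.4-month doubling time implies roughly 18 doublings over about 5 years, consistent with an analysis spanning 2012 onward.

```python
import math

doubling_months = 3.4      # stated doubling time
factor = 300_000           # stated overall increase

# Number of doublings implied by a 300,000x increase, and the
# calendar time those doublings would take at 3.4 months each.
doublings = math.log2(factor)
years = doublings * doubling_months / 12
print(round(doublings, 1), round(years, 1))  # 18.2 5.2
```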
AI and efficiency
OpenAI analyzes trends in AI algorithmic efficiency, showing that the compute required to reach AlexNet-level performance has halved roughly every 16 months since 2012, outpacing hardware gains. The study draws comparisons across domains like DNA sequencing and transistor density to contextualize AI progress.
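A 16-month halving time compounds quickly. A small sketch (the specific year spans below are illustrative, not taken from the study) shows the efficiency gain implied by that rate over a few horizons:

```python
# If the compute needed to reach a fixed benchmark halves every 16 months,
# the implied efficiency gain over a span of years is 2 ** (months / 16).
halving_months = 16

for years in (2, 4, 7):
    gain = 2 ** (years * 12 / halving_months)
    print(years, round(gain, 1))  # 2 2.8 / 4 8.0 / 7 38.1
```

At roughly seven years, the rate implies a gain of a few dozen times, i.e. well beyond what hardware improvements alone delivered over the same period.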
Trading inference-time compute for adversarial robustness
OpenAI presents evidence that reasoning models like o1 become more robust to adversarial attacks when given more inference-time compute to think longer. The research demonstrates that increased computation reduces attack success rates across multiple task types including mathematics, factuality, and adversarial images, though significant exceptions remain.