deep-learning

Introducing Triton: Open-source GPU programming for neural networks

OpenAI Blog · 2021-07-28

OpenAI releases Triton 1.0, an open-source, Python-like GPU programming language that lets researchers with no CUDA experience write highly efficient GPU kernels. Kernels of as few as 25 lines can perform on par with expert-written CUDA code.

Alien Dreams: An Emerging Art Scene

ML at Berkeley · 2021-06-30

The article highlights the emerging scene of AI-generated art using OpenAI's CLIP model as a steering mechanism for generative models, showcasing various examples of text-to-image outputs.

Neural Module Networks for Visual Question Answering

ML at Berkeley · 2021-03-10

This article explains the Neural Module Networks (NMN) architecture from the paper 'Deep Compositional Question Answering with Neural Module Networks,' detailing how it handles the compositional structure of visual question answering tasks by decomposing questions into modular steps.

Generative language modeling for automated theorem proving

OpenAI Blog · 2020-09-07

OpenAI presents GPT-f, a transformer-based automated theorem prover for the Metamath formalization language, which discovered new short proofs accepted into the main Metamath library — marking the first time a deep-learning system contributed proofs adopted by a formal mathematics community.

OpenAI Scholars 2020: Final projects

OpenAI Blog · 2020-07-09

The OpenAI Scholars 2020 program concluded with final projects investigating how GPT-2 represents grammar, model interpretability, and medical applications such as epileptic seizure prediction. The program provides stipends and mentorship to people from groups underrepresented in machine learning.

PyTorch Distributed: Experiences on Accelerating Data Parallel Training

Papers with Code Trending · 2020-06-28

This paper details the design and optimization of PyTorch's distributed data parallel module, highlighting techniques like gradient bucketing and computation-communication overlap that enable near-linear scalability across 256 GPUs.

Jukebox

OpenAI Blog · 2020-04-30

OpenAI's Jukebox is a generative model that produces music as raw audio, including vocals and instruments, using a VQ-VAE for compression and hierarchical Sparse Transformer priors to handle long-range musical structure. It represents a significant step beyond symbolic music generation by operating directly in the raw audio domain.

OpenAI standardizes on PyTorch

OpenAI Blog · 2020-01-30

OpenAI announces that it is standardizing on PyTorch as its primary deep learning framework to improve research productivity and GPU performance at scale. As part of the move, it has released a PyTorch version of Spinning Up in Deep RL and plans to open-source PyTorch bindings for its blocksparse kernels.

Dota 2 with large scale deep reinforcement learning

OpenAI Blog · 2019-12-13

OpenAI Five became the first AI system to defeat Dota 2 world champions using large-scale deep reinforcement learning with self-play, demonstrating superhuman performance on a complex game with long time horizons and imperfect information.

Deep double descent

OpenAI Blog · 2019-12-05

OpenAI research shows that 'double descent', in which test error falls, rises, and then falls again, occurs as model size or training time increases, challenging the traditional understanding of the bias-variance tradeoff in deep learning.

Testing robustness against unforeseen adversaries

OpenAI Blog · 2019-08-22

OpenAI researchers developed a method to evaluate neural network robustness against unforeseen adversarial attacks, introducing a new metric called UAR (Unforeseen Attack Robustness) that assesses model performance against unanticipated distortion types beyond the commonly studied Lp norms.

Transfer of adversarial robustness between perturbation types

OpenAI Blog · 2019-05-03

Researchers study how adversarial robustness transfers across different perturbation types in deep neural networks, evaluating 32 attacks of 5 types on ImageNet models. Results show that robustness to one perturbation type doesn't always transfer to others and may sometimes hurt robustness elsewhere.

MuseNet

OpenAI Blog · 2019-04-25

OpenAI released MuseNet, a deep neural network built on the same general-purpose unsupervised transformer technology as GPT-2, which generates 4-minute musical compositions with 10 instruments by learning patterns from hundreds of thousands of MIDI files. The model can combine multiple musical styles and blend them in novel ways.

Generative modeling with sparse transformers

OpenAI Blog · 2019-04-23

OpenAI introduces the Sparse Transformer, a deep neural network that reduces the cost of the attention mechanism from O(N²) to O(N√N), enabling modeling of sequences 30x longer than previously possible across text, images, and audio. The model uses sparse attention patterns and checkpoint-based memory optimization to train networks up to 128 layers deep, achieving state-of-the-art performance across multiple domains.
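
To make the O(N²) versus O(N√N) claim concrete, the sketch below counts query-key pairs under full causal attention and under a simple strided pattern, in which each position attends to roughly its previous √N positions plus every √N-th earlier position. The pattern and the function names `dense_pairs` and `strided_pairs` are illustrative assumptions, not the exact factorized attention described in the article.

```python
import math


def dense_pairs(n: int) -> int:
    """Query-key pairs under full causal attention: position i sees 0..i."""
    return n * (n + 1) // 2


def strided_pairs(n: int) -> int:
    """Pairs under an illustrative strided pattern with stride ~ sqrt(n)."""
    stride = max(1, math.isqrt(n))
    total = 0
    for i in range(n):
        # Local window: the most recent `stride` positions, including i.
        local = set(range(max(0, i - stride + 1), i + 1))
        # Strided set: earlier positions j with (i - j) divisible by stride.
        strided = set(range(i % stride, i + 1, stride))
        total += len(local | strided)
    return total


if __name__ == "__main__":
    for n in (256, 1024, 4096):
        print(n, dense_pairs(n), strided_pairs(n))
```

Each position touches at most about 2√N keys under this pattern, so the total grows like N√N rather than N², and the gap widens as the sequence gets longer.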

How AI training scales

OpenAI Blog · 2018-12-14

OpenAI researchers discovered that the gradient noise scale, a simple statistical metric, predicts the parallelizability of neural network training across a wide range of tasks. They found that more complex tasks and more powerful models tolerate larger batch sizes, suggesting future AI systems can scale further through increased parallelization.

Quantifying generalization in reinforcement learning

OpenAI Blog · 2018-12-06

OpenAI trained 9 agents on the CoinRun environment with varying numbers of training levels to quantify generalization in reinforcement learning, finding substantial overfitting even with 16,000 training levels and that IMPALA-CNN architectures generalize significantly better than Nature-CNN baselines.

Glow: Better reversible generative models

OpenAI Blog · 2018-07-09

OpenAI introduces Glow, an improved reversible generative model that simplifies the RealNVP architecture by replacing fixed permutations with learned 1x1 convolutions, enabling better information flow and significant performance improvements.

OpenAI Five

OpenAI Blog · 2018-06-25

OpenAI Five is a reinforcement learning agent that masters Dota 2 through self-play training with curriculum learning and strategic randomization, progressing from random behavior to executing complex human-level strategies.

Requests for Research 2.0

OpenAI Blog · 2018-01-31

OpenAI releases 'Requests for Research 2.0,' a new batch of seven unsolved research problems encountered during their work, ranging from LSTM training exercises to distributed RL parameter averaging and transfer learning between Atari games. The initiative invites the broader research community to tackle these challenges.

Domain randomization and generative models for robotic grasping

OpenAI Blog · 2017-10-17

Researchers explore a data generation pipeline using domain randomization and procedurally generated objects to train a deep neural network for robotic grasp planning. The proposed autoregressive model achieves >90% success on unseen objects in simulation and 80% in the real world, despite being trained only on random simulated objects.
