We introduce Glow, a reversible generative model which uses invertible 1x1 convolutions. It extends previous work on reversible generative models and simplifies the architecture. Our model can generate realistic high resolution images, supports efficient sampling, and discovers features that can be used to manipulate attributes of data. We’re releasing code for the model and an online visualization tool so people can explore and build on these results.
# Glow: Better reversible generative models
Source: [https://openai.com/index/glow/](https://openai.com/index/glow/)
Our main contribution, and our departure from the earlier RealNVP work, is the addition of a reversible 1x1 convolution, along with the removal of other components, simplifying the architecture overall.
The RealNVP architecture consists of sequences of two types of layers: layers with checkerboard masking, and layers with channel-wise masking. We remove the layers with checkerboard masking, simplifying the architecture. The layers with channel-wise masking perform the equivalent of a repetition of the following steps:
1. Permute the inputs by reversing their ordering across the channel dimension.
2. Split the input into two parts, A and B, down the middle of the feature dimension.
3. Feed A into a shallow convolutional neural network. Linearly transform B according to the output of the neural network.
4. Concatenate A and B.
By chaining these layers, A updates B, then B updates A, then A updates B, etc. This bipartite flow of information is clearly quite rigid. We found that model performance improves by changing the reverse permutation of step (1) to a (fixed) *shuffling* permutation.
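The four steps above can be sketched as a single reversible coupling step. This is a minimal NumPy sketch, not the released implementation: `shallow_net` is a hypothetical stand-in for the shallow convolutional network, and the linear transform is written as an affine (scale-and-shift) map on B.

```python
import numpy as np

def shallow_net(a):
    # Hypothetical stand-in for the shallow convolutional network.
    # It maps half A to a per-element log-scale and translation for half B;
    # tanh keeps the log-scales bounded for numerical stability.
    log_s = np.tanh(a)
    t = a
    return log_s, t

def coupling_step(x):
    """One channel-wise masked step: permute, split, transform, concat."""
    # 1. Permute the inputs by reversing channel order.
    x = x[::-1]
    # 2. Split into two halves, A and B, down the middle.
    a, b = np.split(x, 2)
    # 3. Feed A to the network; linearly transform B with its output.
    log_s, t = shallow_net(a)
    b = b * np.exp(log_s) + t
    # 4. Concatenate A and the transformed B.
    return np.concatenate([a, b])

def coupling_step_inverse(y):
    """Exact inverse: A passes through unchanged, so the network's
    outputs can be recomputed and the transform on B undone."""
    a, b = np.split(y, 2)
    log_s, t = shallow_net(a)
    b = (b - t) * np.exp(-log_s)
    # Undo the channel reversal from step (1).
    return np.concatenate([a, b])[::-1]
```

Because A is left untouched within each step, the whole step is invertible regardless of how complex `shallow_net` is, which is what lets these layers be chained into a reversible model.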
Taking this a step further, we can also *learn* the optimal permutation. Learning a permutation matrix is a discrete optimization that is not amenable to gradient ascent. But because the permutation operation is just a special case of a linear transformation with a square matrix, we can make this work with convolutional neural networks, as permuting the channels is equivalent to a 1x1 convolution operation with an equal number of input and output channels. So we replace the fixed permutation with learned 1x1 convolution operations. The weights of the 1x1 convolution are initialized as a random rotation matrix. As we show in the figure below, this operation leads to significant modeling improvements. We've also shown that the computations involved in optimizing the objective function can be done efficiently through an LU decomposition of the weights.
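The equivalence between a channel permutation and a 1x1 convolution can be made concrete: with C input and C output channels, a 1x1 convolution is just a C×C matrix multiply applied at every spatial position. The sketch below (an illustration, not the released code) initializes that matrix as a random rotation via QR decomposition and computes the log-determinant term of the flow objective with `np.linalg.slogdet`; in practice the paper keeps the weights in LU-decomposed form so this term stays cheap to compute.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(c):
    # Initialize the 1x1-convolution weights as a random rotation:
    # the Q factor of a Gaussian matrix is orthogonal (|det| = 1).
    q, _ = np.linalg.qr(rng.normal(size=(c, c)))
    return q

def conv1x1(x, w):
    """Invertible 1x1 convolution on an (H, W, C) tensor.

    Multiplying the channel vector at each pixel by W generalizes a
    fixed channel permutation (a 0/1 matrix) to a learned linear map.
    Returns the output and the log-determinant of the Jacobian,
    which is H * W * log|det W|.
    """
    h, wd, _ = x.shape
    _, logabsdet = np.linalg.slogdet(w)  # LU-based in Glow; direct here
    return x @ w.T, h * wd * logabsdet

def conv1x1_inverse(y, w):
    # Inverting the layer is just multiplying by W^{-1}.
    return y @ np.linalg.inv(w).T

c = 4
w = random_rotation(c)
x = rng.normal(size=(2, 2, c))
y, logdet = conv1x1(x, w)
```

Since the weights start as a rotation, `logdet` starts at zero, and gradient steps can then move W away from a pure permutation wherever that improves the objective.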