PixelCNN++: Improving the PixelCNN with discretized logistic mixture likelihood and other modifications
Summary
PixelCNN++ introduces several modifications to PixelCNN, including a discretized logistic mixture likelihood, downsampling, and shortcut connections, achieving state-of-the-art log-likelihood results on CIFAR-10.
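The discretized logistic mixture likelihood named in the summary models each integer pixel value by integrating a mixture of continuous logistic densities over the unit-width bin around that value, with the edge bins absorbing the tails. A minimal NumPy sketch of that probability mass function (the function name and parameter layout here are illustrative, not from the paper's code):

```python
import numpy as np

def sigmoid(z):
    # Logistic CDF: sigma(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def discretized_logistic_mixture_pmf(x, pi, mu, s):
    """Probability of an integer pixel value x in {0, ..., 255} under a
    mixture of discretized logistic distributions.

    pi: mixture weights (sum to 1); mu: component means; s: component
    scales (all NumPy arrays of the same length). The continuous logistic
    CDF is integrated over the bin [x - 0.5, x + 0.5]; the edge bins at
    0 and 255 absorb the remaining tails so the PMF sums to 1.
    """
    cdf_upper = np.where(x == 255, 1.0, sigmoid((x + 0.5 - mu) / s))
    cdf_lower = np.where(x == 0, 0.0, sigmoid((x - 0.5 - mu) / s))
    return float(np.sum(pi * (cdf_upper - cdf_lower)))
```

Because the bin probabilities partition the real line, summing the PMF over all 256 pixel values returns 1 for any valid mixture parameters.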
Similar Articles
Improved Techniques for Training Consistency Models
OpenAI presents improved techniques for training consistency models that enable high-quality single-step image generation without distillation, achieving significant FID improvements on CIFAR-10 and ImageNet 64×64 through novel loss functions and training strategies.
Representations Before Pixels: Semantics-Guided Hierarchical Video Prediction
Re2Pix is a hierarchical video prediction framework that improves future video generation by first predicting semantic representations using frozen vision foundation models, then conditioning a latent diffusion model on these predictions to generate photorealistic frames. The approach addresses train-test mismatches through nested dropout and mixed supervision strategies, achieving improved temporal semantic consistency and perceptual quality on autonomous driving benchmarks.
An Optimal Transport-driven Approach for Cultivating Latent Space in Online Incremental Learning
This paper introduces MMOT, an online mixture model learning framework based on optimal transport theory that addresses incremental learning with distributional shifts through dynamic centroid updates and improved class similarity estimation. The approach includes a Dynamic Preservation strategy to mitigate catastrophic forgetting and maintain class separability in latent space.
Faster LLM Inference via Sequential Monte Carlo
This paper proposes Sequential Monte Carlo Speculative Decoding (SMC-SD), a method that accelerates LLM inference by replacing token-level rejection in speculative decoding with importance-weighted resampling over draft particles, achieving a 2.36× speedup over standard speculative decoding and 5.2× over autoregressive decoding while keeping accuracy loss within 3%.
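The importance-weighted resampling the summary mentions is the standard Sequential Monte Carlo primitive: reweight each draft particle by its target/draft likelihood ratio, then resample the population in proportion to those weights. A toy sketch of that generic step (the function and its inputs are illustrative assumptions, not SMC-SD's actual interface):

```python
import numpy as np

def resample_particles(particles, log_target, log_draft, rng):
    """One importance-weighted resampling step over draft particles.

    particles: list of candidate continuations (any objects)
    log_target: log-probability of each particle under the target LLM
    log_draft:  log-probability of each particle under the draft model
    rng: a numpy Generator, e.g. np.random.default_rng(0)
    Returns a new population drawn in proportion to the importance
    weights w_i proportional to exp(log_target_i - log_draft_i).
    """
    log_w = log_target - log_draft       # importance log-weights
    log_w = log_w - log_w.max()          # subtract max for stability
    w = np.exp(log_w)
    w = w / w.sum()                      # normalize to a distribution
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return [particles[i] for i in idx]
```

Particles the target model scores highly relative to the draft are duplicated, while poorly scored ones die out, instead of being individually accepted or rejected as in standard speculative decoding.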
Log-Likelihood, Simpson's Paradox, and the Detection of Machine-Generated Text
This paper addresses the degradation of likelihood-based machine-generated text detectors by identifying a Simpson's paradox in token-score aggregation. It proposes a learned local calibration step that significantly improves detection performance across various models and datasets.