Tag
SupraLabs released Supra-A2A-Nano-Exp, a small any-to-any autoregressive model that handles text and image tokens in a single Transformer. It is intended as an educational prototype rather than a production-ready system.
SimPersona learns discrete buyer personas from raw clickstreams using a VQ-VAE and maps them to persona tokens for LLM-based web agents, achieving high conversion-rate alignment across many live storefronts.
This paper addresses the issue of dimensional collapse in VQ-VAEs, showing that representations often occupy a low-dimensional subspace. It proposes an 'AE Warm-Up' strategy that trains the model as an unquantized autoencoder first, which improves reconstruction quality and increases effective latent dimensionality.
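The warm-up idea can be sketched as a switch on what the decoder receives: the raw encoder output during warm-up, and the nearest codebook vector afterwards. This is a minimal illustration, not the paper's implementation; `decoder_input`, `nearest_code`, `step`, and `warmup_steps` are hypothetical names, and the real schedule and losses may differ.

```python
from typing import List, Sequence


def nearest_code(z: Sequence[float], codebook: Sequence[Sequence[float]]) -> List[float]:
    """Return the codebook vector closest to z in squared L2 distance."""
    return list(min(codebook, key=lambda c: sum((a - b) ** 2 for a, b in zip(z, c))))


def decoder_input(
    z: Sequence[float],
    codebook: Sequence[Sequence[float]],
    step: int,
    warmup_steps: int,
) -> List[float]:
    """Choose the latent passed to the decoder at a given training step.

    Assumption: warm-up simply bypasses quantization for the first
    `warmup_steps` steps, so the model behaves as a plain autoencoder.
    """
    if step < warmup_steps:
        return list(z)  # unquantized autoencoder phase
    return nearest_code(z, codebook)  # regular VQ phase


codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
during_warmup = decoder_input([0.9, 1.2], codebook, step=0, warmup_steps=100)
after_warmup = decoder_input([0.9, 1.2], codebook, step=100, warmup_steps=100)
```

During warm-up the latent passes through unchanged; once `step` reaches `warmup_steps`, it snaps to the nearest code.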
An educational blog post explaining the Vector Quantized Variational Autoencoder (VQ-VAE) architecture, a key component of OpenAI's DALL-E image generation model.
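The core quantization step of a VQ-VAE replaces each encoder output vector with its nearest entry in a learned codebook and keeps the entry's index as a discrete token. A minimal sketch in plain Python, assuming squared L2 distance and a toy two-dimensional codebook (the function name `quantize` and the example values are illustrative, not taken from the blog post):

```python
from typing import List, Sequence, Tuple


def quantize(
    z: Sequence[float], codebook: Sequence[Sequence[float]]
) -> Tuple[int, List[float]]:
    """Return (index, vector) of the codebook entry nearest to z."""

    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # The index is the discrete token; the vector is what the decoder sees.
    index = min(range(len(codebook)), key=lambda k: sq_dist(z, codebook[k]))
    return index, list(codebook[index])


codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
index, z_q = quantize([0.9, 1.2], codebook)
```

Here the encoder output `[0.9, 1.2]` is closest to the second codebook entry, so it is represented by token `1` and decoded from `[1.0, 1.0]`. Training details such as the straight-through gradient and the commitment loss are omitted.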
OpenAI's Jukebox is a generative model that produces music as raw audio, including vocals and instruments, using a VQ-VAE for compression and hierarchical Sparse Transformer priors to handle long-range musical structure. It represents a significant step beyond symbolic music generation by operating directly in the raw audio domain.