Tag
This paper introduces attention-free latent memory and dynamic re-encoding to improve long-horizon predictions in Koopman autoencoders, reducing error accumulation on benchmark dynamical systems.
This paper provides a mathematical analysis of superposition in neural networks, deriving upper and lower bounds on L2 reconstruction loss for simple autoencoders with power activation functions, corroborating empirical findings by Elhage et al.
Physics-conforming Latent Twins is a framework for learning latent surrogate solution operators that enforce physical principles such as conservation laws and dissipative inequalities by design, using a constraint-transfer approach and structure-preserving latent dynamics.
This work introduces the Rational Sparse Autoencoder (RSAE), which replaces fixed encoder activations with trainable rational functions, improving the trade-off between reconstruction and sparsity on residual-stream activations of open-weight language models across multiple baseline families.
The author shares their work on reducing the cost of multi-vector retrieval by using k-means as top-1 sparse coding. Omar Khattab adds that late-interaction sparse retrieval with neuron-level inverted indexing on unsupervised sparse autoencoders works well.
This paper proposes Single-stage Sparse Retrieval (SSR), which replaces k-means clustering with sparse autoencoders and inverted indexing, achieving 15x faster indexing and half the retrieval latency while improving accuracy on the BEIR benchmark.
This article introduces Prior-Aligned Autoencoders (PAE), a new method for creating diffusion-friendly latent manifolds that achieves state-of-the-art image generation quality while enabling 13x faster training convergence.
An educational blog post explaining the Vector Quantized Variational Autoencoder (VQ-VAE) architecture, a key component of OpenAI's DALL-E image generation model.