low-rank

#low-rank

VarRate: Training-Free Variable-Rate KV Cache Compression for Long-Context LLMs

arXiv cs.CL ↗ · 15h ago Cached

Introduces VarRate, a training-free method for KV cache compression that allocates variable low-rank budget per token based on query salience, avoiding irreversible token eviction and outperforming uniform-rank methods at matched memory budgets on LongBench.

#low-rank

Federated Low-Rank Koopman Learning for Multivariate Time-Series Anomaly Detection in IoT Systems

arXiv cs.LG ↗ · 2026-07-13 Cached

Proposes FedKAD, a federated Koopman anomaly detection framework for multivariate time series in IoT systems, using lightweight sliding-window Koopman representations and a Stiefel-ADMM algorithm for efficient communication and inference.

#low-rank

@VukRosic99: Most KV-cache compression applies SVD to the keys alone, or embeds queries and keys jointly. Both miss the obvious targ…

X AI KOLs Timeline ↗ · 2026-07-10 Cached

KQ-SVD is a new method for KV-cache compression that directly approximates the attention matrix via optimal low-rank decomposition, achieving 5-10x lower error than key-only SVD on LLaMA and Mistral models.
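The contrast between key-only SVD and approximating the query-key product directly can be sketched as follows. This is a minimal illustration, not the KQ-SVD algorithm itself: it assumes the target is the pre-softmax score matrix `Q @ K.T`, and the sizes, random data, and the helper `truncate` are hypothetical. By the Eckart-Young theorem, truncating the SVD of the product gives the best rank-`r` approximation in Frobenius norm, so its error can never exceed that of truncating the keys alone.

```python
import numpy as np

def truncate(M, r):
    """Best rank-r approximation of M in Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

rng = np.random.default_rng(0)
n, d, r = 64, 16, 4                      # hypothetical: tokens, head dim, rank
Q = rng.standard_normal((n, d))          # stand-in queries
K = rng.standard_normal((n, d))          # stand-in keys
scores = Q @ K.T                         # pre-softmax attention scores

# Key-only: compress K to rank r, then form the scores.
err_key_only = np.linalg.norm(scores - Q @ truncate(K, r).T)

# Joint: compress the score matrix itself to rank r.
err_joint = np.linalg.norm(scores - truncate(scores, r))

print(f"key-only error: {err_key_only:.3f}, joint error: {err_joint:.3f}")
```

On random data the gap is modest; the size of the gap reported in the summary above depends on the structure of real model activations, which this sketch does not reproduce.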

#low-rank

LACE-SVD: Loss-Aware SVD with Cumulative Error Correction for LLM Compression

arXiv cs.LG ↗ · 2026-07-07 Cached

LACE-SVD is a novel low-rank compression method for large language models that uses a loss-aware rank allocation strategy and a propagation-aware correction technique to mitigate cumulative error propagation in the residual stream, achieving better perplexity than prior SVD-based methods at high compression ratios.

#low-rank

Training transformers where every layer W = V·Uᵀ from initialization reveals a corpus-determined optimal rank - looking for arXiv endorser (cs.LG) [D]

Reddit r/MachineLearning ↗ · 2026-07-03

This paper proposes Native Factorized Weights for transformers, where every linear layer is trained as a product of two low-rank matrices from initialization. Experiments show a corpus-determined optimal rank that minimizes validation loss and a generalization band, outperforming dense baselines with fewer parameters.
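A layer of the form W = V·Uᵀ can be applied without ever materializing W, which is where the parameter savings come from. The sketch below assumes shapes `U: (d_in, r)` and `V: (d_out, r)`; the dimensions, initialization scales, and the function name `factorized_linear` are illustrative choices, not taken from the post.

```python
import numpy as np

d_in, d_out, r = 512, 512, 32            # hypothetical layer sizes and rank
rng = np.random.default_rng(0)
U = rng.standard_normal((d_in, r)) / np.sqrt(d_in)   # assumed init scale
V = rng.standard_normal((d_out, r)) / np.sqrt(r)     # assumed init scale

def factorized_linear(x, V, U):
    """Compute x @ W.T with W = V @ U.T, going through the rank-r bottleneck."""
    return (x @ U) @ V.T                 # (batch, d_in) -> (batch, r) -> (batch, d_out)

dense_params = d_in * d_out              # parameters of an unfactored layer
factored_params = r * (d_in + d_out)     # parameters of U and V together

x = rng.standard_normal((4, d_in))
y = factorized_linear(x, V, U)
print(y.shape, dense_params, factored_params)
```

With these sizes the factored layer stores 32,768 parameters against 262,144 for the dense one. The factorization saves parameters only while `r < d_in * d_out / (d_in + d_out)`.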

#low-rank

DLR: Zero-Inference-Cost Latent Residuals for Low-Rank Pre-Training

arXiv cs.LG ↗ · 2026-06-30 Cached

Introduces Duplicated Latent Residual (DLR), a training-only, parameter-free plug-in for low-rank pre-training that improves perplexity across LLaMA models from 60M to 7B parameters, and can be folded into the model after training with zero inference cost.

#low-rank

Low-rank Distributional Matrix Completion

arXiv cs.LG ↗ · 2026-06-04 Cached

This paper introduces a distributional generalization of matrix completion where each entry is a probability distribution rather than a scalar, using kernel mean embeddings and Tucker rank to capture low-rank structure. The authors propose a novel estimator with non-asymptotic error bounds and demonstrate effectiveness on synthetic and real-world data.

#low-rank

Gradient-Free Training of Spiking Neural Networks via Low-Rank Evolution Strategies

arXiv cs.AI ↗ · 2026-06-01 Cached

Introduces Eggroll, a low-rank evolution strategy for gradient-free training of spiking neural networks, reducing memory and time overhead while achieving competitive accuracy on N-MNIST.

#low-rank

VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion

Hugging Face Daily Papers ↗ · 2026-05-28 Cached

VideoMLA replaces per-head KV caches in video diffusion models with a shared low-rank latent and decoupled 3D-RoPE positional keys, reducing per-token KV memory by 92.7% and improving throughput by 1.23x on a B200 while maintaining quality on VBench benchmarks.

#low-rank

Signs Beat Floats: Low-Rank Double-Binary Adaptation for On-Device Fine-Tuning

arXiv cs.LG ↗ · 2026-05-26 Cached

LoRDBA replaces LoRA's floating-point low-rank factors with binary sign carriers and channel-wise scales, enabling efficient on-device fine-tuning with significant footprint reduction and minimal latency overhead, matching fp16 quality.

#low-rank

Modality-Decoupled Online Recursive Editing

arXiv cs.LG ↗ · 2026-05-21 Cached

Proposes M-ORE, a modality-decoupled online recursive editor for lifelong adaptation of multimodal large language models, addressing cross-modal conflict and inter-edit interference with constant per-edit overhead.

#low-rank

Catching a Moving Subspace: Low-Rank Bandits Beyond Stationarity

arXiv cs.LG ↗ · 2026-05-21 Cached

This paper studies piecewise-stationary low-rank linear contextual bandits, proposes the SPSC algorithm that achieves dynamic regret scaling with the intrinsic rank instead of the ambient dimension, and characterizes the identification boundary for subspace recovery under scalar feedback.

#low-rank

Orth-Dion: Eliminating Geometric Mismatch in Distributed Low-Rank Spectral Optimization

arXiv cs.LG ↗ · 2026-05-19 Cached

This paper identifies a geometric mismatch in the Dion low-rank spectral optimizer and proposes Orth-Dion, which replaces column normalization with QR orthogonalization to close the convergence gap to full-rank methods like Muon at the same communication cost, validated on large-scale language model pre-training.
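The difference between the two operations named in this summary can be seen on a single tall matrix. The sketch below is not the Dion or Orth-Dion update; it only compares column normalization with QR orthogonalization on a random stand-in factor `P`, whose shape is a hypothetical choice. Column normalization gives unit-length columns that can still overlap, while the Q factor of a QR decomposition has orthonormal columns.

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.standard_normal((256, 8))        # hypothetical tall low-rank factor

# Column normalization: each column scaled to unit length independently.
col_normed = P / np.linalg.norm(P, axis=0, keepdims=True)

# QR orthogonalization: Q has orthonormal columns spanning the same subspace.
Q, _ = np.linalg.qr(P)

def orthogonality_gap(M):
    """Frobenius distance between M.T @ M and the identity."""
    return np.linalg.norm(M.T @ M - np.eye(M.shape[1]))

print("column-normalized gap:", orthogonality_gap(col_normed))
print("QR gap:               ", orthogonality_gap(Q))
```

The column-normalized gap stays well above zero because the off-diagonal inner products are untouched, while the QR gap sits at floating-point noise.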

#low-rank

Δ-Mem: Efficient Online Memory for Large Language Models

Hacker News Top ↗ · 2026-05-16 Cached

Proposes Δ-Mem, a lightweight online memory mechanism that uses a compact state matrix updated by delta-rule learning to improve long-context performance of frozen LLMs without full fine-tuning or context extension.
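The summary does not give the exact update, so the sketch below uses the classical delta rule (Widrow-Hoff) as an assumed stand-in: the state matrix `S` is nudged so that reading it with key `k` returns something closer to value `v`. The function name `delta_update`, the step size `beta`, and the dimensions are hypothetical.

```python
import numpy as np

def delta_update(S, k, v, beta=1.0):
    """One delta-rule step: S <- S + beta * (v - S k) k^T."""
    prediction = S @ k                   # what the memory currently returns for k
    return S + beta * np.outer(v - prediction, k)

rng = np.random.default_rng(0)
d_k, d_v = 8, 4                          # hypothetical key and value dimensions
S = np.zeros((d_v, d_k))                 # compact state matrix, initially empty

k = rng.standard_normal(d_k)
k /= np.linalg.norm(k)                   # unit-norm key
v = rng.standard_normal(d_v)

S = delta_update(S, k, v)                # write the (k, v) association
print(np.allclose(S @ k, v))
```

With a unit-norm key and `beta=1.0`, a single step stores the association exactly, because `S @ k` moves by `(v - S @ k) * (k @ k)`. Smaller `beta` values blend the new association with what the state already holds.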

#low-rank

Asymmetric Flow Models

Hugging Face Daily Papers ↗ · 2026-05-13 Cached

Asymmetric Flow Modeling (AsymFlow) restricts noise prediction to low-rank subspaces for efficient high-dimensional flow-based generation, achieving state-of-the-art results on ImageNet and text-to-image tasks by fine-tuning from latent flow models.
