Tag
Introduces Eggroll, a low-rank evolution strategy for gradient-free training of spiking neural networks, reducing memory and time overhead while achieving competitive accuracy on N-MNIST.
VideoMLA replaces per-head KV caches in video diffusion models with a shared low-rank latent and decoupled 3D-RoPE positional keys, reducing per-token KV memory by 92.7% and improving throughput by 1.23× on a B200 GPU while maintaining quality on VBench.
LoRDBA replaces LoRA's floating-point low-rank factors with binary sign carriers and channel-wise scales, enabling on-device fine-tuning that matches fp16 quality with a significantly smaller footprint and minimal latency overhead.
Proposes M-ORE, a modality-decoupled online recursive editor for lifelong adaptation of multimodal large language models, addressing cross-modal conflict and inter-edit interference with constant per-edit overhead.
This paper studies piecewise-stationary low-rank linear contextual bandits, proposes the SPSC algorithm, whose dynamic regret scales with the intrinsic rank rather than the ambient dimension, and characterizes the identification boundary for subspace recovery under scalar feedback.
This paper identifies a geometric mismatch in the Dion low-rank spectral optimizer and proposes Orth-Dion, which replaces column normalization with QR orthogonalization, closing the convergence gap with full-rank methods such as Muon at the same communication cost. The method is validated on large-scale language model pre-training.
Proposes delta-Mem, a lightweight online memory mechanism that uses a compact state matrix updated by delta-rule learning to improve long-context performance of frozen LLMs without full fine-tuning or context extension.
Asymmetric Flow Modeling (AsymFlow) restricts noise prediction to low-rank subspaces for efficient high-dimensional flow-based generation, achieving state-of-the-art results on ImageNet and text-to-image tasks by fine-tuning from latent flow models.