Tag
dMX is a differentiable mixed-precision quantization framework that learns per-layer floating-point bit-width assignments for LLMs, targeting the MXFP family of formats defined by the OCP standard. It relaxes the discrete bit-width choice into a continuous optimization problem, using temperature-based annealing and a budget-aware regularization term, and is reported to consistently outperform KL-divergence-based heuristics on Llama, Qwen3, and SmolLM2 models.
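As a rough illustration of the idea in the summary above, the sketch below relaxes a per-layer bit-width choice into a temperature-scaled softmax and adds a penalty when the average expected bit-width exceeds a budget. It is not the dMX implementation: the candidate widths (4, 6, 8, matching the element widths of MXFP4, MXFP6, and MXFP8), the squared-hinge penalty, and every name (`BITS`, `expected_bits`, `budget_penalty`) are assumptions made for this example.

```python
import numpy as np

# Assumed candidate element bit-widths (MXFP4, MXFP6, MXFP8); hypothetical choice.
BITS = np.array([4.0, 6.0, 8.0])


def softmax(logits, temperature):
    """Temperature-scaled softmax over the last axis (numerically stabilized)."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def expected_bits(logits, temperature):
    """Per-layer expected bit-width under the relaxed (soft) assignment."""
    return softmax(logits, temperature) @ BITS


def budget_penalty(logits, temperature, budget_bits):
    """Squared hinge on the average expected bit-width (assumed form of the regularizer)."""
    avg = expected_bits(logits, temperature).mean()
    return max(0.0, avg - budget_bits) ** 2


# One row of learnable logits per layer (three hypothetical layers).
logits = np.array([[0.0, 1.0, 2.0],
                   [2.0, 1.0, 0.0],
                   [0.0, 0.0, 0.0]])

# Lowering the temperature sharpens each row toward a single bit-width.
for t in (4.0, 1.0, 0.05):
    print(t, expected_bits(logits, t), budget_penalty(logits, t, 6.0))
```

In a training loop, the penalty would be added to the task loss and the temperature lowered over time, so that each layer's soft assignment gradually commits to one format.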
Introduces DisjunctiveNet, a unified end-to-end framework for enforcing hard, input-dependent mixed-integer linear constraints within neural networks via differentiable convexified optimization layers, achieving perfect rule satisfaction on real-world datasets.
Introduces Graph Normalization, a differentiable dynamical system for approximating the Maximum Weight Independent Set problem, with convergence guarantees and applications in structured sparse attention and constrained optimization.