Tag
The article discusses the Open/Closed problem in AI through two parallels: the historical evolution of GPU hardware from general-purpose designs to specialized ASICs, and the learning loop of AI models, where open-loop training is contrasted with the closed-loop learning found in brains.
KDA is an agent-driven kernel design framework that minimizes human involvement; it helped HAN Lab achieve top rankings in the MLSys FlashInfer Kernel Contest. The agent leverages Humanize, KernelWiki, and profiler skills to produce state-of-the-art kernels.
The University of Washington SyFI team won multiple prizes at the FlashInfer AI Kernel Generation Contest held during MLSys 2026, with support from NVIDIA and Modal.
UW SyFI Lab members won multiple prizes at the MLSys'26 competition (NVIDIA Track), including 1st place in the GDN Track (Full-Agent Approach), 2nd place in the GDN Track (Agent-Assisted), and 3rd place in the DSA Track (Full-Agent Approach).
Mark Saroufim gave a keynote at the MLSys conference covering the evolution of AI systems, why AI is needed to improve them, and promising future directions. The recording will be released soon.
BLASST, a training-free dynamic sparse attention mechanism that applies a single scalar threshold to online softmax statistics, won Best Paper at MLSys 2026. It achieves speedups of 1.52x for prefill and 1.48x for decode at over 70% sparsity while preserving accuracy.
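The summary does not spell out the exact skipping rule, so the following is only a rough sketch of what thresholding online softmax statistics could look like, not the paper's implementation. It assumes attention is computed block by block for a single query, and that a block is skipped when its largest score falls more than a fixed margin below the running maximum. The function name `blocked_attention`, the parameter `log_threshold`, and the skip criterion itself are all assumptions made for illustration.

```python
import math

def blocked_attention(q, keys, values, block=4, log_threshold=None):
    """Single-query attention over key/value blocks using online softmax.

    log_threshold (hypothetical parameter): a negative scalar. When set, a
    block is skipped if (block max score - running max score) is below it,
    because every weight in that block would be at most exp(log_threshold)
    relative to the current peak. This criterion is an assumption.
    """
    scale = 1.0 / math.sqrt(len(q))
    running_max = -math.inf          # running maximum of all scores seen so far
    denom = 0.0                      # running softmax denominator
    acc = [0.0] * len(values[0])     # running weighted sum of values
    skipped = 0

    for start in range(0, len(keys), block):
        scores = [scale * sum(qi * ki for qi, ki in zip(q, key))
                  for key in keys[start:start + block]]
        block_max = max(scores)

        # Assumed skip rule: one scalar threshold on the online softmax statistic.
        if log_threshold is not None and block_max - running_max < log_threshold:
            skipped += 1
            continue

        # Standard online softmax update: rescale old state to the new maximum.
        new_max = max(running_max, block_max)
        rescale = math.exp(running_max - new_max)
        denom *= rescale
        acc = [a * rescale for a in acc]
        for s, val in zip(scores, values[start:start + block]):
            w = math.exp(s - new_max)
            denom += w
            acc = [a + w * x for a, x in zip(acc, val)]
        running_max = new_max

    return [a / denom for a in acc], skipped
```

In this sketch, passing `log_threshold=None` gives ordinary dense attention, while a value such as `math.log(1e-3)` skips blocks whose contribution would be negligible next to the scores already seen. Because the decision uses only statistics the online softmax already tracks, no training or extra pass over the keys is needed, which is consistent with the "training-free" description above.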
ExecuTorch, PyTorch's on-device AI deployment framework, won the Best Industry Paper Award at MLSysConf 2026. The paper introduces a unified solution for running models on diverse hardware, from microcontrollers to SoCs.