lossless-optimization


TIDE: Efficient and Lossless MoE Diffusion LLM Inference with I/O-aware Expert Offload

Hugging Face Daily Papers · 2026-05-19 Cached

TIDE is a lossless inference system for mixture-of-experts (MoE) diffusion large language models. It exploits the temporal stability of expert activations to reduce I/O overhead and computation under expert offloading, achieving throughput improvements of up to 1.4-1.5x on single GPU-CPU systems.
