Modal has announced that replicas of vLLM and SGLang servers now start up 3-10x faster, thanks to improvements in GPU health management and CUDA context checkpointing.