gpu-checkpointing


@charles_irl: Inference isn't everything, but it does require a new stack -- not Kubernetes, not SLURM. At @modal, we dove deep to bu…

X AI KOLs Following · yesterday

Modal engineers detail their approach to achieving truly serverless GPUs for AI inference, combining cloud buffers, a custom content-addressed filesystem, and CPU/GPU checkpoint/restore to scale replicas in tens of seconds instead of minutes.
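The content-addressed filesystem is the most self-contained of those ideas, and a toy version shows why it helps cold starts: blobs are keyed by the hash of their bytes, so identical chunks (e.g. shared base-image layers or model weights) are stored and fetched once across replicas. This is a minimal, hypothetical Python sketch, not Modal's actual implementation; the `ContentAddressedStore` class and its on-disk layout are assumptions for illustration.

```python
import hashlib
from pathlib import Path

class ContentAddressedStore:
    """Toy content-addressed blob store: each blob is stored under
    the SHA-256 of its bytes, so identical content is written once."""

    def __init__(self, root: str):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        path = self.root / digest
        if not path.exists():   # dedupe: identical content already stored
            path.write_bytes(data)
        return digest           # the address *is* the content hash

    def get(self, digest: str) -> bytes:
        data = (self.root / digest).read_bytes()
        # Integrity check falls out for free: re-hash and compare.
        assert hashlib.sha256(data).hexdigest() == digest
        return data

# Two replicas pulling the same chunk resolve to one stored blob.
store = ContentAddressedStore("/tmp/cas")
addr = store.put(b"model-weights-chunk")
assert store.get(addr) == b"model-weights-chunk"
```

Because addresses are derived from content rather than location, caching and verification need no coordination: any replica holding the hash can serve or validate the blob.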
