Discusses a technique for achieving truly serverless GPUs for AI inference: instead of loading the full container image before a container starts, the container starts right away and the image is loaded asynchronously.
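
The idea can be illustrated with a minimal sketch. It is not the implementation the tagged material describes; it assumes the image is split into fixed blocks that can be fetched independently, and `LazyImage`, `fake_fetch`, and the block count are hypothetical names chosen for illustration. Startup only schedules background fetches, and a read waits for the single block it needs.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class LazyImage:
    """Hypothetical image whose blocks are fetched in the background."""

    def __init__(self, fetch_block, num_blocks, workers=4):
        self._fetch_block = fetch_block  # assumed: callable(index) -> bytes
        self._num_blocks = num_blocks
        self._futures = {}               # block index -> Future
        self._lock = Lock()
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def _ensure(self, index):
        # Schedule each block at most once, whether it is requested
        # by the background prefetch or by an on-demand read.
        with self._lock:
            future = self._futures.get(index)
            if future is None:
                future = self._pool.submit(self._fetch_block, index)
                self._futures[index] = future
        return future

    def prefetch_all(self):
        # Returns immediately; fetching continues on worker threads.
        for index in range(self._num_blocks):
            self._ensure(index)

    def read(self, index):
        # Blocks only until the requested block has been fetched.
        return self._ensure(index).result()

    def close(self):
        self._pool.shutdown(wait=True)


def fake_fetch(index):
    # Stand-in for a network or disk read of one image block.
    time.sleep(0.01)
    return f"block-{index}".encode()


image = LazyImage(fake_fetch, num_blocks=8)
image.prefetch_all()   # "container start": no full image load here
first = image.read(5)  # waits for block 5, not for the whole image
image.close()
```

In this sketch on-demand reads share the same queue as the prefetch, so a real system would likely prioritize the blocks a workload touches first; that ordering policy is left out here.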