Tag
SCAPE is a communication-efficient distributed optimizer that leverages first-moment statistics to enable extreme sparsification for LLM training, preserving accuracy while reducing wall-clock time by up to 43.3%.