hardware-constraints

@jun_song: If we ever figure out how to load ONLY the active params of an MoE into the GPU instead of the full weights, it's game …

X AI KOLs Following ↗ · 2026-05-10

The author speculates that loading only the active parameters of an MoE model onto the GPU, rather than the full weights, could drastically improve efficiency and make it possible to run large models such as Kimi locally, though the author acknowledges that this is currently impractical.
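To make the scale of the idea concrete, here is a rough back-of-the-envelope sketch of weight memory for the full model versus the active parameters only. The parameter counts and precision below are hypothetical placeholders chosen for illustration, not figures from the post or for any specific model, and the helper name `weight_memory_gb` is made up for this example. It counts weights only and ignores the KV cache and activations.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Hypothetical MoE configuration, for illustration only.
TOTAL_PARAMS = 1e12    # every expert plus the shared layers
ACTIVE_PARAMS = 30e9   # parameters actually used for a single token
BYTES_PER_PARAM = 2    # 16-bit weights

full_gb = weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM)
active_gb = weight_memory_gb(ACTIVE_PARAMS, BYTES_PER_PARAM)

print(f"full weights:   {full_gb:,.0f} GB")
print(f"active weights: {active_gb:,.0f} GB")
print(f"ratio:          {full_gb / active_gb:.1f}x")
```

One likely reason the idea remains impractical is that an MoE router picks experts per token and per layer, so the "active" set keeps changing and the inactive experts cannot simply be left off the device ahead of time.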
