high-speed-inference


@zephyr_z9: This is super big. I think this is the first useful speculative decoding method deployed on a big quasi-frontier model M…

X AI KOLs Following · 4d ago

Xiaomi MiMo has released MiMo-V2.5-Pro-UltraSpeed, which uses speculative decoding to generate over 1,000 tokens per second on a 1-trillion-parameter model, described as the first practical deployment of this speed at scale.
