kernel-co-design

Meta's Optimized RecSys Inference (58 minute read)

TLDR AI · 2026-05-08

Meta's In-Kernel Broadcast Optimization (IKBO) eliminates the redundant broadcast of user embeddings in RecSys inference through kernel-model-system co-design. It cuts latency by up to two-thirds, delivers a roughly 4x speedup on H100 GPUs, and serves as the backbone of the Meta Adaptive Ranking Model.
