PreFT proposes applying adapters only to prefill tokens and discarding them during decoding, which increases throughput for multi-adapter serving with minimal loss in task performance.
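
The prefill-only idea can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the low-rank (LoRA-style) adapter form, the names `layer` and `run_request`, and the single linear layer standing in for a model. It is not PreFT's actual implementation, only a picture of "adapter on for prompt tokens, adapter off for generated tokens".

```python
# Toy sketch (hypothetical names, assumed LoRA-style adapter): the adapter is
# applied while processing prompt tokens and dropped for every decode step.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def layer(x, W, adapter=None):
    """Base projection W x, plus the low-rank update B (A x) if an adapter is given."""
    y = matvec(W, x)
    if adapter is not None:
        A, B = adapter
        delta = matvec(B, matvec(A, x))
        y = [yi + di for yi, di in zip(y, delta)]
    return y

def run_request(prompt_vecs, n_decode_steps, W, adapter):
    # Prefill: every prompt token passes through the adapted layer.
    prefill_out = [layer(x, W, adapter) for x in prompt_vecs]
    # Decode: the adapter is discarded, so each step uses base weights only.
    # Feeding the previous output back in is a toy stand-in for a decode loop.
    x = prefill_out[-1]
    decode_out = []
    for _ in range(n_decode_steps):
        x = layer(x, W, adapter=None)
        decode_out.append(x)
    return prefill_out, decode_out

# Example values chosen only for illustration.
W = [[1, 0], [0, 1]]       # base weight (identity)
adapter = ([[1, 0]], [[1], [0]])  # rank-1 A (1x2) and B (2x1)
prefill_out, decode_out = run_request([[1, 2]], 2, W, adapter)
```

Because the decode path in this sketch touches only the base weights, decode steps from requests that use different adapters could in principle share one base-model batch, which is one plausible reading of where the throughput gain for multi-adapter serving comes from.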