This paper introduces FLAS, a flow-based activation steering method that learns a concept-conditioned velocity field to steer language model activations at inference time. On the AxBench benchmark, FLAS is the first learned method to consistently outperform in-context prompting on held-out concepts without per-concept tuning.
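The mechanism described above (integrating a learned, concept-conditioned velocity field over activations at inference time) can be illustrated with a toy sketch. Everything here is an assumption for illustration: the linear velocity field, the Euler integration, and all names (`velocity`, `steer`, `W`, `U`) are stand-ins, not the paper's actual FLAS implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 8  # toy activation dimensionality

# Hypothetical stand-in for a *learned* velocity field: in the real
# method these parameters would be trained; here they are random.
W = 0.1 * rng.standard_normal((D, D))
U = 0.1 * rng.standard_normal((D, D))

def velocity(h, c):
    """Concept-conditioned velocity field (toy linear form)."""
    return W @ h + U @ c

def steer(h, c, steps=10, dt=0.1):
    """Euler-integrate the flow to move the activation h along the
    concept-conditioned velocity field (illustrative only)."""
    for _ in range(steps):
        h = h + dt * velocity(h, c)
    return h

h = rng.standard_normal(D)   # a hidden activation from some layer
c = rng.standard_normal(D)   # an embedding of the target concept
h_steered = steer(h, c)      # steered activation, same shape as input
```

In a real setup, `h` would be an intermediate hidden state of the language model and the steered activation would replace it in the forward pass; this sketch only shows the flow-integration shape of the idea.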
The PSRD framework halves multimodal hallucination in LVLMs via phase-wise self-reward decoding and a distilled lightweight reward model, without requiring extra supervision.