Tag
This paper investigates how fine-tuning vision-language models to produce dense coordinate lists creates a controllable interference surface, finding that duplicate pressure can be removed without sacrificing localization accuracy.
This paper proposes a neuron-level intervention method to identify gender-specific neurons in language models (feminine, masculine, gender-neutral) and steer sentence generation toward a target gender form while preserving meaning, with experiments showing precise control and bias mitigation.
The paper identifies off-manifold drift in guided flow models under compositional rewards and proposes Conflict-Aware Additive Guidance (CAR), a lightweight method that dynamically resolves gradient conflicts to improve generation fidelity without retraining.
This paper introduces a novel adaptive scheduler for steering discrete diffusion language models using sparse autoencoders, demonstrating that targeting interventions based on when specific attributes commit improves control quality and strength over uniform methods.