Tag
ART (Art-based Reinforcement Training) enables parameter-efficient fine-tuning of frozen multimodal LLMs by optimizing raw visual input via gradient backpropagation, achieving performance comparable to LoRA while supporting pre-compiled computational graphs for high-throughput engines like vLLM.