Tag
Proposes ProMSA, a progressive multimodal search agent for knowledge-based visual question answering that adaptively selects search strategies and optimizes through sequence-level reinforcement learning, achieving consistent gains on E-VQA and InfoSeek.
TOTEN is a knowledge-based ontological tokenization framework that replaces statistical tokenization with declarative classification grounded in a formal ontology of engineering entities, achieving high ontological atomicity and numerical reconstruction for physical quantities and technical notation in Brazilian Portuguese.