Tag
This paper proposes a fine-grained concept bottleneck model framework that grounds each concept in localized visual evidence, enabling direct verification of concept correctness and improving transparency in medical imaging tasks.
This paper introduces DVMap, a framework for fine-grained pluralistic value alignment in LLMs that uses high-consensus demographic-value mapping instead of coarse national labels, achieving strong generalization across demographics, countries, and values.
Researchers from Peking University introduce CFMS, the first fine-grained Chinese multimodal sarcasm detection benchmark with 2,796 image-text pairs and a triple-level annotation framework (sarcasm identification, target recognition, explanation generation), along with a novel RL-augmented in-context learning method (PGDS) that significantly outperforms existing baselines.