Tag
InstructSAM presents a unified framework for multi-instance segmentation using instruction-driven queries that bridge vision-language models and SAM3, achieving strong results across complex benchmarks.