segmentation

#segmentation

MAOAM: Unified Object and Material Selection with Vision-Language Models

Hugging Face Daily Papers · 2026-06-02

This paper presents MAOAM, a unified vision-language model framework for precise object and material selection via text or click interactions in interactive image editing. It introduces a scalable data generation pipeline and reports an emergent improvement when text and click prompts are combined at inference time.

One Click per Cell Type Suffices: Training-free Group Interaction for Cell Instance Segmentation

Hugging Face Daily Papers · 2026-05-28

Group Prompting is a training-free framework for cell instance segmentation that requires only one click per cell type. It recursively expands prompts in the Segment Anything Model's feature space and achieves competitive performance without any training.

InstructSAM: Segment Any Instance with Any Instructions

Hugging Face Daily Papers · 2026-05-25

InstructSAM is a unified framework for multi-instance segmentation that uses instruction-driven queries to bridge vision-language models and SAM3, achieving strong results across complex benchmarks.

Semantic Generative Tuning for Unified Multimodal Models

Hugging Face Daily Papers · 2026-05-18

Introduces Semantic Generative Tuning (SGT), a paradigm that uses image segmentation as a generative proxy task to align visual understanding and generation in unified multimodal models, improving both comprehension and generation fidelity.

AuralSAM2: Enabling SAM2 Hear Through Pyramid Audio-Visual Feature Prompting

Hugging Face Daily Papers · 2026-05-14

AuralSAM2 integrates audio into SAM2 via an AuralFuser module that generates sparse and dense prompts from audio-visual features, enhancing cross-modal segmentation while maintaining interactive efficiency.

From Pixels to Concepts: Do Segmentation Models Understand What They Segment?

Hugging Face Daily Papers · 2026-05-10

Introduces CAFE, a benchmark for evaluating whether promptable segmentation models truly understand concepts by using counterfactual attribute manipulation, revealing that accurate mask prediction does not guarantee faithful semantic grounding.

TwinTrack: Post-hoc Multi-Rater Calibration for Medical Image Segmentation

Hugging Face Daily Papers · 2026-04-17

TwinTrack is a post-hoc calibration framework for pancreatic cancer segmentation that aligns ensemble model probabilities with the empirical mean human response across multiple annotators, improving interpretability and calibration metrics on multi-rater benchmarks.

Falcon Perception

Hugging Face Blog · 2026-04-01

Falcon Perception is a 0.6B-parameter early-fusion Transformer model released by TII UAE for open-vocabulary grounding and segmentation from natural language prompts, utilizing hybrid attention and specialized heads.
