segmentation


FBK's Long-form SpeechLLMs for IWSLT 2026 Instruction Following

arXiv cs.CL · 3d ago

This paper describes FBK's submission to the IWSLT 2026 Instruction Following shared task. The team develops SpeechLLMs for short-form and long-form speech instruction following, compares segmentation methods, and reports robust long-form performance with fixed 30-second segmentation.


CALHippo - Mapping neurons and glial cells in the human brain hippocampus in 3D using SOTA segmentation and density estimation models [R]

Reddit r/MachineLearning · 4d ago

This paper presents CALHippo, a framework for 3D mapping of neurons and glial cells in the human hippocampus using state-of-the-art segmentation and density estimation models.


MAOAM: Unified Object and Material Selection with Vision-Language Models

Hugging Face Daily Papers · 2026-06-02

This paper presents MAOAM, a unified vision-language model framework that enables precise object and material selection through text or click interactions for interactive image editing. It introduces a scalable data generation pipeline and shows emergent improvement when combining text and clicks at inference.


One Click per Cell Type Suffices: Training-free Group Interaction for Cell Instance Segmentation

Hugging Face Daily Papers · 2026-05-28

Group Prompting is a training-free framework for cell instance segmentation that requires only one click per cell type. It recursively expands prompts in the Segment Anything Model's feature space and achieves competitive performance without any training.


InstructSAM: Segment Any Instance with Any Instructions

Hugging Face Daily Papers · 2026-05-25

InstructSAM presents a unified framework for multi-instance segmentation using instruction-driven queries that bridge vision-language models and SAM3, achieving strong results across complex benchmarks.


Semantic Generative Tuning for Unified Multimodal Models

Hugging Face Daily Papers · 2026-05-18

This paper introduces Semantic Generative Tuning (SGT), a paradigm that uses image segmentation as a generative proxy to align visual understanding and generation in unified multimodal models, improving both comprehension and fidelity.


AuralSAM2: Enabling SAM2 Hear Through Pyramid Audio-Visual Feature Prompting

Hugging Face Daily Papers · 2026-05-14

AuralSAM2 integrates audio into SAM2 via an AuralFuser module that generates sparse and dense prompts from audio-visual features, enhancing cross-modal segmentation while maintaining interactive efficiency.


From Pixels to Concepts: Do Segmentation Models Understand What They Segment?

Hugging Face Daily Papers · 2026-05-10

This paper introduces CAFE, a benchmark that uses counterfactual attribute manipulation to test whether promptable segmentation models truly understand the concepts they segment. It finds that accurate mask prediction does not guarantee faithful semantic grounding.


TwinTrack: Post-hoc Multi-Rater Calibration for Medical Image Segmentation

Hugging Face Daily Papers · 2026-04-17

TwinTrack is a post-hoc calibration framework for pancreatic cancer segmentation that aligns ensemble model probabilities with the empirical mean human response across multiple annotators, improving interpretability and calibration metrics on multi-rater benchmarks.


Falcon Perception

Hugging Face Blog · 2026-04-01

Falcon Perception is a 0.6B-parameter early-fusion Transformer model released by TII UAE for open-vocabulary grounding and segmentation from natural language prompts. It uses hybrid attention and specialized heads.
