metric-learning

Beyond Scalar Distances: Semantic Attribute Gradients from Frozen MLLMs for Visual Embeddings

Hugging Face Daily Papers · 2026-06-13

The SAGA framework uses frozen multimodal large language models (MLLMs) to provide attribute-aware supervision for vision encoders via Group Relative Policy Optimization (GRPO), improving zero-shot image retrieval by 3–6 points on fine-grained benchmarks.
