visual-language-models

Zero-Shot Learning in Industrial Scenarios: New Large-Scale Benchmark, Challenges and Baseline

arXiv cs.AI · 2026-06-09

This paper proposes a large-scale multi-modal dataset (MMIO) for zero-shot industrial defect detection and introduces the Refined Text-Visual Prompt (RTVP) method, achieving state-of-the-art results on the benchmark.

From Data to Insights: Exploring Program-of-Thoughts Prompting for Chart Summarization

arXiv cs.CL · 2026-05-29

This paper introduces a zero-shot strategy for chart summarization using Program-of-Thoughts prompting, where lightweight visual language models (VLMs) generate Python programs to compute statistics, improving factual accuracy over existing methods.
