vlm

@jerryjliu0: ParseBench is the first benchmark to include VLM chart understanding over enterprise documents. Existing benchmarks (Ch…

X AI KOLs Timeline · 2026-04-21

ParseBench is the first benchmark to evaluate vision-language models (VLMs) on chart comprehension within full enterprise documents, addressing gaps in prior chart-only benchmarks.

@nomadicai: The future of computer vision is agentic. 1/ We built Nomadic around a gap we kept seeing in video understanding: VLMs …

X AI KOLs Following · 2026-04-21

NomadicAI is building an agentic computer-vision product to fix VLMs' weak grounding in actual video content.

@jerryjliu0: A downside with using VLMs to parse PDFs is guaranteeing that the output text is *correct* and output in the correct re…

X AI KOLs Following · 2026-04-18

Jerry Liu discusses a downside of using vision-language models (VLMs) to parse PDFs: ensuring that the output text is correct and in the proper reading order, without hallucinations.

PersonaVLM: Long-Term Personalized Multimodal LLMs

Hugging Face Daily Papers · 2026-03-20

PersonaVLM introduces a personalized multimodal LLM framework that enables long-term user adaptation through memory retention, multi-turn reasoning, and response alignment, outperforming GPT-4o by 5.2% on the new Persona-MME benchmark.
