grounding

#grounding

We put a design question to ten models: what’s the best way to reach a correct answer? They didn’t take a side — they prescribed the right tool for each kind of question. RoundTable already had one. So we built the other.

Reddit r/artificial · yesterday

Ten AI models were asked about the best approach for answering questions; they recommended a council for high-stakes decisions and a grounded fact-checker for factual queries. This led RoundTable to build 'Check mode', a new feature pairing a strong model with a web-grounded fact-checker.

GAVEL: Grounded Caption Error Verification and Localization

arXiv cs.CL · 2d ago

GAVEL introduces a new task for verifying, explaining, and localizing errors in image-text pairs, along with a dataset and benchmark. A supervised baseline shows improvements over strong closed-source models.

Ground Then Rank: Revisiting Knowledge-Based VQA with Training-Free Entity Identification

arXiv cs.CL · 4d ago

This paper proposes a training-free 'identify-before-answer' (IBA) framework for Knowledge-Based Visual Question Answering (KB-VQA) that decouples entity identification from evidence ranking, outperforming fine-tuned multi-modal retrieval-augmented generation baselines while reducing complexity.

DiagFlowBench: Evaluating How Language Models Handle Off-Procedure Inputs in Grounded Diagnostic Dialogue

arXiv cs.AI · 2026-06-17

This paper introduces DiagFlowBench, a benchmark dataset of 1,676 multi-turn diagnostic conversations derived from industrial flowcharts, designed to evaluate how well language models handle off-procedure inputs and abstain from giving inappropriate advice.

Seeing Before Reasoning: Decoupling Perception and Reasoning for Shortcut-Resilient Multimodal On-Policy Self-Distillation

Hugging Face Daily Papers · 2026-06-17

This paper introduces ViGOS, a method for multimodal on-policy self-distillation that decouples perception and reasoning by having the student model first produce a visual description before reasoning, reducing shortcut reliance and improving image-grounding behavior.

Thinking with Visual Grounding

Hugging Face Daily Papers · 2026-06-15

This paper introduces visually grounded thinking, a method for vision-language models to interleave natural-language reasoning with explicit visual evidence grounding using points or boxes. A scalable synthesis pipeline and grounding-aware reinforcement learning improve reasoning accuracy, enabling a 4B model to match or surpass a 27B model on spatial and counting benchmarks.

Helping Figures Tell their Story! Paper-Grounded Video Generation Explaining Complex Scientific Figures

arXiv cs.CL · 2026-06-12

Introduces MINARD, a pipeline for generating narrated, region-grounded walkthrough videos from scientific figures and their papers, along with the FigTalk benchmark and new grounding metrics.

Do Vision-Language Models See or Guess? Measuring and Reducing Textual-Prior Reliance with a Phrasing-Controlled Benchmark

arXiv cs.CL · 2026-06-10

This paper introduces a phrasing-controlled benchmark to measure how much vision-language models rely on textual priors versus image content. Experiments across eleven models show significant degradation when text leakage is minimized, and the authors demonstrate that in-context learning and GRPO post-training can reduce this reliance.

Reroute, Don't Remove: Recoverable Visual Token Routing for Vision-Language Models

Hugging Face Daily Papers · 2026-06-10

Proposes Reroute, a training-free plug-in for vision-language models that replaces irreversible visual-token pruning with recoverable routing, allowing tokens to re-enter the pipeline later to improve grounding under aggressive token reduction while maintaining VQA performance.

Bootstrapping Semantic Layer from Execution for Text-to-SQL

arXiv cs.CL · 2026-06-05

Introduces GATE (Grounding After Test from Execution), a method that bootstraps missing semantic groundings from execution feedback to handle under-specified user phrases in text-to-SQL tasks, consistently improving over strong baselines.

@DataChaz: NVIDIA just pulled off something crazy: making bounding box detection 10x faster by ripping out the exact step the enti…

X AI KOLs Timeline · 2026-06-01

NVIDIA researchers developed a technique to speed up bounding box detection by 10x by eliminating the autoregressive token-by-token prediction step used in VLM grounding models.

Emergent Semantic Representations in World Models through Physical Interaction without Linguistic Supervision

arXiv cs.LG · 2026-05-29

This paper demonstrates that training a world model through random physical exploration leads to latent representations that encode spatial semantic structure (direction and position) without any linguistic supervision, highlighting physical geometry as the organizing principle.

Graph Alignment Topology as an Inductive Bias for Grounding Detection

arXiv cs.CL · 2026-05-25

This paper introduces Graph Alignment Topology as an inductive bias for grounding detection, using a graph neural network to model alignment structure between reference information and LLM outputs. The method achieves state-of-the-art results on multiple hallucination and question-answering datasets, outperforming GPT-4o.

Models Can Model, But Can't Bind: Structured Grounding in Text-to-Optimization

arXiv cs.LG · 2026-05-22

This paper introduces Text2Opt-Bench, a scalable benchmark for text-to-optimization, and identifies that LLMs struggle with 'binding' (grounding problem data) rather than 'modeling' (choosing optimization structure). The authors propose BIND, a simple inference-time method that externalizes numeric data, significantly improving accuracy across models.

Evaluated a RAG chatbot and the most expensive model was the worst performer. Notes on what actually moved the needle.

Reddit r/LocalLLaMA · 2026-05-15

A detailed evaluation of a RAG customer support chatbot reveals that retrieval issues often masquerade as LLM problems, heuristic evaluators are misleading, deduplication improves quality, stricter grounding trades helpfulness for accuracy, and model sweeping can dramatically reduce cost while improving performance.

Grounded Continuation: A Linear-Time Runtime Verifier for LLM Conversations

arXiv cs.AI · 2026-05-15

This paper introduces Grounded Continuation, a linear-time runtime verifier for LLM conversations that maintains an explicit dependency graph to detect whether a next utterance is supported by prior conversation, achieving accuracy gains over baselines on benchmarks including LongMemEval and LoCoMo.

Falcon Perception

Hugging Face Blog · 2026-04-01

Falcon Perception is a 0.6B-parameter early-fusion Transformer model released by TII UAE for open-vocabulary grounding and segmentation from natural language prompts, utilizing hybrid attention and specialized heads.

FACTS Grounding: A new benchmark for evaluating the factuality of large language models

Google DeepMind Blog · 2024-12-17

DeepMind introduces FACTS Grounding, a comprehensive benchmark with 1,719 examples for evaluating how accurately large language models ground their responses in source material and avoid hallucinations. The benchmark includes a public dataset and an online Kaggle leaderboard tracking LLM performance on factual accuracy and grounding tasks.
