Distilling Answer-Set Programming Rules from LLMs for Neurosymbolic Visual Question Answering

arXiv cs.AI 06/03/26, 04:00 AM Papers

Summary

This paper presents a method for distilling answer-set programming rules from large language models to enhance neurosymbolic visual question answering, showing that only a few examples are needed to generate correct rules.

arXiv:2606.03269v1. Announce type: new.

Abstract: Visual Question Answering (VQA) is the task of answering questions about images, requiring the integration of multimodal input and reasoning. Modular approaches that incorporate logic-based representations into the reasoning component offer clear advantages over end-to-end trained systems, particularly in terms of interpretability. However, adapting or extending these representations when task requirements change can place a significant burden on developers. To address this challenge, we present an approach for distilling rules from Large Language Models (LLMs). Our method prompts an LLM to extend an initial VQA reasoning theory, expressed as an answer-set program, to meet new requirements of the task. Examples from VQA datasets guide the LLM, validate the results, and help correct erroneous rules by leveraging feedback from the ASP solver. We demonstrate that our approach is effective across diverse VQA datasets. Notably, only a few examples are needed to elicit correct rules from LLMs. Our experiments suggest that rule distillation from LLMs is a promising alternative to traditional data-driven rule learning approaches. Under consideration in Theory and Practice of Logic Programming (TPLP).

Cached at: 06/03/26, 09:43 AM

# Distilling Answer-Set Programming Rules from LLMs for Neurosymbolic Visual Question Answering
Source: [https://arxiv.org/abs/2606.03269](https://arxiv.org/abs/2606.03269)
[View PDF](https://arxiv.org/pdf/2606.03269)

> Abstract: Visual Question Answering (VQA) is the task of answering questions about images, requiring the integration of multimodal input and reasoning. Modular approaches that incorporate logic-based representations into the reasoning component offer clear advantages over end-to-end trained systems, particularly in terms of interpretability. However, adapting or extending these representations when task requirements change can place a significant burden on developers. To address this challenge, we present an approach for distilling rules from Large Language Models (LLMs). Our method prompts an LLM to extend an initial VQA reasoning theory, expressed as an answer-set program, to meet new requirements of the task. Examples from VQA datasets guide the LLM, validate the results, and help correct erroneous rules by leveraging feedback from the ASP solver. We demonstrate that our approach is effective across diverse VQA datasets. Notably, only a few examples are needed to elicit correct rules from LLMs. Our experiments suggest that rule distillation from LLMs is a promising alternative to traditional data-driven rule learning approaches. Under consideration in Theory and Practice of Logic Programming (TPLP).
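
The abstract describes a loop in which an LLM proposes rules, dataset examples validate them, and solver feedback drives corrections. The following is a minimal sketch of that kind of loop, not the authors' implementation. Every name in it (`distill_rules`, `propose`, `solve`, the toy rules) is hypothetical, and the LLM and the ASP solver are replaced by stub functions so that the example runs on its own. A real system would presumably call an LLM API in `propose` and an ASP solver such as clingo in `solve`.

```python
from typing import Callable, List, Optional, Tuple

# One validation example: (ASP facts for scene and question, expected answer).
Example = Tuple[str, str]


def distill_rules(
    base_program: str,
    examples: List[Example],
    propose: Callable[[str, List[Example], str], str],
    solve: Callable[[str], str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Request rules that extend base_program; retry with feedback on failure."""
    feedback = ""
    for _ in range(max_rounds):
        rules = propose(base_program, examples, feedback)
        failures = []
        for facts, expected in examples:
            try:
                answer = solve(base_program + "\n" + rules + "\n" + facts)
            except ValueError as err:  # assumed stand-in for solver errors
                failures.append(f"solver error: {err}")
                continue
            if answer != expected:
                failures.append(f"expected {expected!r}, got {answer!r}")
        if not failures:
            return rules  # all examples pass
        feedback = "\n".join(failures)  # passed back to the next proposal
    return None  # no valid rules within the round budget


# Stub "LLM": deliberately wrong on the first call, corrected once it gets feedback.
def fake_propose(program: str, examples: List[Example], feedback: str) -> str:
    if not feedback:
        return "ans(no) :- exists(X)."
    return "ans(yes) :- exists(X)."


# Stub "solver": a string check, not real answer-set semantics.
def fake_solve(program: str) -> str:
    return "yes" if "ans(yes)" in program else "no"


examples = [("exists(cube).", "yes")]
rules = distill_rules("% base theory", examples, fake_propose, fake_solve)
print(rules)
```

Passing `propose` and `solve` in as callables keeps the control flow separate from any particular model or solver, which is a design choice of this sketch rather than something stated in the abstract.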

## Submission history

From: Nelson Higuera \[[view email](https://arxiv.org/show-email/82bd9d7c/2606.03269)\] **\[v1\]** Tue, 2 Jun 2026 07:35:31 UTC (4,544 KB)

Similar Articles

Neural Module Networks for Visual Question Answering

When No Answer Is Correct: Diagnosing Absent Answer Detection for MLLMs in Video Understanding

Visual Graph Scaffolds for Structural Reasoning in Large Language Models

@neural_avb: https://x.com/neural_avb/status/2063907440509571354

Learning to reason with LLMs
