Enhancing Metacognitive AI: Knowledge-Graph Population with Graph-Theoretic LLM Enrichment

arXiv cs.AI 05/19/26, 04:00 AM Papers

metacognitive-ai knowledge-graph llm-enrichment graph-theoretic self-directed-knowledge-repair rag benchmark

Summary

MetaKGEnrich is a fully automated pipeline that uses graph metrics to detect knowledge gaps in LLM applications, retrieves web evidence to fill them, and improves answer quality on 80-87% of queries across three benchmark datasets.

arXiv:2605.16676v1 Announce Type: new Abstract: Metacognition, the ability to monitor one's own knowledge state, spot gaps, and autonomously fill them, remains largely absent from modern AI. Here, we present MetaKGEnrich, a fully automated pipeline that endows large language model (LLM) applications with self-directed knowledge repair. The system (i) builds knowledge graphs from a seed query, (ii) detects sparse regions via seven graph metrics, (iii) has GPT-4o generate targeted questions, (iv) retrieves web evidence with Tavily and ingests it into Neo4j, and (v) re-answers the query with GraphRAG so that GPT-4 can evaluate the improvement. The system was tested on 30 queries from each of three widely used datasets: Google Research Natural Questions, MS MARCO, and HotpotQA. MetaKGEnrich improved answer quality in 80% of HotpotQA questions, 87% of Google Research Natural Questions questions, and 83% of MS MARCO questions, while preserving well-supported regions. This proof of concept demonstrates how topological self-diagnosis plus targeted retrieval can advance AI toward humanlike metacognitive learning.
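
Step (ii), detecting sparse regions from graph structure, can be illustrated with a minimal sketch. The abstract does not name the seven metrics, so the two used here (node degree and local clustering coefficient), the thresholds, the function names, and the toy graph are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (source, target) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def sparse_nodes(adj, max_degree=1, max_clustering=0.0):
    """Flag nodes with low degree AND low clustering (hypothetical thresholds)."""
    return [
        node
        for node in sorted(adj)
        if len(adj[node]) <= max_degree
        and local_clustering(adj, node) <= max_clustering
    ]


# Toy graph invented for illustration; not data from the paper.
edges = [
    ("Neo4j", "GraphRAG"),
    ("GraphRAG", "LLM"),
    ("LLM", "Neo4j"),
    ("LLM", "Tavily"),
]
adj = build_adjacency(edges)
gaps = sparse_nodes(adj)
print(gaps)  # ['Tavily']
```

In this toy graph only `Tavily` is flagged, because it hangs off a single edge while the other three nodes form a triangle. In a pipeline like the one described, flagged nodes would be the candidates for targeted question generation and retrieval.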

Cached at: 05/19/26, 06:35 AM

# Enhancing Metacognitive AI: Knowledge-Graph Population with Graph-Theoretic LLM Enrichment
Source: [https://arxiv.org/abs/2605.16676](https://arxiv.org/abs/2605.16676)
[View PDF](https://arxiv.org/pdf/2605.16676)

> Abstract: Metacognition, the ability to monitor one's own knowledge state, spot gaps, and autonomously fill them, remains largely absent from modern AI. Here, we present MetaKGEnrich, a fully automated pipeline that endows large language model (LLM) applications with self-directed knowledge repair. The system (i) builds knowledge graphs from a seed query, (ii) detects sparse regions via seven graph metrics, (iii) has GPT-4o generate targeted questions, (iv) retrieves web evidence with Tavily and ingests it into Neo4j, and (v) re-answers the query with GraphRAG so that GPT-4 can evaluate the improvement. The system was tested on 30 queries from each of three widely used datasets: Google Research Natural Questions, MS MARCO, and HotpotQA. MetaKGEnrich improved answer quality in 80% of HotpotQA questions, 87% of Google Research Natural Questions questions, and 83% of MS MARCO questions, while preserving well-supported regions. This proof of concept demonstrates how topological self-diagnosis plus targeted retrieval can advance AI toward humanlike metacognitive learning.
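
The abstract reports improvement rates only as percentages of 30 queries per dataset. As a rough check, the sketch below works out which whole-number counts are consistent with each percentage. It assumes the percentages are rounded to the nearest integer and that all 30 queries per dataset were scored; neither is stated in the abstract, and the helper name is hypothetical.

```python
QUERIES_PER_DATASET = 30  # stated in the abstract

# Reported share of questions with improved answers, in percent.
reported = {"HotpotQA": 80, "Natural Questions": 87, "MS MARCO": 83}


def consistent_counts(percent, total):
    """Return every integer count whose rounded percentage of total matches."""
    return [k for k in range(total + 1) if round(100 * k / total) == percent]


counts = {
    name: consistent_counts(percent, QUERIES_PER_DATASET)
    for name, percent in reported.items()
}
for name, ks in counts.items():
    print(f"{name}: {ks} of {QUERIES_PER_DATASET}")
```

Under those assumptions, each percentage matches exactly one count: 24, 26, and 25 improved answers out of 30, respectively.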

## Submission history

From: Brendan Conway-Smith [[view email](https://arxiv.org/show-email/bdc07f1f/2605.16676)] **[v1]** Fri, 15 May 2026 22:32:07 UTC (762 KB)

Similar Articles

Stepwise Reasoning Enhancement for LLMs via External Subgraph Generation

I built an open-source Knowledge Graph pipeline with hybrid retrieval to improve LLM multi-hop reasoning [P]

AgentKGV: Agentic LLM-RAG Framework with Two-Stage Training for the Fact Verification of Knowledge Graphs

SAGE: Scalable Automated Robustness Augmentation for LLM Knowledge Evaluation

MHGraphBench: Knowledge Graph-Grounded Benchmarking of Mental Health Knowledge in Large Language Models
