EmpiriGraph-Psy: A Dataset and LLM Pipeline for Extracting Empirical Relation Graphs from Psychology Abstracts

Hugging Face Daily Papers 06/06/26, 12:00 AM Papers

psychology relation-extraction knowledge-graphs llm-pipeline dataset scientific-extraction empirical-graphs

Summary

This paper introduces variable-centered empirical graph extraction for psychology abstracts, constructing the EmpiriGraph-Psy benchmark dataset of 210 annotated abstracts and a staged LLM pipeline that achieves a macro-F1 of 0.74, outperforming direct extraction methods.

Existing scientific relation extraction benchmarks mainly target domains such as computer science, where entities are tasks, methods, datasets, materials, or metrics. This leaves a gap in variable-oriented empirical fields such as psychology, where findings are expressed as relations among constructs, measurements, interventions, and outcomes. We introduce variable-centered empirical graph extraction, the task of mapping scientific abstracts to typed graphs whose nodes are normalized variables and whose edges represent empirical and hierarchical relations. To support this task, we construct EmpiriGraph-Psy, a benchmark of 210 psychology abstracts annotated by domain-trained annotators with normalized variables, concept hierarchies, empirical relation types, and validation states. We evaluate frontier and open-weight LLMs using both direct extraction and a staged graph-construction pipeline that separates variable extraction, normalization, hierarchy construction, evidence selection, relation extraction, and edge validation. The staged pipeline substantially outperforms direct extraction, with the best configuration achieving a macro-F1 of 0.74. Error analysis shows that moderation relations and concept hierarchies remain the most challenging cases, highlighting the difficulty of extracting higher-order empirical claims and implicit abstraction structure from scientific abstracts.
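As an illustration of the target output described above, a typed graph with normalized variable nodes and typed edges could be represented roughly as follows. This is a minimal sketch, not the paper's schema: the class names, the `RelationType` labels other than moderation and the hierarchical edge, the boolean `validated` flag, and the toy normalization rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class RelationType(Enum):
    # Hypothetical label set. The abstract only names moderation and
    # hierarchical relations explicitly; the rest are placeholders.
    ASSOCIATION = "association"
    CAUSAL_EFFECT = "causal_effect"
    MODERATION = "moderation"
    IS_A = "is_a"  # hierarchical edge from a variable to a broader concept


@dataclass(frozen=True)
class Variable:
    name: str  # normalized variable name


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: RelationType
    validated: bool = False  # stand-in for the paper's "validation states"


@dataclass
class EmpiricalGraph:
    variables: dict[str, Variable] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)

    def add_variable(self, surface_form: str) -> Variable:
        # Toy normalization (assumption): lowercase and collapse whitespace.
        key = " ".join(surface_form.lower().split())
        return self.variables.setdefault(key, Variable(key))

    def add_edge(self, source: str, target: str,
                 relation: RelationType, validated: bool = False) -> Edge:
        # Endpoints are normalized first, so different surface forms of the
        # same variable resolve to the same node.
        s = self.add_variable(source)
        t = self.add_variable(target)
        edge = Edge(s.name, t.name, relation, validated)
        self.edges.append(edge)
        return edge

    def edges_of_type(self, relation: RelationType) -> list[Edge]:
        return [e for e in self.edges if e.relation is relation]


# Invented example content, not taken from the dataset.
graph = EmpiricalGraph()
graph.add_edge("Sleep  Quality", "Depressive Symptoms",
               RelationType.ASSOCIATION, validated=True)
graph.add_edge("depressive symptoms", "Mental Health", RelationType.IS_A)
```

Keeping empirical and hierarchical edges in one edge list with a type tag makes it easy to score them separately. A moderation claim involves three variables (a moderator and the relation it modifies), so a fuller schema would likely need an extra field for it.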

Original Article


Cached at: 06/10/26, 12:13 AM


Source: https://huggingface.co/papers/2606.08362

Abstract

Variable-centered empirical graph extraction maps psychology abstracts to typed graphs with normalized variables and empirical relations, achieving improved performance through staged pipeline approaches.

Existing scientific relation extraction benchmarks mainly target domains such as computer science, where entities are tasks, methods, datasets, materials, or metrics. This leaves a gap in variable-oriented empirical fields such as psychology, where findings are expressed as relations among constructs, measurements, interventions, and outcomes. We introduce variable-centered empirical graph extraction, the task of mapping scientific abstracts to typed graphs whose nodes are normalized variables and whose edges represent empirical and hierarchical relations. To support this task, we construct EmpiriGraph-Psy, a benchmark of 210 psychology abstracts annotated by domain-trained annotators with normalized variables, concept hierarchies, empirical relation types, and validation states. We evaluate frontier and open-weight LLMs using both direct extraction and a staged graph-construction pipeline that separates variable extraction, normalization, hierarchy construction, evidence selection, relation extraction, and edge validation. The staged pipeline substantially outperforms direct extraction, with the best configuration achieving a macro-F1 of 0.74. Error analysis shows that moderation relations and concept hierarchies remain the most challenging cases, highlighting the difficulty of extracting higher-order empirical claims and implicit abstraction structure from scientific abstracts.
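For readers unfamiliar with the metric, the sketch below shows one common way a macro-F1 over relation types can be computed. It is an assumption-laden illustration: the abstract does not state which classes are averaged or how matches are decided, so the exact-match criterion on `(source, target, relation_type)` triples and the function name `macro_f1` are hypothetical.

```python
from collections import defaultdict


def macro_f1(gold, predicted):
    """Unweighted mean of per-relation-type F1 scores.

    gold, predicted: sets of (source, target, relation_type) triples.
    A predicted triple counts as correct only on exact match (assumption).
    """
    gold_by_type = defaultdict(set)
    pred_by_type = defaultdict(set)
    for s, t, r in gold:
        gold_by_type[r].add((s, t))
    for s, t, r in predicted:
        pred_by_type[r].add((s, t))

    relation_types = set(gold_by_type) | set(pred_by_type)
    if not relation_types:
        return 0.0

    scores = []
    for r in relation_types:
        tp = len(gold_by_type[r] & pred_by_type[r])
        fp = len(pred_by_type[r] - gold_by_type[r])
        fn = len(gold_by_type[r] - pred_by_type[r])
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is empty.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Invented toy triples, not taken from the dataset.
gold = {("a", "b", "association"), ("c", "d", "association"),
        ("m", "a", "moderation")}
pred = {("a", "b", "association"), ("x", "y", "association")}
score = macro_f1(gold, pred)  # association F1 = 0.5, moderation F1 = 0.0
```

Because every relation type contributes equally to the average, a rare class that is missed entirely (moderation in this toy case) pulls the score down as much as a frequent one would.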





Similar Articles

When Cognitive Graphs Meet LLMs: BDEI Cognitive Pathways for Panic Emotional Arousal Prediction

MHGraphBench: Knowledge Graph-Grounded Benchmarking of Mental Health Knowledge in Large Language Models

Enhancing Metacognitive AI: Knowledge-Graph Population with Graph-Theoretic LLM Enrichment

GraphARC: A Comprehensive Benchmark for Graph-Based Abstract Reasoning

Graphs of Research: Citation Evolution Graphs as Supervision for Research Idea Generation
