logical-reasoning

#logical-reasoning

Data-driven Machine Learning Cannot Reach Symbolic-level Logical Reasoning -- The Limit of the Scaling Law

arXiv cs.AI · 2026-06-26

The paper argues that data-driven machine learning systems, including GPT-5, cannot achieve symbolic-level logical reasoning through scaling alone, due to inherent limitations in distinguishing logical structures from statistical regularities.


Quantifying Consistency in LLM Logical Reasoning via Structural Uncertainty

arXiv cs.AI · 2026-06-17

This paper introduces structural uncertainty, a framework that evaluates LLM reasoning consistency by measuring the stability of self-preference rankings among sampled reasoning solutions, complementing traditional answer-dispersion methods for identifying unreliable reasoning.


ChLogic: Evaluating Robustness of Logical Reasoning in Chinese Expressions

Hugging Face Daily Papers · 2026-06-16

This paper introduces ChLogic, an English-Chinese aligned benchmark that tests whether large language models preserve logical reasoning performance across languages. It reveals persistent gaps influenced by surface realization and translation artifacts.


@mdancho84: BREAKING: MIT researchers discover how to enable LLMs to do real logical reasoning. This is what you need to know: (thr…

X AI KOLs Timeline · 2026-06-05

According to a thread on X by @mdancho84, MIT researchers have found a method that enables large language models to perform real logical reasoning.


When Rule Violations Are Rare: Chimera Training for Logical Anomaly Detection

arXiv cs.LG · 2026-05-27

The paper introduces Chimera Training, a method for logical anomaly detection that uses counterfactual construction at the feature level to train neural rule evaluators without requiring real anomalous images, improving rule-level anomaly detection performance on benchmarks like CLEVRER, OpenImages, and VidOR.


ChaosBench-Logic v2: Evaluating LLM Logical Reasoning over Dynamical Systems at Scale

arXiv cs.LG · 2026-05-26

ChaosBench-Logic v2 is a large-scale benchmark of 40,886 questions over 165 dynamical systems that evaluates LLMs' logical reasoning abilities, revealing near-random performance on regime transition reasoning and systematic failure modes even in frontier models.


High Quality Embeddings for Horn Logic Reasoning

arXiv cs.AI · 2026-05-22

This paper introduces novel methods for generating high-quality embeddings for Horn logic reasoning using triplet loss, including techniques for balanced training example generation and hard example emphasis, which improve the efficiency of downstream logical reasoning.
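For readers unfamiliar with triplet loss, the following is a minimal sketch of the standard formulation only, not the paper's method. The function names, the Euclidean distance, and the margin value are assumptions chosen for illustration; the summary does not say which distance or margin the authors use.

```python
import math

def euclidean(u, v):
    # Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Standard triplet loss: push the positive closer to the anchor than
    # the negative by at least `margin` (an assumed hyperparameter).
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# A well-separated triplet incurs no loss; a violated one is penalized.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 5.0])
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

A "hard" example in this sense is a triplet with non-zero loss, where the negative sits closer to the anchor than the margin allows.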


LLMEval-Logic: A Solver-Verified Chinese Benchmark for Logical Reasoning of LLMs with Adversarial Hardening

arXiv cs.CL · 2026-05-20

LLMEval-Logic is a new Chinese benchmark for evaluating logical reasoning in LLMs, featuring solver-verified answers and adversarial hardening. The benchmark reveals significant gaps in current models, with the best reaching only 37.5% accuracy on hard items.


Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key

Hugging Face Daily Papers · 2026-05-07

This paper introduces ScaleLogic, a framework demonstrating that RL training compute scales as a power law with reasoning depth in LLMs. It highlights that logical expressiveness is key to improving downstream transfer and training efficiency.
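As a rough illustration of what a power-law relationship means here, compute C would grow with reasoning depth d as C = a * d^b, which is a straight line in log-log space. The sketch below fits a and b by least squares on logarithms. The function name `fit_power_law` is hypothetical, and the data points are synthetic, generated from a known exponent; they are not results from the paper.

```python
import math

def fit_power_law(depths, compute):
    # Least-squares fit of log(C) = log(a) + b * log(d).
    xs = [math.log(d) for d in depths]
    ys = [math.log(c) for c in compute]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Synthetic data from C = 3 * d^2 (illustrative values only).
depths = [1, 2, 4, 8]
compute = [3 * d ** 2 for d in depths]
a, b = fit_power_law(depths, compute)
```

On this noise-free synthetic input the fit recovers the generating coefficient and exponent up to floating-point error; real measurements would scatter around the fitted line.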
