metamorphic-testing

LGMT: Logic-Grounded Metamorphic Testing for Evaluating the Reasoning Reliability of LLMs

arXiv cs.AI · 2026-05-26

This paper introduces LGMT, a framework that uses first-order logic to generate semantically invariant test cases for evaluating LLM reasoning reliability. Experiments on six LLMs show that LGMT exposes hidden defects missed by static benchmarks, suggesting evaluation should focus on robustness under logical invariance.
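To make "semantically invariant test cases" concrete, here is a minimal sketch of one metamorphic relation. It is not the paper's implementation: it assumes propositional logic rather than first-order logic, a hypothetical tuple encoding of formulas, and a `model` callable standing in for an LLM. The only idea it illustrates is that a logically equivalent rewrite (here, the contrapositive) should not change the answer.

```python
from itertools import product

# Hypothetical formula encoding (an assumption for this sketch):
# ("var", name), ("not", f), ("and", f, g), ("or", f, g), ("implies", f, g)

def evaluate(formula, assignment):
    """Evaluate a formula under a dict mapping variable names to booleans."""
    op = formula[0]
    if op == "var":
        return assignment[formula[1]]
    if op == "not":
        return not evaluate(formula[1], assignment)
    if op == "and":
        return evaluate(formula[1], assignment) and evaluate(formula[2], assignment)
    if op == "or":
        return evaluate(formula[1], assignment) or evaluate(formula[2], assignment)
    if op == "implies":
        return (not evaluate(formula[1], assignment)) or evaluate(formula[2], assignment)
    raise ValueError(f"unknown operator: {op}")

def variables(formula):
    """Collect the variable names that occur in a formula."""
    if formula[0] == "var":
        return {formula[1]}
    return set().union(*(variables(sub) for sub in formula[1:]))

def equivalent(f, g):
    """Check logical equivalence by exhaustive truth-table comparison."""
    names = sorted(variables(f) | variables(g))
    for values in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, values))
        if evaluate(f, assignment) != evaluate(g, assignment):
            return False
    return True

def contrapositive(formula):
    """Rewrite (p -> q) as (not q -> not p); leave other formulas unchanged."""
    if formula[0] == "implies":
        return ("implies", ("not", formula[2]), ("not", formula[1]))
    return formula

def metamorphic_check(model, formula, assignment):
    """Return True if the model gives the same answer on both variants.

    `model` is any callable (formula, assignment) -> bool; in a real
    setting it would wrap a prompt to an LLM and parse its answer.
    """
    variant = contrapositive(formula)
    # The transformation must preserve meaning, otherwise the test is invalid.
    assert equivalent(formula, variant)
    return model(formula, assignment) == model(variant, assignment)

def brittle_model(formula, assignment):
    """Toy stand-in for a model that is thrown off by a negated premise."""
    if formula[0] == "implies" and formula[1][0] == "not":
        return False
    return evaluate(formula, assignment)

rule = ("implies", ("var", "p"), ("var", "q"))
facts = {"p": True, "q": True}
sound_ok = metamorphic_check(evaluate, rule, facts)
brittle_ok = metamorphic_check(brittle_model, rule, facts)
```

In this toy setup the exact evaluator is consistent across both variants while the brittle stand-in is not, even though it answers the original formula correctly. A single static test case would not reveal that kind of defect.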
