Tag
LoRi proposes a low-rank distillation framework for implicit chain-of-thought reasoning that aligns teacher and student trajectories in a shared low-rank subspace, improving performance on mathematical reasoning benchmarks.
MIRAGE is a framework for mobile GUI agents that replaces verbose chain-of-thought reasoning with compact continuous latent representations, incorporating a generative world model perspective to predict future screen states before acting. On AndroidWorld and AndroidControl benchmarks, it achieves competitive or superior performance while reducing generated tokens by over 75%.
MedicalBench is a new benchmark for evaluating large language models on medical concept extraction from electronic health records, focusing on implicit reasoning and evidence grounding. It includes 823 expert-annotated examples and shows that current models perform modestly, highlighting the difficulty of extracting implicitly stated medical concepts.
This research examines how deep Transformers with bidirectional masking achieve implicit deductive reasoning comparable to explicit chain-of-thought methods. The study demonstrates that algorithmically aligned models can scale reasoning capabilities across diverse graph topologies and problem widths.