temporal-reasoning

#temporal-reasoning

Why do AI chats still feel so bad at handling time?

Reddit r/AI_Agents · 2026-07-09

This article discusses the persistent difficulty AI chatbots have with correctly handling time-related queries, exploring the underlying reasons and user frustrations.

OpenCoF: Learning to Reason Through Video Generation

Hugging Face Daily Papers · 2026-07-09

OpenCoF introduces a reasoning video dataset and a fine-tuned video generation model that improves temporal reasoning through diverse supervision and explicit reasoning tokens, showing significant gains on four video reasoning benchmarks.

From Foundation to Application: Improving VLA Models in Practice

Papers with Code Trending · 2026-07-07

This paper presents LingBot-VLA 2.0, which enhances VLA foundation models for robotics by improving generalization across tasks and embodiments, expanding the action space to whole-body degrees of freedom, and incorporating predictive dynamics modeling for better temporal reasoning.

A Study of Temporal Fusion Strategies for Named Entity Recognition in Historical Texts

arXiv cs.CL · 2026-06-29

This paper systematically studies how temporal metadata can be structurally embedded into named entity recognition (NER) models for historical texts. Experiments with absolute and relative temporal representations injected via early or late fusion mechanisms show that late fusion strategies yield more robust performance on French and German historical datasets.

Overview of HIPE-2026: Person-Place Relation Extraction from Multilingual Historical Texts

arXiv cs.CL · 2026-06-25

This paper presents the results of HIPE-2026, the third edition of the HIPE evaluation series, which focuses on temporally grounded person-place relation extraction from multilingual historical documents in French, German, and English. Seventeen participating teams were evaluated on predictive accuracy, computational efficiency, and cross-domain generalization.

The 4 reasons your AI assistant keeps forgetting you (and how we fixed it)

Reddit r/AI_Agents · 2026-06-10

The article identifies four key flaws in current AI agent memory systems (brittleness, lack of temporal reasoning, the forgetting dilemma, and an evaluation gap) and presents a memory architecture inspired by code agents. The authors report high benchmark scores and name context learning as the next challenge.

Can LLMs Be Constrained to the Past? Improving Knowledge Cutoff through Recall-Based Prompting

arXiv cs.CL · 2026-06-05

This paper proposes recall-based prompting strategies (Self-Recall and Question-Recall) to improve LLM knowledge cutoff adherence, outperforming existing methods on counterfactual questions and introducing a Multi-cutoff Historical Event Benchmark (MHEB) for robustness evaluation.

Can I Take Another Dose? Evaluating LLM Decision-Making Under Temporal Uncertainty in OTC Dosing QA

arXiv cs.CL · 2026-06-04

Researchers introduce DoseBench, a benchmark of 81 OTC dosing scenarios to evaluate LLM decision-making under temporal uncertainty for acetaminophen and ibuprofen use. Results show LLMs frequently struggle with rolling-window reasoning and can produce confident but medically unsupported responses.

When and How Long? The Readout-Mediator Angle in Temporal Reasoning

arXiv cs.LG · 2026-05-29

This paper introduces the readout-mediator angle to demonstrate that linear probes can decode information from language model activations that is orthogonal to the model's actual causal computation, undermining probe-based interpretability. The finding replicates across model scales and families, revealing a fundamental failure mode in using probes for mechanistic understanding or safety monitoring.

AsyncTool: Evaluating the Asynchronous Function Calling Capability under Multi-Task Scenarios

Hugging Face Daily Papers · 2026-05-27

This paper introduces AsyncTool, a benchmark for evaluating LLM-based agents' asynchronous function calling abilities in multi-task scenarios with delayed tool responses. It proposes efficiency-oriented metrics and identifies key failure modes of current tool-using agents.

LiFT: Does Instruction Fine-Tuning Improve In-Context Learning for Longitudinal Modelling by Large Language Models?

arXiv cs.CL · 2026-04-21

LiFT is a longitudinal instruction fine-tuning framework that unifies diverse temporal NLP tasks under a shared instruction schema with curriculum-based training. Evaluated across OLMo, LLaMA, and Qwen models, LiFT consistently outperforms base-model in-context learning, especially on out-of-distribution data and rare change events.

Zep: A Temporal Knowledge Graph Architecture for Agent Memory

Papers with Code Trending · 2025-01-20

This paper introduces Zep, a temporal knowledge graph architecture for agent memory that outperforms MemGPT on benchmarks such as DMR and LongMemEval. It highlights Zep's ability to handle dynamic knowledge integration and temporal reasoning for enterprise use cases.
