Tag
This paper investigates the developmental conditions under which a minimal predictive neural system (a 192-dimensional GRU) can distinguish self-caused changes from world-caused changes, identifying four necessary conditions for agency and introducing a metric called agency gain.
MemAudit is a post-hoc auditing framework for memory-augmented LLM agents that identifies poisoned memories by combining counterfactual influence scores and structural anomaly detection, reducing attack success rates from over 70% to 0% in realistic scenarios.