How memory tools can make AI models worse

TechCrunch AI News

Summary

New research from Writer shows that memory tools designed to personalize AI models can actually degrade accuracy by introducing sycophancy and bias, as the model becomes more likely to agree with user errors or irrelevant preferences.

New research suggests that AI memory systems can degrade model performance and encourage sycophantic tendencies.
Original Article
View Cached Full Text

Cached at: 06/10/26, 05:45 PM

# How memory tools can make AI models worse | TechCrunch Source: [https://techcrunch.com/2026/06/10/how-memory-tools-can-make-ai-models-worse/](https://techcrunch.com/2026/06/10/how-memory-tools-can-make-ai-models-worse/) One of the biggest selling points for modern AI systems is their ability to adapt to users\. Every time an AI assistant takes on a task for you, it’s also adapting to your style and preferences, which are incorporated as context for future tasks\. With more context and a better understanding of the user, the model can get better every time you use it — or at least that’s the theory\. New research suggests that models’ adaptive abilities might be a mixed blessing\. On Wednesday,[researchers at the AI company Writer](https://writer.com/engineering/personalized-context-degrades-ai-accuracy/)published[two](https://openreview.net/pdf?id=0Xt1qZ5xdW)[papers](https://arxiv.org/abs/2604.24668)showing how popular memory systems can make models worse, pulling them toward misconceptions or misunderstandings introduced by the user\. As user input fills up more of the model’s context window, the model grows more sycophantic — and less committed to accuracy\. “We wanted to be able to characterize how often a model is going to be usefully paying attention to user preferences versus giving a potentially wrong answer,” said Dan Bikel, Writer’s head of AI, who worked on the papers\. As Bikel told TechCrunch, “with every additional storing of user preferences and retrieving of them, you’re running an increasing risk\.” In one variation, researchers tested AI models by recording that a user’s favorite book was Station Eleven, then asking the model to name a best\-selling dystopian book\. Models became far more likely to name Station Eleven in their response, even though the question didn’t relate to the user’s favorite book\. The tendency increased when using memory compression tools like[Mem0](https://mem0.ai/)and[Zep](https://www.getzep.com/)\. As the paper puts it, “all memory systems fundamentally struggle to distinguish relevant context from irrelevant anchors, severely undermining diversity and creativity and introducing unintended avenues of bias that can limit system utility,” the paper reads\. The second paper shows how the same dynamic can actively degrade performance, presenting a user with misconceptions about finance and then challenging the model to analyze a company’s performance\. The more context the model had, the worse it performed\. “With no memory or personalization present the AI model correctly assesses that the company is a capital intensive business that suffers from high customer churn,” the post reads\. “But with those features turned on, it will happily change its answer to agree with the user’s mistake or supply them with an incorrect answer based on its evaluation of their earlier preferences\.” Notably, the research didn’t look at Anthropic’s recent Opus 4\.8 model, which was[trained to actively push back against input errors](https://techcrunch.com/2026/05/28/anthropic-releases-opus-4-8-with-new-dynamic-workflow-tool/)like the ones presented\. The patterns discovered by researchers held true across different models\. It’s a demonstration of how delicately balanced AI context can be, and how useful tools can have unintended consequences if they upset that balance\. *When you purchase through links in our articles,[we may earn a small commission](https://techcrunch.com/techcrunch-affiliate-monetization-standards/)\. This doesn’t affect our editorial independence\.* Russell Brandom has been covering the tech industry since 2012, with a focus on platform policy and emerging technologies\. He previously worked at The Verge and Rest of World, and has written for Wired, The Awl and MIT’s Technology Review\. He can be reached at russell\.brandom@techcrunch\.com or on Signal at 412\-401\-5489\. [View Bio](https://techcrunch.com/author/russell-brandom/)

Similar Articles

AI memory products are optimizing for the wrong thing

Reddit r/AI_Agents

The article argues that current AI memory products prioritize personalization over truth and accountability, leading to systems that accumulate contradictions and cannot be reliably corrected; it questions whether personalization is sufficient for production use.

AI memory is becoming the new technical debt.

Reddit r/AI_Agents

The article warns that AI memory systems, while impressive in demos, often lead to stale facts, conflicting preferences, and broken summaries, creating future debugging nightmares and technical debt.