Imperfectly Cooperative Human-AI Interactions: Comparing the Impacts of Human and AI Attributes in Simulated and User Studies
Summary
This research paper investigates how human personality traits and AI design characteristics jointly impact human-AI interactions in imperfectly cooperative scenarios using both simulated datasets (2,000 simulations) and human subjects experiments (290 participants). The study finds significant divergences between simulation and real-world interactions, with AI transparency emerging as a critical factor in actual human-AI encounters.
# Imperfectly Cooperative Human-AI Interactions: Comparing the Impacts of Human and AI Attributes in Simulated and User Studies

Source: https://arxiv.org/html/2604.15607

Myke C. Cohen1,2, Mingqian Zheng3*, Neel Bhandari3*, Hsien-Te Kao1, Xuhui Zhou3, Daniel Nguyen1, Laura Cassani1, Maarten Sap3, Svitlana Volkova1

1Aptima, Inc.; 2Arizona State University; 3Carnegie Mellon University

Correspondence: [email protected]

###### Abstract

AI design characteristics and human personality traits each impact the quality and outcomes of human–AI interactions. However, their relative and joint impacts are underexplored in imperfectly cooperative scenarios, where people and AI only have partially aligned goals and objectives. This study compares a purely simulated dataset comprising 2,000 simulations and a parallel human subjects experiment involving 290 human participants to investigate these effects across two scenario categories: (1) hiring negotiations between human job candidates and AI hiring agents; and (2) human–AI transactions wherein AI agents may conceal information to maximize internal goals. We examine user Extraversion and Agreeableness alongside AI design characteristics, including Adaptability, Expertise, and chain-of-thought Transparency. Our causal discovery analysis extends performance-focused evaluations by integrating scenario-based outcomes, communication analysis, and questionnaire measures. Results reveal divergences between purely simulated and human study datasets, and between scenario types. In simulation experiments, personality traits and AI attributes were comparatively influential. Yet, with actual human subjects, AI attributes, particularly transparency, were much more impactful. We discuss how these divergences vary across different interaction contexts, offering crucial insights for the future of human-centered AI agents.
## 1 Introduction

Human–AI interaction research has focused predominantly on use cases where people and AI work together to achieve common goals (Fragiadakis et al. 2024; Cila 2022; Shao et al. 2024). Such works have produced a wealth of knowledge on the impacts of AI design principles, particularly transparency, as well as user individual differences on people's trust, performance, and experiences with AI (Endsley 2023; Chiou and Lee 2023; Raees et al. 2024; Hancock et al. 2023). However, real-world AI deployments increasingly involve imperfectly cooperative scenarios where agents operate with only partial alignment to user objectives. For instance, AI agents may act as hiring managers or in customer service roles, negotiating with users and sometimes withholding information (Aizenberg et al. 2025; Inavolu 2024).

In this work, we examine how user traits and AI attributes jointly shape interaction outcomes in partially-aligned human–AI interactions when goals conflict, using large-scale simulations and user studies. We focus on two scenario categories: (1) negotiations where a human job candidate and an AI hiring manager have overlapping yet competing goals over salary and starting date; and (2) partial-truthfulness situations where the AI agent's objectives conflict with complete truthfulness.

To study these effects, we simulate agents' personality traits and attributes in open-ended social interactions using Sotopia-S4 (Zhou et al. 2025). Recent advances in LLM-driven agents make this feasible: their ability to produce dialogue consistent with interaction contexts, including role behaviors, has been shown to approximate patterns of human variability, including personality and social reasoning (Argyle et al. 2023; Dillion et al. 2023; Park et al. 2022).
These advances allow us to generate diverse interaction corpora under controlled conditions that are too resource-intensive for human subject experiments, particularly controlling for human personality traits (Shadish et al. 2001).

Figure 1 illustrates our two-phase experimentation approach. First, we conduct simulation studies comprising scenarios where both AI agents and human users are fully simulated using Sotopia-S4, running 2,000 dyadic simulations across five scenarios. We measure an array of scenario-based and socio-emotional-cognitive states (Volkova et al. 2025), and systematically examine how they are impacted by simulated users' personality traits (particularly Agreeableness and Extraversion from McCrae and John (1992)'s Big Five model) and AI attributes, such as transparency and adaptability. To verify simulated findings, we then run user studies where actual human participants interact with AI agents across the same interventions.

Simulation study results show that personality traits influence scenario outcomes; in contrast, user study measures were more strongly influenced by AI agent traits, especially transparency, which became the dominant driver of positive user experience. These findings suggest that while LLM simulations may model personality archetypes relatively well, they may fail to capture the heightened sensitivity of human users to observable AI attributes. Consequently, our findings highlight the need for human-in-the-loop validation and grounding of results derived from simulated interactions.

Our key contributions are threefold:

- A two-pronged experimental paradigm combining LLM-simulated dialogs and a parallel human subjects study for investigating imperfectly cooperative human–AI interactions.
- Causal analyses showing that user Extraversion and Agreeableness are the dominant drivers of socio-emotional-cognitive and scenario outcomes in simulated datasets, whereas AI attribute interventions dominate with human users.
- Evidence of key parallels and divergences between simulations and user studies, with design implications for trustworthy agentic AI in imperfect cooperation settings.

## 2 Background and Related Works

Human–AI interaction research emphasizes the role of communication in coordinating shared actions (Liang et al. 2019) and balancing system performance with alignment to human mental models (Bansal et al. 2019). Recent studies have extended this line of inquiry to LLMs, examining human–LLM interactions as collaborations in complex task settings (Feng et al. 2024; Yehudai et al. 2025). Other works have explored human–LLM collaboration under various hierarchical structures (Huq et al. 2025; Pan et al. 2024b), including pre-defined task delegation (Shao et al. 2024; Bai et al. 2024), and multi-party cooperation in collaborative or embodied environments (Sharma et al. 2024; Zhang et al. 2024a; Pan et al. 2024a; Hong et al. 2024). Such works have yielded nascent frameworks for evaluating human-LLM collaborations across settings (e.g., Fragiadakis et al., 2024).

Much of the existing literature centers on purely cooperative settings where people and AI effectively function as teammates, interacting to achieve common goals (Cooke et al. 2024; Nguyen et al. 2025). Nonetheless, AI systems are increasingly conceptualized in contexts where competitive goals coincide with group-level objectives (Albert and Koubaa 2025; Sun et al. 2024). For example, Nicolas (2025) demonstrated that people are more likely to rely on AI recommendations over humans when tasks are framed as competitive tests. However, the presence of direct non-cooperative dynamics in human-AI interactions remains largely unexplored.

Recent works have begun exploring such dynamics through LLM-based simulation techniques, where both humans and AI agents are simulated via prompt-based specifications. These include explorations of LLM behavior in contexts involving human-AI bargaining (Huang and Hadfi 2024; Cohen et al. 2025), obscured AI-side goal conflicts (Su et al. 2025), and adversarial dynamics like human–AI debating (Zhang et al. 2024b).

Prior studies investigate the impacts of human individual differences or AI attributes that are known to affect interaction dynamics, such as personality traits and AI transparency, respectively (Hancock et al. 2011; Knop et al. 2022; Bach et al. 2024). Importantly, the effects of human and AI attributes can interact with each other (Cohen et al. 2023), but current works tend to investigate them separately. This gap remains partly because accounting for human individual differences as controlled experimental factors can be highly resource-intensive (Agnew et al. 2024). Thus, studying individual difference factors tends to involve quasi-experimental designs that treat them as uncontrolled covariates (Shadish et al. 2001), which can be limited by skewed sample demographics.

LLMs are becoming increasingly viable tools for overcoming quasi-experimental design limitations. There is growing evidence that LLMs can generate demographically-aligned responses (Argyle et al. 2023; Petrov et al. 2024; Caron and Srivastava 2022) and simulate believable individual behaviors in sandbox environments (Park et al. 2024; Duan et al. 2025; Frisch and Giulianelli 2024). Huang and Hadfi (2024) demonstrate that LLMs present novel opportunities for the controlled exploration of personality impacts in simulating human-AI negotiation, a scenario involving imperfect human-AI cooperation. Recently, Cohen et al. (2025) extended this approach to investigating joint human and AI trait impacts on negotiation dynamics. Our study builds on this by jointly investigating and explicitly measuring causal effects of personality traits and AI characteristics within LLM-based simulations and user studies across multiple imperfectly cooperative human-AI interaction scenarios.
A second gap we address is the need to validate purely simulated human-LLM interaction findings against actual human subjects data. Cui et al. (2023) recently demonstrated through a replication of 156 psychological experiments that, while LLMs achieve 73-81% replication rates for main effects, they produce effect sizes 2-3 times larger than human studies and perform significantly worse on socially sensitive topics. Li et al. (2025) also showed that continuous behavior simulation remains challenging across 15,846 behaviors. However, Xie et al. (2024) examined whether LLMs replicate both actions and underlying reasoning, finding high alignment in GPT-4's emulation of trust behaviors present in imperfectly cooperative social norms. More recent works suggest that prompt-only methods produce misaligned behaviors, but fine-tuning on real data improves accuracy (Lu et al. 2025).

## 3 Methodological Framework

Our study investigates the joint impacts of AI characteristics and personality traits in imperfectly cooperative human-AI interaction scenarios using LLM-based simulations. In this section, we introduce our experimental framework, comprising our experimental design, simulation setup, intervention design, measures, and causal evaluation techniques. Using this framework, we compare two parallel datasets: a simulation study, in which human-AI interactions take place between two fully synthetic LLM agents; and a user study, in which simulation episodes take place between actual human participants and LLM agents.

### 3.1 Experimental Design

We employ Sotopia-S4, a multi-agent social simulation platform (Zhou et al. 2025)[^1], in which agents assume assigned character roles and pursue specified objectives through multi-turn interactions.
In each study, we specify up to three parameters to simulate a multi-turn conversation, which serve as our main experimental treatments: (1) scenario setup; (2) AI agent characteristics; and, for our simulation study, (3) simulated user personality traits. Our simulation study uses a 5 (Scenario Setup: high-stakes job negotiation, low-stakes job negotiation, AI-LieDar benefits, AI-LieDar public image, and AI-LieDar emotion) × 5 (AI Agent Interventions) ×

[^1]: https://sotopia.world/
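
As a rough illustration of how the crossed treatment structure could be enumerated, the sketch below builds the 5 × 5 grid of scenario setups and AI agent interventions. The scenario names come from the text; the intervention labels (`ai_intervention_1` through `ai_intervention_5`) and the helper `design_cells` are hypothetical placeholders, since the interventions are not named at this point. The remaining factor of the design is not shown here and is left out, so the per-cell count is illustrative arithmetic under an even-allocation assumption, not a reported figure.

```python
from itertools import product

# Scenario setups, as listed in the text.
SCENARIOS = [
    "high-stakes job negotiation",
    "low-stakes job negotiation",
    "AI-LieDar benefits",
    "AI-LieDar public image",
    "AI-LieDar emotion",
]

# Hypothetical placeholder labels: the text says there are five AI agent
# interventions but does not name them at this point.
AI_INTERVENTIONS = [f"ai_intervention_{i}" for i in range(1, 6)]


def design_cells(scenarios, interventions):
    """Return every (scenario, intervention) pair of the crossed design."""
    return list(product(scenarios, interventions))


cells = design_cells(SCENARIOS, AI_INTERVENTIONS)

# 2,000 is the number of simulations reported in the paper.
TOTAL_SIMULATIONS = 2000

# Assumption (not stated in the text): simulations are spread evenly over
# the scenario-by-intervention cells. Any further factor would split each
# cell again.
per_cell = TOTAL_SIMULATIONS // len(cells)

if __name__ == "__main__":
    print(f"{len(cells)} scenario-by-intervention cells")
    print(f"{per_cell} simulations per cell under even allocation")
```

Enumerating cells up front makes it easy to check that every treatment combination is covered before any dialogue is generated; an additional factor could be crossed in by passing a third list to `product`.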