Creating and Evaluating Personas Using Generative AI: A Scoping Review of 81 Articles
Summary
This scoping review analyzes 81 articles (2022-2025) examining the use of generative AI for creating and evaluating user personas. It identifies strengths in reproducibility but also critical issues, including a lack of evaluation in 45% of studies, over-reliance on GPT models (86%), and risks of circularity, in which the same model both generates and evaluates personas.
# Creating and Evaluating Personas Using Generative AI: A Scoping Review of 81 Articles

Source: https://arxiv.org/html/2504.04927 (2026)

###### Abstract.

As generative AI (GenAI) is increasingly applied in persona development to represent real users, understanding the implications and limitations of this technology is essential for establishing robust practices. This scoping review analyzes how 81 articles (2022-2025) use GenAI techniques for the creation, evaluation, and application of personas. The articles exhibited a good level of reproducibility, with 61% sharing resources (personas, code, or datasets). Furthermore, conversational persona interfaces are increasingly provided alongside traditional profiles. However, nearly half (45%) of the articles lack evaluation, and the majority (86%) use only GPT models. In some articles, GenAI use creates a risk of circularity, in which the same GenAI model both generates and evaluates outputs. Our findings also suggest that GenAI seems to reduce the role of human developers in the persona-creation process. To mitigate the associated risks, we propose actionable guidelines for the responsible integration of GenAI into persona development.

Keywords: Generative AI, LLM, Personas, Persona Development

Published in: Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI ’26), April 13–17, 2026, Barcelona, Spain. DOI: 10.1145/3772318.3790608. ISBN: 979-8-4007-2278-3/2026/04. License: CC.

CCS concepts: Human-centered computing → HCI theory, concepts and models; Human-centered computing → User models.

## 1. Introduction

User personas (‘personas’ for short) are fictitious characters representing archetypal real users of a system, product, or service presented in a humanized form (Cooper, 1999).
Since their introduction into human-computer interaction (HCI), personas have been adopted to support user-centered design (UCD) in domains such as software development (Blomquist and Arvola, 2002; Aoyama, 2005; Adlin et al., 2001), design (Lee et al., 2010; Duda, 2018; Chen et al., 2011), and healthcare (Hendriks et al., 2013; Högberg et al., 2008; Gonzalez de Heredia et al., 2018). Personas help system designers focus on user needs throughout the design process by providing empathetic and memorable representations of core user segments (Pruitt and Adlin, 2010). As such, personas are a critical component of HCI research, user experience (UX) studies, and UCD of systems and technology.

Persona development methods were originally based on qualitative data collection with manual analysis, which can be time-consuming, resource-intensive, and prone to becoming stale (Salminen et al., 2021). However, generative artificial intelligence (GenAI), particularly large language models (LLMs), creates opportunities for automating persona development (Huang et al., 2024; Shin et al., 2024; Jung et al., 2025). Automatic persona development is the ability to generate personas with minimal or no human involvement, and it can overcome the challenges of manual persona development; this is why GenAI is being researched to support persona development. These GenAI technologies can be applied to various stages of persona development, and the personas created by such a process are called GenAI personas, which are the focus of this study (see Figure 1 for examples).

Although various articles have applied GenAI in persona development (Salminen et al., 2024a; Shin et al., 2024; Schuller et al., 2024; Jung et al., 2025), there is limited systematic synthesis of how GenAI technologies are applied in persona development practices. This lack of knowledge raises serious questions for the HCI community regarding the use of GenAI for persona creation.

**Figure 1.** Examples of GenAI personas.
The persona in (a) is generated by Survey2Persona (Jung et al., 2025); the persona in (b) is generated by Persona-L (Sun et al., 2025). These systems illustrate that LLMs are increasingly used not only to generate persona profiles but also to afford decision-makers dialogue with the personas. The figures are annotated to explain the different components of GenAI personas; the reader is encouraged to zoom in for better readability.

Moreover, the application of GenAI in persona development suffers from a lack of standardization and best practices; there is as yet no clear consensus on optimal approaches. The application of GenAI in persona development is currently uncharted territory, making it difficult for persona developers to make informed decisions about adopting GenAI technologies. The evolution of GenAI technologies has also increased the challenges faced by persona developers, as discussed in detail by Amin et al. (Amin et al., 2025). Similar to the development of data-driven personas (see a review by Salminen et al., 2021), we can observe varying methodological choices and technical implementations. These choices involve a variety of factors, such as the selection of the LLM family (e.g., GPT, Claude, or Llama), the LLM version (e.g., GPT-3.5 or GPT-4), hyperparameters (i.e., parameters that affect the behavior of the model; Jansen et al., 2021c), and prompts (i.e., instructions given to the LLM to perform certain tasks).
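To make the methodological choices above concrete, the following is a minimal sketch of how a persona-generation request might bundle model family/version, hyperparameters, and a prompt template. The function name, field names, and prompt wording are illustrative assumptions, not taken from any article in the review.

```python
# Illustrative sketch only: build_persona_request and its fields are hypothetical,
# mirroring the shape of common chat-style LLM APIs.

def build_persona_request(segment_summary: str,
                          model: str = "gpt-4",
                          temperature: float = 0.7,
                          max_tokens: int = 512) -> dict:
    """Assemble a chat-style request for drafting one persona profile."""
    system_prompt = (
        "You are assisting with user-centered design. Write a concise persona "
        "profile (name, demographics, goals, pain points) grounded ONLY in the "
        "data provided."
    )
    user_prompt = f"User segment data:\n{segment_summary}\n\nDraft one persona."
    return {
        "model": model,              # LLM family and version choice
        "temperature": temperature,  # hyperparameter: output variability
        "max_tokens": max_tokens,    # hyperparameter: response length cap
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }

request = build_persona_request(
    "Cluster 3: ages 25-34, mobile-first, price-sensitive"
)
```

Even in this toy form, the sketch shows why reproducibility depends on reporting every one of these choices: two studies using the "same" model can diverge simply by differing in temperature or prompt wording.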
Although researchers have suggested guidelines for persona development in general (Häyhänen et al., 2025; Jansen et al., 2021a; Cooper, 1999; Nielsen et al., 2015; Pruitt and Grudin, 2003), how researchers specifically use GenAI in persona development remains unclear.

Another challenging domain is the evaluation of GenAI personas. Traditional persona validation methods generally focused on user perceptions of personas (Jansen et al., 2021b; Salminen et al., 2020b), but GenAI personas require additional consideration of factors such as output consistency, prevention of hallucinations (the tendency of LLMs to fabricate information), and prompt reliability (Sattele and Carlos Ortiz, 2024; Lazik et al., 2025). Moreover, integrating GenAI raises questions about evaluating the quality of GenAI-crafted versus human-crafted personas (Amin et al., 2025). Evaluation approaches vary widely, from automated metrics to user studies, without a clear understanding of how researchers validate and evaluate GenAI personas.

The ethical aspects of using GenAI technologies may extend beyond traditional concerns about persona development. Though prior work has addressed issues of stereotyping and representation in personas (Turner and Turner, 2011; Goodman-Deane et al., 2018, 2021), GenAI appears to amplify certain challenges, such as algorithmic bias, data privacy, and transparency (Amin et al., 2025; Gupta et al., 2024; Prpa et al., 2024; Hämäläinen et al., 2023). These challenges may become particularly acute as personas represent diverse user groups and influence design decisions that affect these populations, prompting the need to understand how researchers factor in ethical concerns when using GenAI in persona development.

Against the backdrop of these research challenges, we put forth three research questions (RQs):

- RQ1: How are GenAI technologies used in persona development?
- RQ2: How are GenAI personas evaluated?
- RQ3: What ethical considerations are associated with GenAI personas?

Building on persona research in HCI (Pruitt and Adlin, 2010; Salminen et al., 2021) and following guidelines for literature reviews in the HCI field (Kitchenham and Charters, 2004), this scoping review analyzes 81 articles published between 2022 and 2025 on the application of GenAI for persona development. A scoping review generally maps the current scope, key concepts, and knowledge gaps (Arksey and O’Malley, 2005), and is suitable for studying GenAI personas given this domain’s rapidly evolving nature. To this end, we identify trends, gaps, and opportunities, and we provide recommendations to practitioners concerning the integration of GenAI into persona development workflows. Considering the proliferation of GenAI tools and their application in user research and design processes, this work addresses a timely and relevant topic for the HCI community.

## 2. Previous Reviews and Research Gap

In UX/HCI practice, personas serve as shared reference points that help designers empathize with users and make decisions aligned with user needs (Pruitt and Adlin, 2010). Persona development traditionally involves collecting user data through research methods (interviews, surveys, analytics), identifying distinct user segments through analysis, and crafting humanized narrative profiles that represent these segments (Cooper, 1999; Salminen et al., 2021). At first glance, GenAI appears to strengthen automatic persona generation by exhibiting capability in most of the above-mentioned tasks.

Automatic persona development has existed in the HCI domain for more than a decade (Jung et al., 2018, 2017; Mijač et al., 2018). However, the limitations of automated personas have been apparent, especially when analyzing text records of users and writing up persona narratives (Salminen et al., 2023), which is a key activity in many persona creation approaches (Nielsen, 2019).
Algorithms and models prior to LLMs struggled with interpretive tasks, such as writing persona narratives or labeling user segments (Jansen et al., 2021a), which limited the potential for automatic persona generation, as extensive manual interventions were required in conjunction with data science methodologies (Jansen et al., 2020; Mijač et al., 2018). Due to the limited interpretative capabilities of previous AI models, researchers applied rule-based systems, such as dynamic templates with predefined fields for dynamically inserted information (Nielsen et al., 2015). In contrast, LLMs’ interpretative capabilities offer possible solutions to the major challenges of contextually interpreting and summarizing textual data about people (Jung et al., 2025). However, questions remain, and best practices for LLMs in persona development are still unknown, including the ideal techniques and processes to follow (Salminen et al., 2024a).

There are also practical hurdles; for example, there is no clear concept of human-AI collaboration in persona development, even though researchers seem to agree on the need to keep a “human in the loop” instead of forfeiting all decision-making to GenAI (Shin et al., 2024). Interestingly, keeping a human in the loop is also considered valuable in persona theory: involving the stakeholders who are intended to use the personas in the persona creation process supports the development of favorable attitudes toward the persona technique (Neate et al., 2019; Fuglerud et al., 2020). Additionally, it is not evident how well HCI researchers are aware of the risks and challenges associated with implementing GenAI for developing personas, including ethical concerns such as reinforcing existing systemic biases or marginalizing underrepresented groups when real data on cultural contexts is lacking (Amin et al., 2025).
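One concrete risk of this kind, noted in the abstract, is circularity: the same GenAI model both generating and evaluating personas. A minimal, hypothetical sketch of a guard against it is to require that the evaluator come from a different model family than the generator; the model names and the family-extraction rule below are illustrative assumptions, not a prescribed method.

```python
# Hedged sketch: guard against circular evaluation by assigning an evaluator
# model from a different family than the generator. Names are illustrative.

def model_family(model_name: str) -> str:
    """Map a model identifier to its family, e.g. 'gpt-4' -> 'gpt'."""
    return model_name.lower().split("-")[0]

def assign_evaluator(generator: str, candidates: list) -> str:
    """Pick the first candidate whose family differs from the generator's."""
    for candidate in candidates:
        if model_family(candidate) != model_family(generator):
            return candidate
    raise ValueError(
        "All candidate evaluators share the generator's family; "
        "add a model from another family or fall back to human evaluation."
    )

evaluator = assign_evaluator(
    "gpt-4", ["gpt-3.5-turbo", "claude-3-opus", "llama-3-70b"]
)
# Skips the same-family 'gpt-3.5-turbo' and selects 'claude-3-opus'.
```

Such a check does not remove model bias, but it prevents the narrowest form of circularity; human evaluation remains the stronger complement, consistent with the human-in-the-loop stance above.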
These lingering concerns call for a literature review that investigates existing practices, methodologies, and emerging patterns in GenAI persona development. To date, reviews of persona research have addressed quantitative personas (Salminen et al., 2020a), data-driven personas (Salminen et al., 2021), personas for social impact (Guan et al., 2023), and persona design applications (Salminen et al., 2022), among other areas. Although these reviews provide valuable insights into persona development methods, they do not specifically address the integration of GenAI technologies into persona development. We will discuss in Section 5 how our findings are positioned in the continuum of persona research, specifically in conjunction with previous reviews. Given the growing adoption of GenAI in persona development (Shin et al., 2024; Hämäläinen et al., 2023; Salminen et al., 2024a; Choi et al.,