We gave 45 psychological questionnaires to 50 LLMs. What we found was not “personality.”
Summary
Researchers administered 45 psychometric questionnaires to 50 LLMs and identified a 'Pinocchio Dimension': a measure of how readily a model endorses having inner experiences, rather than a reflection of true personality traits.
Similar Articles
Human Psychometric Questionnaires Mischaracterize LLM Behavior
This paper finds that human psychometric questionnaires fail to reliably predict LLM behavior in real-world interactions, and proposes generation-based profiling as a more accurate alternative.
Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost?
This paper investigates whether fine-tuning LLMs on long-form essays with associated Big Five personality profiles stabilizes questionnaire responses and can induce target profiles, finding that while variance reduces, accuracy on the full five-dimensional profile remains near chance.
I made a quiz that tells you which LLM you align with most, based on personality and values research across 15 models [R]
A quiz that matches users to the LLM that aligns most with their personality and values, based on research across 15 models.
Rethinking Psychometric Evaluation of LLMs: When and Why Self-Reports Predict Behavior
This paper examines when and why self-reported psychometric measures predict the actual behavior of large language models, finding that fine-grained, behavior-specific instruments (Theory of Planned Behavior) achieve human-level coherence within a shared conversation, while broad traits like the Big Five do not.
Evaluating LLMs as Human Surrogates in Controlled Experiments
This paper evaluates whether off-the-shelf LLMs can reliably simulate human responses in controlled behavioral experiments by comparing LLM-generated data with human survey responses on accuracy perception. The findings show that while LLMs capture directional effects and aggregate belief-updating patterns, they do not consistently match human-scale effect magnitudes, clarifying when synthetic LLM data can serve as behavioral proxies.