Humans expect rationality and cooperation from LLM opponents in strategic games

A recent study published on arXiv, titled "Humans expect rationality and cooperation from LLM opponents in strategic games," reveals a fascinating and potentially counterintuitive aspect of human psychology as artificial intelligence, particularly Large Language Models (LLMs), becomes increasingly integrated into social and economic interactions. The research, conducted through a controlled, monetarily-incentivized laboratory experiment, offers the first empirical examination of how humans behave when pitted against LLMs in strategic game environments. The findings suggest that human players exhibit significantly different strategic choices when facing AI adversaries compared to human opponents, a phenomenon driven by a perception of LLM rationality and, unexpectedly, a propensity towards cooperation.

The experiment focused on a multi-player "p-beauty contest" game, a classic economic game designed to probe players’ understanding of strategic reasoning and their expectations of others’ behavior. In this game, participants are asked to choose a number between 0 and 100. The winner is the player whose chosen number is closest to a specific fraction (often two-thirds) of the average of all numbers chosen by all players. The core challenge lies in anticipating what others will choose and then making a choice that optimizes one’s own outcome based on that anticipation. This game is particularly adept at revealing the depth of players’ "levels of reasoning" – how many steps ahead they can think about what others are thinking.
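As a concrete illustration, one round of the game can be scored as follows. This is a minimal sketch with p = 2/3; the player labels and choices are made up for the example and are not taken from the study:

```python
# Hypothetical sketch of scoring one round of a p-beauty contest.
# Player labels and choices are illustrative, not from the study.

def beauty_contest_winner(choices, p=2/3):
    """Return the player whose number is closest to p times the group average."""
    target = p * sum(choices.values()) / len(choices)
    return min(choices, key=lambda player: abs(choices[player] - target))

round_choices = {"A": 50, "B": 33, "C": 22, "D": 0}
# average = 26.25, target = 2/3 * 26.25 = 17.5, so "C" (22) is closest
print(beauty_contest_winner(round_choices))  # C
```

Note that the winning choice depends entirely on what the group does: 22 wins here only because the other players chose relatively high numbers.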

The Experiment and its Methodology

Researchers employed a within-subject design, meaning each participant played the game under both conditions: against other human players and against LLM opponents. This design is crucial for isolating individual behavioral shifts, allowing for direct comparisons at the participant level. The use of monetary incentives further grounds the experiment in realistic economic decision-making, ensuring that participants were motivated to play optimally based on their beliefs.

The abstract of the study states: "We present the results of the first controlled monetarily-incentivised laboratory experiment looking at differences in human behaviour in a multi-player p-beauty contest against other humans and LLMs. We use a within-subject design in order to compare behaviour at the individual level. We show that, in this environment, human subjects choose significantly lower numbers when playing against LLMs than humans, which is mainly driven by the increased prevalence of ‘zero’ Nash-equilibrium choices."

The "zero Nash-equilibrium choice" refers to the game's unique equilibrium outcome. In a p-beauty contest, if all players were perfectly rational and knew that all other players were perfectly rational, they would all choose zero: if everyone chooses zero, the average is zero, two-thirds of zero is zero, and no player can gain by deviating upward. In practice, however, human players rarely choose zero immediately, because they anticipate that not everyone will be perfectly rational and some will choose higher numbers. The finding that participants gravitated towards zero more frequently when facing LLMs suggests a higher degree of perceived rationality in the AI.
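The equilibrium logic can be checked directly: if everyone else plays zero, a lone deviation only moves the deviator further from the target than the players who stayed at zero. A minimal sketch, where the player count n and deviation d are illustrative values:

```python
# Sketch of why all-zero is the Nash equilibrium: with every other player
# at 0, a lone deviation to d > 0 leaves the deviator strictly further
# from the target than the players who stayed at 0.
# The values n=4 and d=30 below are illustrative.

def distances_after_deviation(n, d, p=2/3):
    """All n players choose 0 except one who deviates to d."""
    target = p * d / n            # the group average is d / n
    deviator_gap = abs(d - target)
    others_gap = abs(0 - target)
    return deviator_gap, others_gap

dev_gap, others_gap = distances_after_deviation(n=4, d=30)
print(dev_gap, others_gap)  # 25.0 5.0 -- deviating from 0 loses
```

Since no unilateral deviation from all-zero can win, all-zero is the equilibrium; choosing zero is rational only if one expects everyone else to reason the same way.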

Unpacking the "Zero" Choice: Perceived Rationality and Cooperation

The study delves deeper into the motivations behind these choices. Subjects who opted for the "zero Nash-equilibrium" strategy when playing against LLMs often articulated their reasoning by appealing to the perceived reasoning ability of the AI. This aligns with the intuitive expectation that advanced AI systems would be capable of complex, rational analysis. However, the researchers also noted an "unexpected" factor: the subjects’ perception of the LLM’s propensity towards cooperation.

This perception of cooperation is particularly intriguing. In a competitive game like the p-beauty contest, "cooperation" is not the first attribute that comes to mind. Yet in strategic interactions, a form of indirect cooperation can emerge: if players expect their opponents to be highly rational and predictable, they may adjust their own strategies to avoid extreme outcomes or to establish a stable baseline of play. The LLM's consistent, seemingly rational responses, even when they produce a less favourable outcome for the human in a given round, can read as predictable, if not friendly, behaviour. That predictability may then be interpreted as cooperation: the AI's adherence to logical principles creates a stable environment for play.

The study highlights that this shift towards lower numbers and the "zero" choice is "mainly driven by subjects with high strategic reasoning ability." This suggests that individuals who are already adept at strategic thinking are more likely to analyze the LLM’s nature and adjust their play accordingly. They are more attuned to the implications of playing against a non-human entity that operates on different principles, potentially more logical and less emotionally driven than human players.
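Depth of strategic reasoning in this game is commonly formalised with the level-k model: a level-0 player picks naively, and each higher level best-responds by choosing p times the level below. The sketch below assumes the conventional level-0 anchor of 50; neither the anchor nor the levels are figures from the study:

```python
# Level-k reasoning sketch: level-0 anchors at 50 (a common modelling
# assumption, not a figure from the study); each higher level plays
# p = 2/3 of the level below.

def level_k_choice(k, p=2/3, anchor=50.0):
    """Choice of a level-k reasoner in the p-beauty contest."""
    choice = anchor
    for _ in range(k):
        choice *= p   # best-respond to the level below
    return choice

for k in range(4):
    print(k, round(level_k_choice(k), 2))
# 0 50.0 | 1 33.33 | 2 22.22 | 3 14.81 -- deeper reasoning pushes choices toward 0
```

Under this model, choosing zero corresponds to iterating the reasoning all the way to its limit, which is why the study's "zero" players reveal both high reasoning depth and a belief that the opponent will reason equally deeply.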

Heterogeneity in Beliefs and Behavior

A significant takeaway from the research is the uncovering of "heterogeneities in both subjects’ behaviour and beliefs about LLM’s play when playing against them." This means that not all human participants reacted the same way, and their beliefs about how the LLM would behave varied. This is a critical point for understanding human-AI interaction. While a general trend towards lower numbers was observed, individual differences in interpretation and strategic forecasting persisted.

This heterogeneity implies that as LLMs become more prevalent in diverse strategic settings – from financial markets to negotiation platforms – their impact will not be uniform. The way humans perceive and interact with these AI systems will depend on their individual cognitive abilities, their prior experiences with AI, and their inherent trust or skepticism towards artificial intelligence.

Implications for Mechanism Design and Future Interactions

The findings carry substantial implications for "mechanism design in mixed human-LLM systems." As we increasingly build systems where humans and AI collaborate or compete, understanding these psychological biases and expectations is paramount.

1. Trust and Predictability: The study suggests that humans may attribute a higher degree of rationality and even a form of cooperation to LLMs. This can lead to increased trust in AI, potentially to an unwarranted degree. If humans consistently underestimate an LLM’s capacity for purely self-interested, non-cooperative strategies (even if those strategies are purely logical), they could be at a disadvantage in competitive scenarios.

2. Designing for Human-AI Collaboration: In collaborative environments, understanding these human perceptions can be leveraged. If LLMs can be programmed to exhibit behaviors that humans interpret as rational and cooperative, it could foster smoother human-AI teamwork. However, this also raises ethical questions about manipulating human trust.

3. The Future of Strategic Games and AI Integration: The p-beauty contest is a simplified model. The real world involves far more complex strategic interactions. If these basic findings hold true in more nuanced settings, it suggests that AI could fundamentally alter how humans approach negotiation, competition, and even social interactions. Imagine an LLM negotiating a business deal, participating in a political debate, or even playing a complex multiplayer online game. The human tendency to expect rationality and perhaps even a form of "fair play" from these AIs could lead to unexpected outcomes.

4. Educational and Training Needs: The research implicitly points to a need for better education and training for humans interacting with AI. Understanding the actual capabilities and limitations of LLMs, rather than projecting human-like traits onto them, is crucial for informed decision-making. This includes understanding that an LLM’s "rationality" is a product of its programming and training data, not a conscious ethical framework.

Background Context: The Rise of LLMs and Strategic Interaction

The integration of LLMs into everyday life has accelerated dramatically in recent years. From chatbots and virtual assistants to sophisticated tools for content creation, coding, and data analysis, LLMs are no longer confined to research labs. Their ability to process natural language and generate human-like text has blurred the lines between human and machine communication.

This rapid advancement has naturally led to questions about how humans will adapt to interacting with these powerful AI systems in various domains. Traditional economic and game theory models often assume rational actors with well-defined objectives. However, when one of the actors is an LLM, the assumptions about rationality and the very definition of "self-interest" can become complex.

The p-beauty contest has a history of being used to study cognitive biases and strategic sophistication. Early experiments with human subjects revealed a range of behaviors, from naive choices to sophisticated iterated reasoning. The introduction of AI as an opponent adds a new dimension, forcing researchers to consider how the nature of the opponent influences human decision-making.

Supporting Data and Future Research Directions

While the abstract provides a concise summary, a full research paper would likely include detailed statistical analysis of the choices made, the correlation between individual reasoning abilities and choices, and qualitative data from participant interviews. Supporting data would include:

  • Distribution of Choices: Graphs showing the frequency of numbers chosen by humans against humans versus humans against LLMs.
  • Average Choice: Statistical comparison of the mean number chosen in each condition.
  • Prevalence of Zero: The percentage of participants choosing zero in each condition.
  • Correlation Analysis: How strategic reasoning ability (measured through pre-tests or other means) correlates with choice behavior against LLMs.
  • Qualitative Insights: Thematic analysis of participant explanations for their choices, particularly when playing against LLMs.

Future research could expand on these findings by:

  • Testing Different LLM Architectures: Do different LLMs elicit different human responses?
  • Exploring Other Strategic Games: How do humans behave against LLMs in games like Prisoner’s Dilemma, Ultimatum Game, or real-time strategy games?
  • Investigating the Role of Transparency: If LLMs’ decision-making processes are made more transparent, how does this affect human expectations and behavior?
  • Longitudinal Studies: How do human expectations and behaviors evolve over extended periods of interaction with LLMs?

Conclusion

The study "Humans expect rationality and cooperation from LLM opponents in strategic games" offers a compelling glimpse into the evolving landscape of human-AI interaction. It highlights that as LLMs become more embedded in our strategic environments, humans are not simply adapting their strategies but are actively reinterpreting the nature of their AI counterparts. The tendency to attribute rationality and even a form of cooperation to LLMs, particularly among strategically adept individuals, is a significant finding with broad implications. This research serves as a crucial foundation for understanding how to design more effective, ethical, and predictable human-LLM systems, ensuring that our interactions with artificial intelligence are informed by a realistic understanding of both our own psychology and the capabilities of the machines we create. The unexpected perception of cooperation from a non-sentient entity underscores the complex ways humans project their own social and cognitive frameworks onto novel technologies.
