Psychology of game theory
By restricting the choices that people can make, it becomes possible to investigate the process of decision making in a systematic way. The ways in which a person chooses a path that leads them to a predefined outcome can then be explored in detail. But what do these choices look like?
Within this scenario, two prisoners are each confronted by detectives with a choice: betray their fellow prisoner, or stay silent. If both betray each other, both spend two years in prison. From the point of view of pure self-interest, each should choose betrayal: it offers the possibility of being set free if the other stays silent, and caps the sentence at two years even if the other also betrays, whereas a prisoner who stays silent risks the longest sentence if betrayed.
However, encouragingly for humanity, people usually behave cooperatively [2] and stay silent rather than betray each other. To make matters more interesting, and more complex, the scenario can be repeated multiple times to see how each participant behaves when past behavior can be taken into account. Understanding how, and why, this process takes place is a central topic for human behavior researchers.
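The self-interest reasoning above can be checked mechanically. The sketch below uses illustrative prison terms expressed as negative utilities (the exact numbers are assumptions; only their ordering matters) and confirms that betrayal is a best reply whatever the other prisoner does:

```python
# Illustrative Prisoner's Dilemma payoffs as (row player, column player)
# utilities. Years in prison appear as negative numbers; the specific
# values are assumptions for this sketch.
PAYOFFS = {
    ("silent", "silent"): (-1, -1),   # both serve a short sentence
    ("silent", "betray"): (-5,  0),   # the betrayer goes free
    ("betray", "silent"): ( 0, -5),
    ("betray", "betray"): (-2, -2),   # both serve two years
}

def best_response(opponent_action):
    """Row player's utility-maximizing reply to a fixed opponent action."""
    return max(["silent", "betray"],
               key=lambda a: PAYOFFS[(a, opponent_action)][0])

# Betrayal is a best reply in both cases, so it dominates:
print(best_response("silent"))  # betray
print(best_response("betray"))  # betray
```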
The second scenario is the game of Chicken, in which two drivers speed toward each other: if neither swerves, they will crash and, well, die. Needless to say, the games can become much more complex than the two presented above, with each modification offering a different insight into how humans arrive at decisions, and what those decisions look like.
A variety of research has shown how the decision-making process within games can be delineated and understood in greater detail thanks to devices like eye trackers, GSR sensors, facial expression analysis, and EEG. One example comes from the Ultimatum game [4]. In the Ultimatum game, one player proposes how to split an amount of money with another player in a proportion of their choosing; the other player can then accept, and the money is divided accordingly, or reject, in which case neither receives any money.
The rational choice for the receiver would be to always accept the money, as that is always better than no money, but unsurprisingly, humans operate in different ways.
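The "rational" benchmark for the Ultimatum game can be sketched directly. This minimal model assumes both players care only about money and that offers come in whole units (an assumption for illustration): the responder accepts any positive offer, so backward induction leads the proposer to offer the minimum.

```python
def responder_accepts(offer):
    """A purely money-maximizing responder accepts any positive offer,
    since some money always beats no money."""
    return offer > 0

def proposer_offer(total):
    """Backward induction: offer the smallest amount the responder accepts
    (offers assumed to come in whole units)."""
    for offer in range(0, total + 1):
        if responder_accepts(offer):
            return offer
    return total

print(proposer_offer(10))  # 1: the minimal positive offer
```

Real experimental subjects, of course, routinely reject such minimal offers, which is exactly the divergence the text describes.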
Complete information requires that every player know the strategies and payoffs of the other players but not necessarily the actions.
For obvious reasons, games as studied by economists and real-world game players are generally finished in a finite number of moves. Pure mathematicians are not so constrained, and set theorists in particular study games that last for infinitely many moves, with the winner or other payoff not known until after all those moves are completed. The focus of attention is usually not so much on what is the best way to play such a game, but simply on whether one or the other player has a winning strategy.
It can be proved, using the axiom of choice, that there are games—even with perfect information, and where the only outcomes are "win" or "lose"—for which neither player has a winning strategy. The existence of such strategies, for cleverly designed games, has important consequences in descriptive set theory. Economists have used game theory to analyze a wide array of economic phenomena, including auctions, bargaining, duopolies and oligopolies, social network formation, and voting systems.
This research usually focuses on particular sets of strategies known as equilibria in games. These "solution concepts" are usually based on what is required by norms of rationality. The most famous of these is the Nash equilibrium.
A set of strategies is a Nash equilibrium if each represents a best response to the other strategies. So, if all the players are playing the strategies in a Nash equilibrium, they have no incentive to deviate, since their strategy is the best they can do given what others are doing.
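The "no incentive to deviate" test can be written out directly. The sketch below checks the Nash condition for pure strategies in a two-player game; the coordination-style payoff numbers are assumptions chosen for illustration:

```python
def is_nash(payoffs, actions, profile):
    """Check the Nash condition for a two-player game in pure strategies:
    each player's action must be a best reply to the other's fixed action."""
    for player in (0, 1):
        current = payoffs[profile][player]
        for alt in actions[player]:
            deviation = list(profile)
            deviation[player] = alt
            if payoffs[tuple(deviation)][player] > current:
                return False          # a profitable deviation exists
    return True

# An illustrative coordination game: matching on either action is a NE.
acts = (["left", "right"], ["left", "right"])
pay = {
    ("left", "left"):  (1, 1), ("left", "right"):  (0, 0),
    ("right", "left"): (0, 0), ("right", "right"): (2, 2),
}
print(is_nash(pay, acts, ("left", "left")))   # True
print(is_nash(pay, acts, ("right", "right"))) # True
print(is_nash(pay, acts, ("left", "right")))  # False
```

Note that this game has two pure equilibria, one Pareto-superior to the other, a point that returns later in the discussion of coordination games.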
The payoffs of the game are generally taken to represent the utility of individual players. Often in modeling situations the payoffs represent money, which presumably corresponds to an individual's utility. This assumption, however, can be faulty. A prototypical paper on game theory in economics begins by presenting a game that is an abstraction of some particular economic situation.
One or more solution concepts are chosen, and the author demonstrates which strategy sets in the presented game are equilibria of the appropriate type. Naturally one might wonder to what use should this information be put. Economists and business professors suggest two primary uses.
The first use is to inform us about how actual human populations behave. Some scholars believe that by finding the equilibria of games they can predict how actual human populations will behave when confronted with situations analogous to the game being studied. This particular view of game theory has come under recent criticism. First, it is criticized because the assumptions made by game theorists are often violated.
Game theorists may assume players always act rationally to maximize their wins (the Homo economicus model), but real humans often act either irrationally, or act rationally to maximize the wins of some larger group of people (altruism).
Game theorists respond by comparing their assumptions to those used in physics. Thus while their assumptions do not always hold, they can treat game theory as a reasonable scientific ideal akin to the models used by physicists. However, additional criticism of this use of game theory has been levied because some experiments have demonstrated that individuals do not play equilibrium strategies.
There is an ongoing debate regarding the importance of these experiments. Alternatively, some authors claim that Nash equilibria do not provide predictions for human populations, but rather provide an explanation for why populations that play Nash equilibria remain in that state. However, the question of how populations reach those points remains open. Some game theorists have turned to evolutionary game theory in order to resolve these worries.
These models presume either no rationality or bounded rationality on the part of players. Despite the name, evolutionary game theory does not necessarily presume natural selection in the biological sense.
Evolutionary game theory includes both biological as well as cultural evolution and also models of individual learning for example, fictitious play dynamics. On the other hand, some scholars see game theory not as a predictive tool for the behavior of human beings, but as a suggestion for how people ought to behave.
Since a Nash equilibrium of a game constitutes one's best response to the actions of the other players, playing a strategy that is part of a Nash equilibrium seems appropriate. However, this use for game theory has also come under criticism.
First, in some cases it is appropriate to play a non-equilibrium strategy if one expects others to play non-equilibrium strategies as well. However, this condition may often not hold.

Suppose now that the utility functions in the river-crossing game, in which a fugitive must cross at one of several bridges while a pursuer lies in wait, are more complicated. The pursuer most prefers an outcome in which she shoots the fugitive and so claims credit for his apprehension to one in which he dies of rockfall or snakebite; and she prefers this second outcome to his escape.
The fugitive prefers a quick death by gunshot to the pain of being crushed or the terror of an encounter with a cobra. Most of all, of course, he prefers to escape. Suppose, plausibly, that the fugitive cares more strongly about surviving than he does about getting killed one way rather than another. A merely ordinal ranking cannot capture this difference in intensity; nor should we explain it by appeal to inner feeling, because utility does not denote a hidden psychological variable such as pleasure, as we discussed in Section 2.
How, then, can we model games in which cardinal information is relevant? Here, we will provide a brief outline of von Neumann and Morgenstern's ingenious technique for building cardinal utility functions out of ordinal ones. What follows is merely an outline, intended to make cardinal utility non-mysterious to a reader interested in the philosophical foundations of game theory and the range of problems to which it can be applied.
Providing a manual you could follow in building your own cardinal utility functions would require many pages; such manuals are available in many textbooks. Suppose that we now assign an ordinal utility function to the river-crossing fugitive on which escape ranks above every form of death. We are supposing that his preference for escape over any form of death is stronger than his preferences between causes of death. This should be reflected in his choice behaviour in the following way.
In a situation such as the river-crossing game, he should be willing to run greater risks to increase the relative probability of escape over shooting than he is to increase the relative probability of shooting over snakebite. Suppose we asked the fugitive to pick, from the available set of outcomes, a best one and a worst one. Now imagine expanding the set of possible prizes so that it includes prizes that the agent values as intermediate between W and L.
We find, for a set of outcomes containing such prizes, a lottery over them such that our agent is indifferent between that lottery and a lottery including only W and L. In our example, this is a lottery that includes being shot and being crushed by rocks. Call this lottery T. What exactly have we done here? Furthermore, two agents in one game, or one agent under different sorts of circumstances, may display varying attitudes to risk.
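The construction just outlined can be made concrete in a small sketch. Fix the best outcome W at utility 1 and the worst L at 0; an intermediate prize's cardinal utility is then the probability p at which the agent is indifferent between the prize for certain and the lottery giving W with probability p and L otherwise. The indifference probabilities below are invented for the example:

```python
def vnm_utility(indifference_prob):
    """Cardinal utility of a prize the agent values exactly like the
    lottery (p chance of W at utility 1, 1-p chance of L at utility 0)."""
    return indifference_prob * 1.0 + (1 - indifference_prob) * 0.0

# Assumed indifference points for the fugitive's outcomes (illustrative):
outcomes = {
    "escape":    1.0,               # W, by normalization
    "shot":      vnm_utility(0.25),
    "crushed":   vnm_utility(0.10),
    "snakebite": vnm_utility(0.05), # near L: the worst-ranked death
}
print(outcomes["shot"])  # 0.25: the utility just is the probability
```

The design point is that nothing psychological is measured here; the cardinal scale is read off from risk-taking behaviour alone.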
Perhaps in the river-crossing game the pursuer, whose life is not at stake, will enjoy gambling with her glory while our fugitive is cautious. Both agents, after all, can find their NE strategies if they can estimate the probabilities each will assign to the actions of the other. We can now fill in the rest of the matrix for the bridge-crossing game that we started to draw in Section 2. If both players are risk-neutral and their revealed preferences respect ROCL, then we have enough information to be able to assign expected utilities, expressed by multiplying the original payoffs by the relevant probabilities, as outcomes in the matrix.
Suppose that the hunter waits at the cobra bridge with probability x and at the rocky bridge with probability y. Then, continuing to assign the fugitive a payoff of 0 if he dies and 1 if he escapes, and the hunter the reverse payoffs, the matrix can be completed.
We can now read the following facts about the game directly from the matrix. No pair of pure strategies is a pair of best replies to the other.
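A stripped-down version of this game can be checked computationally. The sketch below assumes only two bridges and ignores the natural hazards (both simplifying assumptions), so the interaction reduces to a matching-pennies structure; it confirms that no pure-strategy pair is mutually best, and that a 50/50 hunter mixture leaves the fugitive indifferent between bridges:

```python
import itertools

# Two-bridge simplification of the pursuit game (hazards ignored, so the
# fugitive escapes exactly when the hunter picks the other bridge).
bridges = ["cobra", "rocky"]

def fugitive_payoff(f_bridge, h_bridge):
    return 0 if f_bridge == h_bridge else 1   # hunter gets the reverse

# Scan all pure profiles for mutual best replies:
pure_ne = []
for f, h in itertools.product(bridges, bridges):
    f_best = all(fugitive_payoff(f, h) >= fugitive_payoff(alt, h)
                 for alt in bridges)
    h_best = all((1 - fugitive_payoff(f, h)) >= (1 - fugitive_payoff(f, alt))
                 for alt in bridges)
    if f_best and h_best:
        pure_ne.append((f, h))
print(pure_ne)  # []: no pure-strategy equilibrium exists

# At x = 0.5 (hunter's probability of waiting at the cobra bridge),
# the fugitive's expected payoff is the same whichever bridge he picks:
x = 0.5
eu_cobra = x * 0 + (1 - x) * 1
eu_rocky = x * 1 + (1 - x) * 0
print(eu_cobra == eu_rocky)  # True
```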
But in real interactive choice situations, agents must often rely on their subjective estimations or perceptions of probabilities. In one of the greatest contributions to twentieth-century behavioral and social science, Savage showed how to incorporate subjective probabilities, and their relationships to preferences over risk, within the framework of von Neumann-Morgenstern expected utility theory.
Then, just over a decade later, Harsanyi showed how to solve games involving maximizers of Savage expected utility. This is often taken to have marked the true maturity of game theory as a tool for application to behavioral and social science, and was recognized as such when Harsanyi joined Nash and Selten as a recipient of the first Nobel prize awarded to game theorists.

As we observed in considering the need for people playing games to learn trembling hand equilibria and QRE, when we model the strategic interactions of people we must allow for the fact that people are typically uncertain about their models of one another.
This uncertainty is reflected in their choices of strategies. Consider the fourth of these NE. The structure of the game incentivizes efforts by Player I to supply Player III with information that would open up her closed information set.
Player III should believe this information because the structure of the game shows that Player I has an incentive to communicate it truthfully. Theorists who think of game theory as part of a normative theory of general rationality (for example, most philosophers, and refinement-program enthusiasts among economists) have pursued a strategy that would identify this solution on general principles.
The relevant beliefs here are not merely strategic, as before, since they are not just about what players will do given a set of payoffs and game structures, but about what understanding of conditional probability they should expect other players to operate with. What beliefs about conditional probability is it reasonable for players to expect from each other?
Consider again the NE (R, r2, r3). Suppose that Player III assigns pr(1) to her belief that if she gets a move she is at the node in question. The use of the consistency requirement in this example is somewhat trivial, so consider now a second case, also taken from Kreps. The idea of SE is hopefully now clear. We can apply it to the river-crossing game in a way that avoids the necessity for the pursuer to flip any coins if we modify the game a bit. This requirement is captured by supposing that all strategy profiles be strictly mixed, that is, that every action at every information set is taken with positive probability.
You will see that this is just equivalent to supposing that all hands sometimes tremble, or alternatively that no expectations are quite certain. A SE is said to be trembling-hand perfect if all strategies played at equilibrium are best replies to strategies that are strictly mixed. You should also not be surprised to be told that no weakly dominated strategy can be trembling-hand perfect, since the possibility of trembling hands gives players the most persuasive reason for avoiding such strategies.
How can the non-psychological game theorist understand the concept of an NE that is an equilibrium in both actions and beliefs? Multiple kinds of informational channels typically link different agents with the incentive structures in their environments.
Some agents may actually compute equilibria, with more or less error. Others may settle within error ranges that stochastically drift around equilibrium values through more or less myopic conditioned learning. Still others may select response patterns by copying the behavior of other agents, or by following rules of thumb that are embedded in cultural and institutional structures and represent historical collective learning.
Note that the issue here is specific to game theory, rather than merely being a reiteration of a more general point, which would apply to any behavioral science, that people behave noisily from the perspective of ideal theory. In a given game, whether it would be rational for even a trained, self-aware, computationally well resourced agent to play NE would depend on the frequency with which he or she expected others to do likewise. If she expects some other players to stray from NE play, this may give her a reason to stray herself.
Instead of predicting that human players will reveal strict NE strategies, the experienced experimenter or modeler anticipates that there will be a relationship between their play and the expected costs of departures from NE.
Consequently, maximum likelihood estimation of observed actions typically identifies a QRE as providing a better fit than any NE.
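A logit quantal response sketch shows the underlying idea: each player chooses actions with probabilities proportional to the exponential of scaled expected utility, so costly deviations from best response are rare but never impossible. The PD payoffs and the precision parameter below are illustrative assumptions:

```python
import math

# U[i][j]: my payoff when I play i and the opponent plays j.
# Action 0 = cooperate, action 1 = defect (an illustrative PD).
U = [[3, 0],
     [5, 1]]

def logit_response(p_opponent_coop, lam):
    """Probability of cooperating under logit choice with precision lam."""
    eu = [U[i][0] * p_opponent_coop + U[i][1] * (1 - p_opponent_coop)
          for i in (0, 1)]
    z = [math.exp(lam * e) for e in eu]
    return z[0] / (z[0] + z[1])

def qre(lam, iters=200):
    """Iterate the logit best-response map to a (symmetric) fixed point."""
    p = 0.5
    for _ in range(iters):
        p = logit_response(p, lam)
    return p

print(round(qre(0.0), 3))  # 0.5: pure noise, both actions equiprobable
print(qre(10.0) < 0.01)    # True: high precision approaches the NE (defect)
```

Fitting the precision parameter to observed play is what gives QRE its typical advantage over strict NE in maximum likelihood comparisons.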
Rather, she conjectures that they are agents, that is, that there is a systematic relationship between changes in statistical patterns in their behavior and some risk-weighted cardinal rankings of possible goal-states. If the agents are people, or institutionally structured groups of people that monitor one another and are incentivized to attempt to act collectively, these conjectures will often be regarded as reasonable by critics, or even as pragmatically beyond question, even if always defeasible given the non-zero possibility of bizarre unknown circumstances of the kind philosophers sometimes consider.
The analyst might assume that all of the agents respond to incentive changes in accordance with Savage expected-utility theory, particularly if the agents are firms that have learned response contingencies under normatively demanding conditions of market competition with many players. All this is to say that use of game theory does not force a scientist to empirically apply a model that is likely to be too precise and narrow in its specifications to plausibly fit the messy complexities of real strategic interaction.
A good applied game theorist should also be a well-schooled econometrician.

However, games are often played with future games in mind, and this can significantly alter their outcomes and equilibrium strategies. Our topic in this section is repeated games, that is, games in which sets of players expect to face each other in similar situations on multiple occasions. In a one-shot PD, defection strictly dominates; this may no longer hold, however, if the players expect to meet each other again in future PDs.
Imagine that four firms, all making widgets, agree to maintain high prices by jointly restricting supply. That is, they form a cartel. This will only work if each firm maintains its agreed production quota. Typically, each firm can maximize its profit by departing from its quota while the others observe theirs, since it then sells more units at the higher market price brought about by the almost-intact cartel.
In the one-shot case, all firms would share this incentive to defect and the cartel would immediately collapse. However, the firms expect to face each other in competition for a long period. In this case, each firm knows that if it breaks the cartel agreement, the others can punish it by underpricing it for a period long enough to more than eliminate its short-term gain. Of course, the punishing firms will take short-term losses too during their period of underpricing.
But these losses may be worth taking if they serve to reestablish the cartel and bring about maximum long-term prices. One simple and famous (but not, contrary to widespread myth, necessarily optimal) strategy for preserving cooperation in repeated PDs is called tit-for-tat. This strategy tells each player to cooperate in the first round, and in every subsequent round to copy whatever the other player did in the previous round.
A group of players all playing tit-for-tat will never see any defections. Since, in a population where others play tit-for-tat, tit-for-tat is the rational response for each player, everyone playing tit-for-tat is a NE.
You may frequently hear people who know a little but not enough game theory talk as if this is the end of the story. It is not.
There are two complications. First, the players must be uncertain as to when their interaction ends. Suppose the players know when the last round comes. In that round, it will be utility-maximizing for players to defect, since no punishment will be possible. Now consider the second-last round. In this round, players also face no punishment for defection, since they expect to defect in the last round anyway.
So they defect in the second-last round. But this means they face no threat of punishment in the third-last round, and defect there too. We can simply iterate this backwards through the game tree until we reach the first round. Since cooperation is not a NE strategy in that round, tit-for-tat is no longer a NE strategy in the repeated game, and we get the same outcome—mutual defection—as in the one-shot PD. Therefore, cooperation is only possible in repeated PDs where the expected number of repetitions is indeterminate.
Of course, this does apply to many real-life games. Note that in this context any amount of uncertainty in expectations, or possibility of trembling hands, will be conducive to cooperation, at least for a while. When people in experiments play repeated PDs with known end-points, they indeed tend to cooperate for a while, but learn to defect earlier as they gain experience. Now we introduce a second complication.
Consider our case of the widget cartel. Suppose the players observe a fall in the market price of widgets. Perhaps this is because a cartel member cheated. Or perhaps it has resulted from an exogenous drop in demand. If tit-for-tat players mistake the second case for the first, they will defect, thereby setting off a chain-reaction of mutual defections from which they can never recover, since every player will reply to the first encountered defection with defection, thereby begetting further defections, and so on.
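The chain reaction can be seen in a simulation in which one player misreads the signal once. A single wrongly perceived defection, injected here in round 3 (the timing is an assumption for illustration), makes two tit-for-tat players echo defections back and forth indefinitely:

```python
def simulate_echo(rounds, noise_round):
    """Two tit-for-tat players; B wrongly 'punishes' an imagined defection
    once, in round noise_round."""
    a_hist, b_hist = [], []
    for t in range(rounds):
        a = "C" if not b_hist else b_hist[-1]
        b = "C" if not a_hist else a_hist[-1]
        if t == noise_round:
            b = "D"   # the misperception: B defects although A cooperated
        a_hist.append(a)
        b_hist.append(b)
    return a_hist, b_hist

a_hist, b_hist = simulate_echo(10, 3)
# After the mistake, defections alternate forever and full cooperation
# never returns:
print("".join(a_hist))  # CCCCDCDCDC
print("".join(b_hist))  # CCCDCDCDCD
```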
If players know that such miscommunication is possible, they have incentive to resort to more sophisticated strategies. In particular, they may be prepared to sometimes risk following defections with cooperation in order to test their inferences.
However, if they are too forgiving, then other players can exploit them through additional defections. In general, sophisticated strategies have a problem. Because they are more difficult for other players to infer, their use increases the probability of miscommunication.
But miscommunication is what causes repeated-game cooperative equilibria to unravel in the first place. The complexities surrounding information signaling, screening and inference in repeated PDs help intuitively to explain the folk theorem, so called because no one is sure who first recognized it: in repeated PDs, for any strategy S there exists a possible distribution of strategies among other players such that the vector of S and these other strategies is a NE.
Thus there is nothing special, after all, about tit-for-tat. Real, complex, social and political dramas are seldom straightforward instantiations of simple games such as PDs.
Hardin offers an analysis of two tragically real political cases, the Yugoslavian civil war and the Rwandan genocide, as PDs that were nested inside coordination games. A coordination game occurs whenever the utility of two or more players is maximized by their doing the same thing as one another, and where such correspondence is more important to them than whatever it is, in particular, that they both do.
In these circumstances, any strategy that is a best reply to any vector of mixed strategies available in NE is said to be rationalizable. That is, a player can find a set of systems of beliefs for the other players such that any history of the game along an equilibrium path is consistent with that set of systems.
Pure coordination games are characterized by non-unique vectors of rationalizable strategies. The Nobel laureate Thomas Schelling conjectured, and empirically demonstrated, that in such situations, players may try to predict equilibria by searching for focal points , that is, features of some strategies that they believe will be salient to other players, and that they believe other players will believe to be salient to them.
Coordination was, indeed, the first topic of game-theoretic application that came to the widespread attention of philosophers. The philosopher David Lewis published Convention, in which the conceptual framework of game theory was applied to one of the fundamental issues of twentieth-century epistemology, the nature and extent of conventions governing semantics and their relationship to the justification of propositional beliefs.
The basic insight can be captured using a simple example. This insight, of course, well preceded Lewis; but what he recognized is that this situation has the logical form of a coordination game. Thus, while particular conventions may be arbitrary, the interactive structures that stabilize and maintain them are not. Furthermore, the equilibria involved in coordinating on noun meanings appear to have an arbitrary element only because we cannot Pareto-rank them; but Millikan shows implicitly that in this respect they are atypical of linguistic coordinations.
In a city, drivers must coordinate on one of two NE with respect to their behaviour at traffic lights. Either all must follow the strategy of rushing to try to race through lights that turn yellow or amber and pausing before proceeding when red lights shift to green, or all must follow the strategy of slowing down on yellows and jumping immediately off on shifts to green. Both patterns are NE, in that once a community has coordinated on one of them then no individual has an incentive to deviate: those who slow down on yellows while others are rushing them will get rear-ended, while those who rush yellows in the other equilibrium will risk collision with those who jump off straightaway on greens.
However, the two equilibria are not Pareto-indifferent, since the second NE allows more cars to turn left on each cycle in a left-hand-drive jurisdiction, and right on each cycle in a right-hand jurisdiction, which reduces the main cause of bottlenecks in urban road networks and allows all drivers to expect greater efficiency in getting about.
Unfortunately, for reasons about which we can only speculate pending further empirical work and analysis, far more cities are locked onto the Pareto-inferior NE than on the Pareto-superior one. Conditional game theory (see Section 5 below) provides promising resources for modeling cases such as this one, in which maintenance of coordination game equilibria likely must be supported by stable social norms, because players are anonymous and encounter regular opportunities to gain once-off advantages by defecting from supporting the prevailing equilibrium.
This work is currently ongoing. While various arrangements might be NE in the social game of science, as followers of Thomas Kuhn like to remind us, it is highly improbable that all of these lie on a single Pareto-indifference curve.
These themes, strongly represented in contemporary epistemology, philosophy of science and philosophy of language, are all at least implicit applications of game theory. The reader can find a broad sample of applications, and references to the large literature, in Nozick.

Most of the social and political coordination games played by people also have this feature.
Unfortunately for us all, inefficiency traps represented by Pareto-inferior NE are extremely common in them. And sometimes dynamics of this kind give rise to the most terrible of all recurrent human collective behaviors. That is, in neither situation, on either side, did most people begin by preferring the destruction of the other to mutual cooperation. However, the deadly logic of coordination, deliberately abetted by self-serving politicians, dynamically created PDs.
Some individual Serbs (Hutus) were encouraged to perceive their individual interests as best served through identification with Serbian (Hutu) group-interests. That is, they found that some of their circumstances, such as those involving competition for jobs, had the form of coordination games. They thus acted so as to create situations in which this was true for other Serbs (Hutus) as well.
Eventually, once enough Serbs (Hutus) identified self-interest with group-interest, the identification became almost universally correct, because (1) the most important goal for each Serb (Hutu) was to do roughly what every other Serb (Hutu) would, and (2) the most distinctively Serbian (Hutu) thing to do, the doing of which signalled coordination, was to exclude Croats (Tutsis).
That is, strategies involving such exclusionary behavior were selected as a result of having efficient focal points. But the outcome is ghastly: Serbs and Croats (Hutus and Tutsis) seem progressively more threatening to each other as they rally together for self-defense, until both see it as imperative to preempt their rivals and strike before being struck.
If Hardin is right—and the point here is not to claim that he is, but rather to point out the worldly importance of determining which games agents are in fact playing—then the mere presence of an external enforcer (NATO?) would not by itself have changed the games the participants saw themselves as playing.
The Rwandan genocide likewise ended with a military solution, in this case a Tutsi victory. But this became the seed for the subsequent Congo War, among the deadliest international wars of modern times. Of course, it is not the case that most repeated games lead to disasters.
The biological basis of friendship in people and other animals is partly a function of the logic of repeated games. The importance of payoffs achievable through cooperation in future games leads those who expect to interact in them to be less selfish than temptation would otherwise encourage in present games.
The fact that such equilibria become more stable through learning gives friends the logical character of built-up investments, which most people take great pleasure in sentimentalizing. Furthermore, cultivating shared interests and sentiments provides networks of focal points around which coordination can be increasingly facilitated.
More directly, her claim was that conventions are not merely the products of decisions of many individual people, as might be suggested by a theorist who modeled a convention as an equilibrium of an n-person game in which each player was a single person. Similar concerns about allegedly individualistic foundations of game theory have been echoed by another philosopher, Martin Hollis, and by the economists Robert Sugden and Michael Bacharach. The explanation seems to require appeal to very strong forms of both descriptive and normative individualism.
The players undermine their own welfare, one might argue, because they obstinately refuse to pay any attention to the social context of their choices.
Binmore forcefully argues that this line of criticism confuses game theory as mathematics with questions about which game theoretic models are most typically applicable to situations in which people find themselves.
At a payoff of 3, players would be indifferent between cooperating and defecting, and the game is thereby transformed. Thus if the players find this equilibrium, we should not say that they have played non-NE strategies in a PD. Rather, we should say that the PD was the wrong model of their situation. What is at issue here is the best choice of a convention for applying mathematics to empirical description. Binmore is clearly right, and the majority of commentators have come to recognize that he is right, if we interpret the payoffs of games by reference to utility functions with unrestricted domains.
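The point can be illustrated with a sketch. The payoff numbers and the size of the cooperation bonus below are assumptions: once players' utility functions assign extra value to cooperating, the resulting matrix is no longer a PD, and mutual cooperation passes the equilibrium test.

```python
# Illustrative monetary PD payoffs as (row, column) utilities.
BASE = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
        ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def transform(payoffs, coop_bonus):
    """Add a utility bonus each player receives for choosing C,
    modeling preferences that go beyond the monetary payoffs."""
    return {acts: (u0 + (coop_bonus if acts[0] == "C" else 0),
                   u1 + (coop_bonus if acts[1] == "C" else 0))
            for acts, (u0, u1) in payoffs.items()}

def is_pure_ne(payoffs, profile):
    """Nash test: no player gains by a unilateral pure-strategy deviation."""
    for i in (0, 1):
        for alt in ("C", "D"):
            dev = list(profile)
            dev[i] = alt
            if payoffs[tuple(dev)][i] > payoffs[profile][i]:
                return False
    return True

print(is_pure_ne(BASE, ("C", "C")))               # False: a true PD
print(is_pure_ne(transform(BASE, 3), ("C", "C"))) # True: no longer a PD
```

If players settle on mutual cooperation here, the right description is not that they played non-NE strategies in a PD, but that the transformed game, not the PD, was the correct model.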
Interpreting payoffs by reference to unrestricted utility functions is the overwhelmingly standard practice in both economics and formal decision theory. For a number of years this issue was regarded as closed in the mainstream literature.
However, Sugden argues in very recent work that there are reasons, quite independent of technical considerations about which conventions are most convenient for representing empirical interactions as games, for avoiding appeal to preferences over unrestricted domains in analyzing welfare that is, in doing normative economics.
On the basis of this argument, Sugden reverts to using game-theoretic models in which payoffs are restricted to objectively specifiable metrics, such as monetary returns. The substantive issues in welfare economics on which Sugden sheds new light are too interesting for a critic to reasonably refuse to engage with them out of mere stubbornness about adhering to convention in interpreting game representations. It is too soon to assess whether the advances in welfare analysis that Sugden seeks are sustainable under critical stress-testing.
If they prove not to be, then his motivation for an alternative convention on payoff interpretation will dissolve. I think it more likely, however, that a period of intensive innovation in welfare economics lies just ahead of us, and that in the course of this economists and other analysts will grow comfortable with operating two different representational conventions depending on problem contexts.
If that is indeed our future, then we can anticipate a further stage in which, because problem contexts tend not to remain conveniently isolated from one another, new formalism is demanded to allow both conventions to be operated in a single application without confusion.
But these speculations run well ahead of the current state of theory.

Under this assumption, Bacharach, Sugden and Gold argue, human game players will often or usually avoid framing situations in such a way that a one-shot PD is the right model of their circumstances. Note that the welfare of the team might make a difference to cardinal payoffs without making enough of a difference to trump the lure of unilateral defection.
Suppose it bumped them up to 2. This point is important, since in experiments in which subjects play sequences of one-shot PDs (not repeated PDs, since opponents in the experiments change from round to round), majorities of subjects begin by cooperating but learn to defect as the experiments progress.
The team reasoners then re-frame the situation to defend themselves.