
L4. Repeated Games


1. Introduction

Long-term (or repeated) interactions are very common. Examples include
  • Firms are engaged in competition over time
  • Most employment relationships last for a long time
  • Countries compete over tariffs year after year.
In a long-term relationship, each player must consider how current behavior will influence others’ behavior in the future, and how threats or promises about future behavior can affect current behavior. Take the Prisoners’ Dilemma, for example.
[Figure: payoff matrix of the Prisoners’ Dilemma stage game]
If the game is played only once, the unique Nash equilibrium is the noncooperative outcome, in which neither player cooperates. However, the result may be different if the game is played more than once.
In these dynamic situations, one might care about “reputation”, which is often used to describe how a person’s past actions affect future beliefs and behavior.
We use repeated games to study such interactions among players. There are two types of repeated games:
  • finitely repeated games
  • infinitely repeated games
The results predicted by these two types of games differ dramatically.

2. Finitely Repeated Games

Consider the following game (two-stage prisoners’ dilemma game):
  • The two players play the simultaneous-move game twice;
  • Each player observes the outcome of the first play before the second game begins;
  • The payoff of each player in the whole game is simply the sum of payoffs in both stages (i.e. no discounting)
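This game is an example of a two-stage game with imperfect information. We can solve it by backward induction.
  • In stage 2, the unique Nash equilibrium is the noncooperative outcome, in which each player receives 1.
  • In stage 1, the two players therefore play an equivalent one-shot game: the stage game with the stage-2 equilibrium payoff of 1 added to each player’s payoff in every cell.
[Figure: payoff matrix of the equivalent first-stage game]
  • Hence, the noncooperative outcome is the unique Nash equilibrium in stage 1.
  • The subgame-perfect outcome: the noncooperative outcome is played in both periods.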
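The backward-induction steps above can be sketched in a few lines of Python. This is only an illustration: the payoff numbers and the action labels `C` (cooperate) and `D` (defect) are assumptions chosen so that the stage-game equilibrium pays 1 to each player, and `pure_nash` is a hypothetical helper, not something defined in these notes.

```python
from itertools import product

def pure_nash(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoffs maps (row_action, col_action) -> (row_payoff, col_payoff).
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        u1, u2 = payoffs[(r, c)]
        # Neither player can gain by a unilateral deviation.
        row_ok = all(payoffs[(r2, c)][0] <= u1 for r2 in rows)
        col_ok = all(payoffs[(r, c2)][1] <= u2 for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

# Assumed Prisoners' Dilemma payoffs (hypothetical numbers).
STAGE = {("D", "D"): (1, 1), ("D", "C"): (5, 0),
         ("C", "D"): (0, 5), ("C", "C"): (4, 4)}

# Stage 2: solve the stage game on its own.
stage2 = pure_nash(STAGE)
continuation = STAGE[stage2[0]]

# Stage 1: add the stage-2 equilibrium payoff to every cell, then solve again.
stage1_game = {a: (u1 + continuation[0], u2 + continuation[1])
               for a, (u1, u2) in STAGE.items()}
stage1 = pure_nash(stage1_game)

print(stage2, stage1)
```

Adding the same constant to every cell leaves best responses unchanged, which is why the first-stage equilibrium coincides with the stage-game equilibrium.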
Let $G = \{A_1, \dots, A_n; u_1, \dots, u_n\}$ denote a static game of complete information in which players 1 through $n$ simultaneously choose actions $a_1$ through $a_n$ from the action spaces $A_1$ through $A_n$, and the payoffs are $u_1(a_1, \dots, a_n)$ through $u_n(a_1, \dots, a_n)$. The game $G$ is called the stage game of the repeated game.
Given a stage game $G$, let $G(T)$ denote the finitely repeated game in which $G$ is played $T$ times, with the outcomes of all preceding plays observed before the next play begins. The payoffs for $G(T)$ are simply the sum of the payoffs from the $T$ stage games.
Subgame-perfect Nash equilibrium
Proposition: If the stage game $G$ has a unique Nash equilibrium, then for any finite $T$, the repeated game $G(T)$ has a unique subgame-perfect outcome: the Nash equilibrium of $G$ is played in every stage.
In the Prisoners’ Dilemma example, the unique outcome in each period is the noncooperative outcome, regardless of how many times the game is played. The result in the above proposition extends to the case where $G$ itself is a dynamic game of complete information.
Multiple Nash equilibria
If the stage game $G$ has multiple Nash equilibria, then there may be subgame-perfect outcomes of the repeated game $G(T)$ in which, for any $t < T$, the outcome of stage $t$ is not a Nash equilibrium of $G$.
Consider the following game:
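[Figure: payoff matrix of a stage game with two Nash equilibria]
The stage game has two Nash equilibria. Suppose the game is repeated twice. The outcome in stage 2 must be one of these two equilibria. It is nevertheless possible that the first-stage outcome in a subgame-perfect Nash equilibrium is an outcome that is not a Nash equilibrium of the stage game. Consider, for example, the following strategy for player $i$:
  • play the action belonging to that non-equilibrium outcome in the first stage;
  • in the second stage, play the action from the better Nash equilibrium if the first-stage outcome was the intended one; otherwise, play the action from the worse Nash equilibrium.
It can be verified that this strategy profile constitutes a subgame-perfect Nash equilibrium in which the first-stage outcome is not a Nash equilibrium of the stage game. The intuition is that the better stage-game equilibrium serves as a reward and the worse one serves as a punishment.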
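The verification can be sketched numerically. The 3×3 payoff matrix below is an assumption (the original figure is not available); it has two stage-game equilibria, `("L", "L")` and `("R", "R")`, and a cooperative outcome `("M", "M")` that is not an equilibrium. The helper `pure_nash` and all labels are hypothetical.

```python
from itertools import product

def pure_nash(payoffs):
    """All pure-strategy Nash equilibria of a two-player game given as a dict."""
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    return [(r, c) for r, c in product(rows, cols)
            if all(payoffs[(r2, c)][0] <= payoffs[(r, c)][0] for r2 in rows)
            and all(payoffs[(r, c2)][1] <= payoffs[(r, c)][1] for c2 in cols)]

# Assumed stage game (hypothetical numbers).
STAGE = {
    ("L", "L"): (1, 1), ("L", "M"): (5, 0), ("L", "R"): (0, 0),
    ("M", "L"): (0, 5), ("M", "M"): (4, 4), ("M", "R"): (0, 0),
    ("R", "L"): (0, 0), ("R", "M"): (0, 0), ("R", "R"): (3, 3),
}
TARGET, REWARD, PUNISH = ("M", "M"), ("R", "R"), ("L", "L")

def continuation(outcome):
    """Second-stage payoffs prescribed by the strategy after a first-stage outcome."""
    return STAGE[REWARD] if outcome == TARGET else STAGE[PUNISH]

# First-stage game once the anticipated second-stage payoffs are added in.
first_stage = {a: (u1 + continuation(a)[0], u2 + continuation(a)[1])
               for a, (u1, u2) in STAGE.items()}

stage_equilibria = pure_nash(STAGE)
first_stage_equilibria = pure_nash(first_stage)
print(stage_equilibria, first_stage_equilibria)
```

Under these assumed payoffs the target outcome is not an equilibrium of the stage game, but it becomes one in the first stage once the reward and punishment continuations are taken into account.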

3. Infinitely Repeated Games

3.1 Definition & Strategy

Present Value:
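Let $\pi_t$ be the payoff in stage $t$. Given the discount factor $\delta$, the present value of the infinite sequence of payoffs $\pi_1, \pi_2, \pi_3, \dots$ is
$$\pi_1 + \delta \pi_2 + \delta^2 \pi_3 + \cdots = \sum_{t=1}^{\infty} \delta^{t-1} \pi_t .$$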
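As a quick numerical illustration of the present-value formula, the sketch below sums a truncated payoff stream and compares a long constant stream with the closed form $\pi / (1 - \delta)$. The function name and the sample numbers are illustrative assumptions.

```python
def present_value(payoffs, delta):
    """Discounted sum: payoffs[0] is the stage-1 payoff and gets weight delta**0."""
    return sum(delta ** t * p for t, p in enumerate(payoffs))

# Short stream: 1 + 0.5 * 2 + 0.25 * 3
pv_short = present_value([1, 2, 3], 0.5)

# A long constant stream of 4 approximates the infinite sum 4 / (1 - delta).
pv_constant = present_value([4] * 2000, 0.9)
closed_form = 4 / (1 - 0.9)

print(pv_short, pv_constant, closed_form)
```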
Definition of Infinitely Repeated Games:
Recall that $G$ is the stage game of the repeated game. Given a stage game $G$, let $G(\infty, \delta)$ denote the infinitely repeated game in which $G$ is played forever and the players share the discount factor $\delta$.
Remarks:
  • For each $t$, the outcomes of the $t-1$ preceding plays are observed before the $t$-th stage begins.
  • Each player’s payoff in $G(\infty, \delta)$ is the present value of the player’s payoffs from the infinite sequence of stage games.
Consider the following infinitely repeated game of Prisoners’ Dilemma:
  • In stage 1, the two players play the stage game $G$ and receive payoffs $\pi_{1,1}$ and $\pi_{2,1}$;
  • In stage $t$, the players observe the actions chosen in the preceding $t-1$ stages, and then play $G$ to receive $\pi_{1,t}$ and $\pi_{2,t}$;
  • The payoff of the infinitely repeated game is the present value of the sequence of payoffs: $\sum_{t=1}^{\infty} \delta^{t-1} \pi_{i,t}$ for player $i$.
Strategies
There are infinitely many strategies for the players. Some common strategies include:
  • noncooperative strategy: play the noncooperative action in every stage
    • If both players adopt the noncooperative strategy, the noncooperative outcome is repeated forever
  • (grim) trigger strategy: play the cooperative action in the first stage; in stage $t$, if the outcome of all $t-1$ preceding stages has been mutual cooperation, then play the cooperative action; otherwise, play the noncooperative action
    • Using a trigger strategy, player $i$ cooperates until someone fails to cooperate, which triggers a switch to noncooperation forever
  • tit-for-tat strategy
  • carrot-and-stick strategy

3.2 Nash Equilibria

Claim 1:
Both players adopting the noncooperative strategy is a Nash equilibrium.
Proof:
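Assume player $j$ plays the noncooperative action in every stage. Then player $i$’s best response is also to play the noncooperative action in every stage.
Claim 2:
Both players adopting the trigger strategy is a Nash equilibrium if and only if the discount factor $\delta$ is sufficiently large.
Proof:
Assume player $j$ has adopted the trigger strategy. We seek to show that player $i$’s best response is also to adopt the trigger strategy. It suffices to check when following the trigger strategy pays at least as much as every deviation.
There are many kinds of deviations; they fall into two categories according to the history at which they occur:
  • Case 1: At a node where the outcome of some previous stage was not mutual cooperation.
    • Since player $j$ plays the noncooperative action forever, player $i$’s best response is also to play the noncooperative action forever.
  • Case 2: In the first stage or in a stage where all the preceding outcomes have been mutual cooperation.
    • If player $i$ follows the trigger strategy, then he should cooperate in this stage, and the outcome from this stage onwards will be mutual cooperation in every stage. Thus, player $i$’s payoff from this stage onwards is the present value of receiving the cooperative stage payoff in every stage.
    • If player $i$ plays the noncooperative action in this stage (i.e., does not follow the trigger strategy), player $j$ still cooperates in this stage but plays the noncooperative action forever from the next stage. Player $i$ will then also play the noncooperative action from the next stage onwards, which is his optimal choice. This means player $i$’s payoff from this stage onwards is the one-stage deviation payoff plus the present value, discounted by one stage, of receiving the noncooperative payoff forever.
    • Playing the trigger strategy is optimal iff the first of these two payoffs is at least as large as the second, which holds iff $\delta$ is at least a threshold determined by the stage-game payoffs.
    • Summarizing case 1 and case 2, the trigger strategies constitute a Nash equilibrium of $G(\infty, \delta)$ iff $\delta$ is at least this threshold.
Note that the threshold on $\delta$ is not the same for every game; it varies case by case. But the deduction process is the same.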
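To make the threshold concrete, here is a minimal sketch under assumed stage payoffs: `reward` for mutual cooperation, `temptation` for deviating against a cooperator, and `punishment` for the noncooperative outcome. These names and the numbers 4, 5 and 1 are assumptions for illustration. Comparing $\text{reward}/(1-\delta)$ with $\text{temptation} + \delta \cdot \text{punishment}/(1-\delta)$ gives the threshold $(\text{temptation} - \text{reward})/(\text{temptation} - \text{punishment})$.

```python
from fractions import Fraction

def follow_value(reward, delta):
    """Present value of cooperating forever against a trigger strategy."""
    return reward / (1 - delta)

def deviate_value(temptation, punishment, delta):
    """Deviate once, then receive the noncooperative payoff forever."""
    return temptation + delta * punishment / (1 - delta)

def trigger_threshold(reward, temptation, punishment):
    """Smallest delta at which following is at least as good as deviating."""
    return Fraction(temptation - reward, temptation - punishment)

# Assumed payoffs (hypothetical): cooperate 4, deviate 5, noncooperative 1.
R, T, P = 4, 5, 1
threshold = trigger_threshold(R, T, P)

at_threshold = (follow_value(R, threshold), deviate_value(T, P, threshold))
below = Fraction(1, 5)
deviation_pays_below = deviate_value(T, P, below) > follow_value(R, below)

print(threshold, at_threshold, deviation_pays_below)
```

With different stage payoffs the same two functions give a different threshold, which is the sense in which the cutoff varies from game to game.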

3.3 Subgame-perfect Nash Equilibrium

Claim:
The trigger-strategy Nash equilibrium in the infinitely repeated Prisoners’ Dilemma game is subgame perfect.
Proof:
In an infinitely repeated game, a subgame is characterized by its previous history. The subgames can be grouped as follows:
  • Case 1: Subgames whose previous histories are a finite sequence of mutual cooperation
    • In such a subgame, the players’ strategies are again the trigger strategies, which form a Nash equilibrium of the whole game and thus of the subgame as well.
  • Case 2: Subgames whose previous histories contain outcomes other than mutual cooperation
    • In such a subgame, the players’ strategies are simply to repeat the noncooperative outcome in every stage of the subgame, which is also a Nash equilibrium.
We can also use an approach based on the one-deviation principle to show directly that trigger strategies constitute a subgame-perfect Nash equilibrium:
One-deviation principle: A strategy profile is a subgame-perfect Nash equilibrium if and only if, for each player $i$ and for each subgame, no single deviation would raise player $i$’s payoff in the subgame.

3.4 Collusion between Cournot Duopolists

In the Cournot model, the unique Nash equilibrium involves each firm producing the Cournot quantity $q_c$ and earning a profit of $\pi_c$. If there is a monopolist, then the monopoly quantity is $q_m$ and the monopoly profit is $\pi_m$.
  • If the two firms can collude to produce $q_m/2$ each, then they jointly produce the monopoly quantity $q_m$. Each of them obtains a profit of $\pi_m/2$.
  • If firm $i$ produces $q_m/2$, then the best response for firm $j$ is to produce the quantity that maximizes its one-period profit against $q_m/2$. In this case, firm $j$’s profit is $\pi_d > \pi_m/2$, while firm $i$’s profit falls below $\pi_m/2$.
Consider the infinitely repeated game based on the Cournot stage game when both firms have the discount factor $\delta$.
Trigger strategy:
  • in period 1, produce half of the monopoly quantity, $q_m/2$;
  • in period $t$, produce $q_m/2$ if both firms have produced $q_m/2$ in all $t-1$ preceding periods; otherwise, produce the Cournot quantity $q_c$.
Here the cooperative output is $q_m/2$ and the noncooperative output is $q_c$.
Trigger-strategy SPNE
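Claim: For the infinitely repeated game with the Cournot stage game, both firms playing the trigger strategy is a subgame-perfect Nash equilibrium if and only if $\delta \ge \dfrac{\pi_d - \pi_m/2}{\pi_d - \pi_c}$, where $\pi_m/2$ is each firm’s collusive profit, $\pi_c$ is the Cournot profit, and $\pi_d$ is the one-period profit from the best deviation against $q_m/2$.
Proof:
Suppose firm $j$ has adopted the trigger strategy. We need to show that firm $i$’s best response is also to play the trigger strategy in any subgame. There are again two types of subgames to be checked.
  • Case 1: if a quantity other than $q_m/2$ has been chosen by any firm before the current period, then firm $j$ chooses $q_c$ from this period onwards. The best response for firm $i$ is also to choose $q_c$ from this period onwards. Thus, playing the trigger strategy is optimal in this subgame.
  • Case 2: in period $t$, if the outcomes of all previous periods are $(q_m/2, q_m/2)$. Firm $i$’s payoff from this period onwards if it follows the trigger strategy is $\dfrac{1}{1-\delta} \cdot \dfrac{\pi_m}{2}$.
    • If firm $i$ deviates from the trigger strategy by choosing a quantity other than $q_m/2$, then firm $j$ produces $q_m/2$ in this period, but $q_c$ from period $t+1$ onwards. The best such deviation earns $\pi_d$ in this period, so firm $i$’s present value of the payoffs from period $t$ onwards is at most $\pi_d + \dfrac{\delta}{1-\delta} \pi_c$.
      • Therefore, the trigger strategy is the best response for firm $i$ to firm $j$’s trigger strategy iff $\dfrac{1}{1-\delta} \cdot \dfrac{\pi_m}{2} \ge \pi_d + \dfrac{\delta}{1-\delta} \pi_c$, i.e., iff $\delta \ge \dfrac{\pi_d - \pi_m/2}{\pi_d - \pi_c}$.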
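For a concrete number, the sketch below evaluates the trigger-strategy condition under an assumed specification: linear inverse demand $P(Q) = a - Q$ and constant marginal cost $c$, with the margin $a - c$ normalized to 1. This specification and all function names are assumptions made for illustration.

```python
from fractions import Fraction

MARGIN = Fraction(1)  # a - c, normalized to 1 (assumption)

def profit(own, other):
    """One-period profit (a - c - own - other) * own under linear demand."""
    return (MARGIN - own - other) * own

def best_response(other):
    """Quantity maximizing profit(own, other) for a given rival quantity."""
    return (MARGIN - other) / 2

q_cournot = MARGIN / 3          # symmetric fixed point of best_response
q_collusive = MARGIN / 4        # half of the monopoly quantity MARGIN / 2
q_deviation = best_response(q_collusive)

pi_cournot = profit(q_cournot, q_cournot)
pi_collusive = profit(q_collusive, q_collusive)
pi_deviation = profit(q_deviation, q_collusive)

# Follow: pi_collusive / (1 - d); deviate: pi_deviation + d * pi_cournot / (1 - d).
threshold = (pi_deviation - pi_collusive) / (pi_deviation - pi_cournot)
print(pi_cournot, pi_collusive, pi_deviation, threshold)
```

Because every profit scales with $(a - c)^2$, the threshold does not depend on the normalization of the margin in this linear specification.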
Two-phase strategy SPNE
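Two-phase (or carrot-and-stick) strategy:
  • In the first period, produce half of the monopoly quantity, $q_m/2$.
  • In period $t$, produce $q_m/2$ if both firms produced $q_m/2$ or both firms produced $x$ in period $t-1$; otherwise, produce $x$.
This strategy involves a (one-period) punishment phase in which the firm produces $x$ and a (potentially infinite) collusive phase in which the firm produces $q_m/2$.
Such a strategy punishes:
  • a firm for deviating from the collusive phase;
  • a firm for deviating from the punishment phase.
If both firms produce $x$, the profit of each firm is denoted by $\pi(x)$; with linear inverse demand $P(Q) = a - Q$ and constant marginal cost $c$, $\pi(x) = (a - 2x - c)x$. If firm $i$ produces $x$, the best response of firm $j$ is to produce $(a - x - c)/2$, and the corresponding profit $(a - x - c)^2/4$ is denoted by $\pi_{dp}(x)$. Let $\pi_d$ denote the one-period profit from the best deviation against $q_m/2$.
There are two types of subgames:
  • collusive subgames: the outcome of the previous period is either $(q_m/2, q_m/2)$ or $(x, x)$;
  • punishment subgames: the outcome of the previous period is neither $(q_m/2, q_m/2)$ nor $(x, x)$.
We use the one-deviation principle to show that both firms adopting the two-phase strategy is a subgame-perfect Nash equilibrium. Suppose firm $j$ has adopted the two-phase strategy. In collusive subgames, if firm $i$ also adopts the two-phase strategy, its payoff is
$$\frac{1}{1-\delta} \cdot \frac{\pi_m}{2}.$$
If firm $i$ deviates in this period only, then firm $j$ still chooses $q_m/2$ in this period but $x$ in the next period. Firm $i$ would then choose its best response to $q_m/2$ in this period and $x$ in the next period. The payoff from deviation is
$$\pi_d + \delta \pi(x) + \frac{\delta^2}{1-\delta} \cdot \frac{\pi_m}{2}.$$
Thus, choosing the two-phase strategy is optimal for firm $i$ iff
$$\frac{1}{1-\delta} \cdot \frac{\pi_m}{2} \ge \pi_d + \delta \pi(x) + \frac{\delta^2}{1-\delta} \cdot \frac{\pi_m}{2}. \qquad (1)$$
In punishment subgames, it is optimal for firm $i$ to choose the two-phase strategy iff
$$\pi(x) + \frac{\delta}{1-\delta} \cdot \frac{\pi_m}{2} \ge \pi_{dp}(x) + \delta \left[ \pi(x) + \frac{\delta}{1-\delta} \cdot \frac{\pi_m}{2} \right]. \qquad (2)$$
Both firms adopting the two-phase strategy is a subgame-perfect Nash equilibrium iff (1) and (2) hold. The two conditions can be written as
$$\delta \left[ \frac{\pi_m}{2} - \pi(x) \right] \ge \pi_d - \frac{\pi_m}{2}, \qquad (3)$$
$$\delta \left[ \frac{\pi_m}{2} - \pi(x) \right] \ge \pi_{dp}(x) - \pi(x). \qquad (4)$$
Intuition: The gain this period from deviating must not exceed the discounted value of the loss next period from punishment.
Consider the case $\delta = 1/2$ under the linear specification above:
  • condition (3) is satisfied iff $x/(a-c) \le 1/8$ or $x/(a-c) \ge 3/8$;
  • condition (4) is satisfied iff $3/10 \le x/(a-c) \le 1/2$;
  • Thus, two-phase strategies constitute a subgame-perfect Nash equilibrium in the game iff $3/8 \le x/(a-c) \le 1/2$.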
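The two incentive conditions can be checked numerically. The sketch below assumes linear inverse demand $P(Q) = a - Q$ with constant marginal cost $c$ and normalizes $a - c = 1$; these modelling choices and all names are assumptions for illustration. It scans a grid of punishment quantities $x$ at $\delta = 1/2$.

```python
from fractions import Fraction

PI_HALF = Fraction(1, 8)   # collusive profit per firm, with a - c = 1 (assumption)
PI_D = Fraction(9, 64)     # best one-period deviation profit against q_m / 2

def pi(x):
    """Per-firm profit when both firms produce x."""
    return (1 - 2 * x) * x

def pi_dp(x):
    """Profit from the best response when the rival produces x."""
    return (1 - x) ** 2 / 4

def collusive_ok(x, delta):
    """No profitable one-period deviation in the collusive phase."""
    return delta * (PI_HALF - pi(x)) >= PI_D - PI_HALF

def punishment_ok(x, delta):
    """No profitable one-period deviation in the punishment phase."""
    return delta * (PI_HALF - pi(x)) >= pi_dp(x) - pi(x)

delta = Fraction(1, 2)
grid = [Fraction(k, 80) for k in range(0, 41)]   # x from 0 to 1/2
supported = [x for x in grid if collusive_ok(x, delta) and punishment_ok(x, delta)]
print(min(supported), max(supported))
```

The punishment has to be harsh enough to deter deviation from collusion, yet not so harsh that a firm would rather deviate from the punishment itself; the scan shows the range of $x$ that satisfies both.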

3.5 Folk Theorem

In the Prisoners’ Dilemma example, the cooperative outcome, which cannot be achieved in the stage game or in any finitely repeated game, can be sustained if the stage game is played forever. The condition is that the discount factor is sufficiently large (that is, the players are sufficiently patient).
The Folk Theorem states that cooperative outcomes that are not equilibria of a static game can be sustained as equilibria of the repeated game.
Feasible Payoff
The payoffs $(x_1, \dots, x_n)$ are feasible in the stage game $G$ if they are a convex combination (i.e., a weighted average, where the weights are all nonnegative and sum to one) of the pure-strategy payoffs of $G$.
In the Prisoners’ Dilemma example, the payoff pairs of all four pure-strategy outcomes are feasible. Other payoff pairs are feasible as well, for example those that result when each player $i = 1, 2$ adopts a mixed strategy. All feasible payoffs are depicted in the shaded region of the following figure.
[Figure: Feasible Payoff]
Average Payoff
Given the discount factor $\delta$, the average payoff of the infinite sequence of payoffs $\pi_1, \pi_2, \pi_3, \dots$ is
$$(1 - \delta) \sum_{t=1}^{\infty} \delta^{t-1} \pi_t .$$
Both present value and average payoff can represent a player’s payoff in an infinitely repeated game. The average payoff is directly comparable to the payoffs from the stage game.
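A short sketch of why the average payoff is comparable to stage payoffs: multiplying the present value by $(1 - \delta)$ turns a constant stream of $\pi$ back into $\pi$. The function names and sample numbers are illustrative assumptions, and the infinite sum is approximated by a long finite stream.

```python
def present_value(payoffs, delta):
    """Discounted sum with the first payoff undiscounted."""
    return sum(delta ** t * p for t, p in enumerate(payoffs))

def average_payoff(payoffs, delta):
    """Average payoff: (1 - delta) times the present value."""
    return (1 - delta) * present_value(payoffs, delta)

delta = 0.9
constant_stream = [4] * 2000               # approximates an infinite stream of 4
alternating = [5, 0] * 1000                # 5, 0, 5, 0, ...

avg_constant = average_payoff(constant_stream, delta)
avg_alternating = average_payoff(alternating, delta)
print(avg_constant, avg_alternating)
```

For the alternating stream the closed form is $5 / (1 + \delta)$, which lies between the two stage payoffs and gives more weight to the earlier one.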
Friedman Theorem
Let $G$ be a finite, static game of complete information. Let $(e_1, \dots, e_n)$ denote the payoffs from a Nash equilibrium of $G$, and let $(x_1, \dots, x_n)$ denote any feasible payoffs from $G$, where $x_i > e_i$ for each player $i$. If the discount factor $\delta$ is sufficiently close to one, then there exists a subgame-perfect Nash equilibrium in the infinitely repeated game $G(\infty, \delta)$ that achieves $(x_1, \dots, x_n)$ as the average payoff. The Friedman theorem is part of the Folk theorem.
[Figure: Friedman Theorem]
