Minimal Belief Revision leads to Backward Induction
Andrés Perea
Maastricht University
This Version: August 2004
Abstract

In this paper we present a model for games with perfect information in which the players, upon observing an unexpected move, may revise their beliefs about the opponents' preferences over outcomes. For a given profile P of preference relations over outcomes, we impose the following three principles: (1) players initially believe that opponents have preference relations as specified by P; (2) players believe at every instance of the game that each opponent is carrying out an optimal strategy; and (3) beliefs about the opponents' preference relations over outcomes should be revised in a minimal way. It is shown that every player whose preference relation is given by P, and who throughout the game respects common belief in the events (1), (2) and (3), has a unique optimal strategy, namely his backward induction strategy in the game induced by P. We finally show that replacing the minimal belief revision principle (3) by the more modest requirement of Bayesian updating leads exactly to the Dekel-Fudenberg procedure in the game induced by P.

Keywords: Belief revision, minimal belief change, backward induction, dynamic games.
Journal of Economic Literature Classification: C72

The author wishes to thank Geir Asheim, Giacomo Bonanno, Dov Samet, Robert Sugden and Shmuel Zamir for their helpful comments. Department of Quantitative Economics, Maastricht University, P.O. Box 616, 6200 MD Maastricht, The Netherlands. E-mail: a.perea@ke.unimaas.nl, Tel: +31-43-3883922, Fax: +31-43-3884874. Web: www.personeel.unimaas.nl/a.perea

1. Introduction

In this paper we are concerned with the problem of how to model rationality in dynamic games. In a purely static setting, rational choice can be formalized by the requirement that players hold beliefs about the opponents' strategy choices, and choose strategies that are optimal against these beliefs. In a dynamic game, however, it may happen that a player's initial belief about the opponents' strategy choices will be contradicted by the opponents' real behavior later on in the game. At this instance, the player must revise his belief about the opponents so as to explain the
[Figure 1: a game tree with perfect information. Player 1 first chooses between a (leading to outcome A) and b; after b, player 2 chooses between c (leading to outcome B) and d; after d, player 1 chooses between e (leading to outcome C) and f (leading to outcome D).]
observed behavior. The two basic questions that we shall focus on are: How should the player revise his beliefs? and What consequences does this have for the player's own behavior? To illustrate the problem of belief revision, consider the game tree depicted in Figure 1. The symbols A, B, C and D denote the different outcomes, or terminal nodes, that can be reached at the end. Suppose that player 2 holds preference relation DBCA over these outcomes, meaning that he strictly prefers D over B, strictly prefers B over C, and strictly prefers C over A. Assume, moreover, that (1) player 2 initially believes that player 1 has preference relation CADB over outcomes, and (2) that player 2 initially believes that player 1 initially believes that player 2 will choose c. If player 2 believes, moreover, that player 1 chooses optimally given this initial belief and preference relation, he must believe at the beginning of the game that player 1 chooses a. Suppose now that player 2 observes that player 1 has chosen action b. In this case, he must conclude that his initial belief about player 1 was wrong, and therefore needs to be replaced by a new belief that explains the event of player 1 choosing b. A possible explanation could be that player 2's initial belief about player 1's preference relation and initial belief were correct, but that player 1 has mistakenly chosen b. Although mistakes can never be ruled out in human decision making, we adopt as a guiding principle for our approach that players, at every possible instance of the game, believe that each of the opponents is carrying out an optimal strategy. That is, if a player observes an opponent's move that contradicts his current belief about the opponent, then he deems the event that the opponent has acted rationally more plausible than the event that the opponent has made a mistake. We shall refer to this principle as belief in sequential rationality.
As a consequence of this principle, player 2, upon observing b, must either revise his belief about player 1's preference relation, or revise his belief about player 1's initial belief about player 2's strategy choice, since otherwise the move b cannot be rationalized. A problem that arises here is that player 2 may choose between various belief revision procedures that rationalize the move b, and these different belief revision procedures may lead to different choices for player 2. For instance, player 2 may explain the move b by the new theory that player 1 still has preference
relation CADB, but that player 1 initially believes that player 2 will choose d (and not c, as player 2 believed initially). In this case, player 2 believes upon observing b that player 1 would choose e at his final decision node, and hence player 2 will choose c when adopting this belief revision procedure. Another possibility for player 2 would be to believe, upon observing b, that player 1 has preference relation DCBA (instead of CADB, as player 2 believed initially), without revising his belief about player 1's initial belief about player 2's strategy choice. Accordingly, player 2 believes that player 1 would choose f at his final decision node, and hence player 2 will choose d when using this second belief revision procedure. This phenomenon raises the question whether the various belief revision procedures that player 2 may adopt can be classified according to some natural criterion. A generally accepted principle in belief revision theory is that belief changes should be as small as possible, while being able to explain the newly observed information (see Schulte (2002) for an excellent discussion of the idea of minimal belief revision, and an overview of the various formalizations thereof in belief revision theory). The intuition behind this principle is that the current beliefs of a decision maker reflect, in some sense, the "best possible theory" that he can produce about the state of affairs given his current information. If these beliefs are contradicted by new observations, the decision maker therefore attempts to explain these new events by disturbing his previous beliefs as little as possible. 
When applying the minimal belief revision principle to our example, it may seem, at first glance, that both belief revisions above are "equally" distant from the initial belief, as they both require one belief change: in the first belief revision, player 2 changes his belief about player 1's initial belief about player 2's choice, while leaving his belief about player 1's preference relation invariant, whereas in the second revision player 2 changes his belief about player 1's preference relation, while maintaining his belief about player 1's initial belief about player 2's choice. The problem with this argument is that the aforementioned description of player 2's beliefs only reveals a small part of the complete beliefs: player 2 does not only hold a belief about player 1's preference relation and about player 1's belief about player 2's strategy choice, but also about player 1's belief about player 2's preference relation, and about player 1's belief about player 2's belief about player 1's strategy choice, and so forth. Consequently, if player 2, in the first belief revision, changes his belief about player 1's initial belief about player 2's strategy choice from c to d, then player 2 must rationalize this belief change by changing, in addition, (1) his belief about player 1's initial belief about player 2's preference relation, or (2) his belief about player 1's initial belief about player 2's belief about player 1's choice. Namely, if player 2 initially believes that player 1 initially believes that player 2 chooses c, and player 2 moreover believes that player 1 believes in sequential rationality, as we have imposed above, then player 2 must initially believe that (1) player 1's initial belief about player 2's preference relation, together with (2) player 1's initial belief about player 2's belief about player 1's strategy choice, are such that player 1 deems c optimal for player 2. 
Therefore, if player 2, in the first belief revision, changes his theory by believing that player 1 initially believes that player 2 chooses d, then this must be justified by adapting the belief about player 1's initial belief about player 2's preference
relation and/or about player 2's belief about player 1's strategy choice. The bottom line is thus that the first belief revision requires, apart from the change in belief about player 1's initial belief about player 2's strategy choice, at least one more belief change. On the other hand, in the second belief revision player 2 changes his belief about player 1's preference relation, and this belief change alone is already sufficient to explain the move b by player 1. In other words, it is not necessary to complement this belief change about player 1's preference relation by an additional belief change concerning player 1. We may therefore conclude that the first belief revision requires more belief changes than the second, and the first belief revision should thus be discarded on the basis of minimal belief revision. Since it is always possible to explain every unexpected move by an opponent through a revision about the opponent's preference relation alone, the argument above implies that the principle of minimal belief revision always leads to belief revisions that concern only the opponent's preference relation, and not the opponent's belief about the other players' strategy choices. The reason, as we have seen above, is that revisions about opponent i's belief about player j's strategy choice must always be rationalized by a belief change about player i's belief about player j's preference relation or belief. The first minimal belief revision principle we adopt states therefore that players, when revising their beliefs about the opponents upon observing an unexpected move, should only revise their beliefs about the opponents' preference relations, and leave the other components of their beliefs unaltered. An important implicit assumption we make when applying this first minimal belief revision principle is that all beliefs of any order are viewed as "equally important".
That is, the belief that player i has about player j's strategy choice is considered "as important" as player i's belief about player j's belief about the other players' strategy choices. This assumption seems natural once we impose common belief in sequential rationality, as we shall do in our model, since in this case player i's belief about player j's belief about his opponents' strategies serves as a justification for player i's belief about player j's strategy choice. Common belief in sequential rationality implies, namely, that player i should believe that player j's strategy choice is optimal given player i's belief about player j's preference relation, and given player i's belief about player j's conditional beliefs about the opponents' strategy choices. Hence, player i's second order beliefs justify player i's first order beliefs about the opponents' strategy choices, and therefore both beliefs may be viewed as "equally important". Similarly, common belief in sequential rationality implies that player i's k-th order beliefs justify his (k-1)-th order beliefs about the opponents' strategy choices for any k. For this reason, we assume that beliefs of all possible orders are viewed as "equally important" in our model, thereby justifying the first minimal belief revision principle formulated above.

The question remains whether the second belief revision described above, in which player 2 changes his belief about player 1's preference relation from CADB to DCBA, may be regarded as a minimal belief revision. To answer this question, compare this belief revision with a third belief revision, defined as follows: upon observing move b, player 2 believes that player 1's preference relation is CDBA, while leaving his other beliefs about player 1 invariant. Also this belief revision procedure rationalizes the move b by player 1, and according to this new belief, player 2 expects player 1 to choose e at his final decision node, which leads to choice c for player 2 (and not d, as with the second belief revision). We argue that the third belief revision is smaller than the second, as the revised belief CDBA about player 1's preference relation is "closer" to the initial belief CADB than DCBA. To formalize what we mean by "closer", we measure the distance between two preference relations by counting the pairs of outcomes on which the two relations induce different pairwise rankings. For instance, CADB and DCBA induce different rankings on {A,B}, {A,D} and {C,D}, and therefore the distance between both is 3. On the other hand, CADB and CDBA disagree solely on {A,B} and {A,D}, meaning that the distance is only 2. Consequently, the second belief revision represents a larger belief change than the third. According to the minimal
belief revision principle, the second belief revision should thus be discarded. As a second minimal belief revision principle we therefore require that players, when revising their belief about an opponent's preference relation, should make sure that their new belief is as close as possible to the previous belief, given the distance measure formalized above, provided that the new belief should rationalize the newly observed move(s) by this opponent. Note that in the distance measure mentioned above, player i attaches equal weight to each pairwise ranking that player j could possibly have over outcomes. As such, it is implicitly assumed that player i is equally certain (or uncertain, if you wish) about player j's various pairwise rankings of outcomes. Of course, there are many practical examples that violate this condition, as some pairwise rankings seem intuitively less ambiguous than others, and hence belief revisions about such "less ambiguous" pairwise rankings should have a larger weight than belief revisions about "more ambiguous" pairwise rankings. The distance measure used above also assumes that there is no "correlation" between the various outcomes of the game. More precisely, it is assumed that a belief revision about an opponent's pairwise ranking of two outcomes A and B should not be a reason per se to change your belief about the opponent's ranking of two other outcomes C and D. This condition may be violated in practical examples in which, intuitively, some outcomes are similar to each other. Assume, for instance, that in the example of Figure 1 it were the case that outcome A is similar to outcome C, and outcome B is similar to outcome D. As above, suppose that player 2 initially believes that player 1 has preference relation CADB.
Then, player 2's second belief revision in which, upon observing b, he believes that player 1 has preference relation DCBA should now be regarded as a smaller belief change than player 2's third belief revision, in which he believes, upon observing b, that player 1 has preference relation CDBA. The reason is that the third belief revision contradicts the similarities of the outcomes: if player 2 believes that player 1 prefers B over A, he should also believe that player 1 prefers D over C. However, for the remainder of this paper we shall assume that players are equally certain about each of the opponents' pairwise rankings, and that there is no correlation between outcomes, and hence the distance measure introduced above makes intuitive sense. The obvious question is now whether the third belief revision, in which player 2 changes his
belief about player 1's preference relation from CADB to CDBA, is a minimal belief change. The answer must be "yes". Recall that player 2 initially believes that player 1 initially believes that player 2 chooses c. Hence, if player 2 changes his belief about player 1's preference relation upon observing move b, he must make sure that the new belief ranks outcome B over outcome A. However, it can be seen easily that this requires at least a distance of 2 with respect to the initial belief CADB, and therefore the third belief revision has minimal distance. Although there are several minimal belief revisions for player 2 in this example, it may be verified that every minimal belief revision has the property that player 2, upon observing move b, still believes that player 1 prefers outcome C over outcome D (as player 2 believed initially). The intuition is that, in order to explain the unexpected move b by player 1, it is not necessary to change the belief about player 1's relative ranking of C and D, and hence, by minimal belief revision, player 2 should not do so. But if this is true, minimal belief revision always leads player 2 to believe, upon observing b, that player 1 would choose e at his final decision node, and hence player 2 will always choose c when acting in accordance with minimal belief revision. Note that strategy c is exactly the backward induction strategy for player 2 in the game where the players' preferences over outcomes are given by P1 = CADB and P2 = DBCA, respectively. By putting P = (P1,P2), we have thus derived the following result for this example: If player 2 has preference relation P2, initially believes that player 1 has preference relation P1, believes in sequential rationality and satisfies minimal belief revision, then there is a unique optimal strategy for player 2, namely his backward induction strategy in the game induced by P.
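The counting argument above is easy to mechanize. The following sketch is our own illustration (anticipating the formal distance of Definition 3.2): preference relations are written as strings ordered from best to worst, and we enumerate the revisions of CADB that rank B above A, confirming that the minimal ones all keep C above D.

```python
from itertools import permutations

def distance(p1, p2):
    """Number of unordered outcome pairs on which two strict preference
    relations disagree (relations written best-to-worst, e.g. "CADB")."""
    r1 = {z: k for k, z in enumerate(p1)}
    r2 = {z: k for k, z in enumerate(p2)}
    pairs = [(a, b) for k, a in enumerate(p1) for b in p1[k + 1:]]
    return sum((r1[a] < r1[b]) != (r2[a] < r2[b]) for a, b in pairs)

print(distance("CADB", "DCBA"))   # 3
print(distance("CADB", "CDBA"))   # 2

# Revisions that rationalize move b must rank B above A.
candidates = ["".join(p) for p in permutations("ABCD")
              if p.index("B") < p.index("A")]
dmin = min(distance("CADB", p) for p in candidates)
minimal = sorted(p for p in candidates if distance("CADB", p) == dmin)
print(dmin, minimal)              # 2 ['CBAD', 'CDBA']
# Every minimal revision still ranks C above D:
print(all(p.index("C") < p.index("D") for p in minimal))   # True
```

Under either minimal revision (CDBA or CBAD), player 1 prefers C over D and would thus choose e at his final node, so player 2's unique optimal choice remains c.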
Our main theorem in this paper (Theorem 5.2) shows that a similar result is true for general games with perfect information. Consider a profile P = (Pi)i∈I of strict preference relations over outcomes, where Pi belongs to player i. If player i holds preference relation Pi, and respects common belief in the events that (1) players initially believe that their opponents have preference relations as given by P, (2) players believe in sequential rationality, and (3) players satisfy minimal belief revision, then player i has a unique optimal strategy, namely his backward induction strategy in the game induced by P. Here, we say that player i respects common belief in the event that players have a certain property if player i has this property, player i believes throughout the game that all players have this property, player i believes throughout the game that other players believe throughout the game that all players have this property, and so on. The concepts of (common belief in) belief in sequential rationality and minimal belief revision may thus be viewed as a possible foundation for backward induction, which constitutes one of the oldest ideas in game theory. The main difference with other foundations for backward induction, such as Aumann (1995), Samet (1996), Balkenborg and Winter (1997), Stalnaker (1998), Asheim (2002) and Asheim and Perea (2004), is that in our model, players are assumed to interpret every unexpected move by an opponent as a rational move, whereas this is not the case in the latter foundations. Moreover, in our model players are allowed to revise their beliefs about the opponents' preference relations over outcomes in order to rationalize such unexpected moves, while the aforementioned foundations do not model this possibility, at least not explicitly.
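As a concrete check of this result in the introductory example, backward induction in the game induced by P1 = CADB and P2 = DBCA can be computed as follows. The tree encoding and function names are our own illustration, not the paper's.

```python
# Tree encoding (ours): (player, node_id, {action: subtree}); leaves are outcomes.
FIGURE_1 = (1, "h1", {"a": "A",
                      "b": (2, "h2", {"c": "B",
                                      "d": (1, "h3", {"e": "C", "f": "D"})})})

def backward_induction(tree, prefs):
    """Return (outcome, plan): the backward induction outcome and a map from
    each decision node to the chosen action, given strict preferences written
    best-to-worst (e.g. prefs[1] = "CADB")."""
    if isinstance(tree, str):                    # terminal node
        return tree, {}
    player, node, moves = tree
    rank = {z: k for k, z in enumerate(prefs[player])}   # lower = better
    plan, best = {}, None
    for action, subtree in moves.items():
        outcome, subplan = backward_induction(subtree, prefs)
        plan.update(subplan)
        if best is None or rank[outcome] < rank[best[1]]:
            best = (action, outcome)
    plan[node] = best[0]
    return best[1], plan

outcome, plan = backward_induction(FIGURE_1, {1: "CADB", 2: "DBCA"})
print(outcome, plan)   # A {'h3': 'e', 'h2': 'c', 'h1': 'a'}
```

Player 2's backward induction action at his node is indeed c, matching the conclusion of the example.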
Other foundations for backward induction that do allow players to revise their beliefs about the opponents' utilities during the game can be found in Perea (2003a, 2003b). The main difference with our approach here is that the latter two foundations use proper belief revision, rather than minimal belief revision, as a criterion to restrict the possible belief revision procedures. Proper belief revision states that whenever player i at decision node hi revises his belief about player j, then he must not change his belief about player j's relative ranking of two strategies sj and s′j, if both sj and s′j could have led to hi. The intuition is that such belief changes would be "unnecessary" in order to explain the event that hi has been reached. In Section 4.2 we establish a formal relationship between minimal belief revision and proper belief revision, which proves to be important for deriving the announced theorem on backward induction. The outline of this paper is as follows. In Section 2 we develop an epistemic model for games with perfect information that allows us to formalize statements such as "player i believes at decision node hi that player j has preference relation Pj", or "player i believes at decision node hi that player j believes at decision node hj that player k chooses strategy sk", and so on. In this model, the relevant characteristics of a player are represented by a so-called type, defining a preference relation over outcomes and prescribing at every decision node a conditional belief about the opponents' strategy choices and types. Since types hold conditional beliefs about the opponents' types, they therefore also hold conditional beliefs about the opponents' preference relations, and about the opponents' conditional beliefs about the other players' strategy choices, and so forth. We then use this model to define the notion of common belief.
In Section 3 we formalize what it means that a type "believes in sequential rationality", "satisfies minimal belief revision" and "initially believes in some profile P of preference relations". In Section 4 we derive some properties of minimal belief revision and belief in sequential rationality that are important for establishing our theorem on backward induction. In Section 5 we first show that for every profile P = (Pi)i∈I of preference relations, and every player i, there is at least one type for player i that respects common belief in the events that types (1) believe in sequential rationality, (2) satisfy minimal belief revision, and (3) initially believe that types hold preference relations as specified by P. We thereby guarantee that common belief in these three events is always possible. We then show that every player i type that holds preference relation Pi, and respects common belief in the three events above, has a unique optimal strategy, namely his backward induction strategy in the game induced by P. In Section 6, finally, we explore the consequences of replacing the minimal belief revision principle by the more modest requirement of Bayesian updating. It is shown that the resulting concept allows for any strategy that survives the Dekel-Fudenberg procedure in the game induced by P. Here, by the Dekel-Fudenberg procedure we mean one round of elimination of weakly dominated strategies, followed by iterative elimination of strongly dominated strategies in the game induced by P. Hence, common belief in minimal belief revision may be seen as a property that closes the gap between the Dekel-Fudenberg procedure and the concept of backward induction.
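To make this benchmark concrete, here is a sketch of the Dekel-Fudenberg procedure applied to the strategic form of the Figure 1 game. For simplicity we check dominance by pure strategies only (the procedure proper also allows dominance by mixed strategies); all encodings and ordinal utilities are our own illustration.

```python
def dominated(s, own, opp, u, strict):
    """Pure-strategy dominance check: is s (strictly or weakly) dominated
    within `own`, against opponent strategies `opp`, under payoffs u(s, t)?"""
    for s2 in own:
        if s2 == s:
            continue
        diffs = [u(s2, t) - u(s, t) for t in opp]
        if strict:
            if all(d > 0 for d in diffs):
                return True
        elif all(d >= 0 for d in diffs) and any(d > 0 for d in diffs):
            return True
    return False

def dekel_fudenberg(S1, S2, u1, u2):
    # Round 1: delete weakly dominated strategies, both players at once.
    T1 = [s for s in S1 if not dominated(s, S1, S2, u1, strict=False)]
    T2 = [t for t in S2 if not dominated(t, S2, S1, lambda t2, s: u2(s, t2), strict=False)]
    # Then: iterated deletion of strictly dominated strategies.
    while True:
        U1 = [s for s in T1 if not dominated(s, T1, T2, u1, strict=True)]
        U2 = [t for t in T2 if not dominated(t, T2, T1, lambda t2, s: u2(s, t2), strict=True)]
        if (U1, U2) == (T1, T2):
            return T1, T2
        T1, T2 = U1, U2

# Strategic form of the Figure 1 game; player 1's plans of action are
# a, (b,e) and (b,f), player 2's are c and d.
out = {("a", "c"): "A", ("a", "d"): "A",
       ("be", "c"): "B", ("be", "d"): "C",
       ("bf", "c"): "B", ("bf", "d"): "D"}
r1 = {"C": 4, "A": 3, "D": 2, "B": 1}   # player 1: CADB (4 = best)
r2 = {"D": 4, "B": 3, "C": 2, "A": 1}   # player 2: DBCA
u1 = lambda s, t: r1[out[(s, t)]]
u2 = lambda s, t: r2[out[(s, t)]]

S1, S2 = dekel_fudenberg(["a", "be", "bf"], ["c", "d"], u1, u2)
print(S1, S2)   # ['a', 'be'] ['c', 'd']
```

In this example the Dekel-Fudenberg procedure leaves {a, (b,e)} for player 1 and both c and d for player 2, strictly more than the backward induction strategies (a and c), which illustrates the gap that minimal belief revision is shown to close.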
2. The Epistemic Model

2.1. Games with Perfect Information

A dynamic game is said to be with perfect information if every player, at each instance of the game, observes the opponents' moves that have been made until then. Formally, an extensive form structure S with perfect information consists of a finite game tree, a finite set I of players, for every player i a finite set Hi of decision nodes, for every decision node hi ∈ Hi a finite set A(hi) of available actions, and a finite set Z of terminal nodes. Perfect information is modeled by the assumption that each decision node by itself constitutes an information set. By A we denote the set of all actions, whereas H denotes the collection of all decision nodes. We assume that no chance moves occur.

The definition of a strategy we shall employ coincides with the concept of a plan of action, as discussed in Rubinstein (1991). The difference with the usual definition is that we require a strategy only to prescribe an action at those decision nodes that the same strategy does not avoid. Formally, let H̃i ⊆ Hi be a collection of player i decision nodes, not necessarily containing all decision nodes, and let s̃i : H̃i → A be a mapping prescribing at every hi ∈ H̃i some available action s̃i(hi) ∈ A(hi). For a given decision node h ∈ H, we say that s̃i avoids h if there is some hi ∈ H̃i on the path to h at which the prescribed action s̃i(hi) deviates from the path to h. Such a mapping s̃i : H̃i → A is called a strategy for player i if H̃i is exactly the collection of player i decision nodes not avoided by s̃i. Obviously, every strategy si can be obtained by first prescribing an action at all player i decision nodes, that is, constructing a strategy in the classical sense, and then deleting those player i decision nodes that are avoided by it. For a given strategy si ∈ Si, we denote by Hi(si) the collection of player i decision nodes that are not avoided by si. Let Si denote the set of player i strategies. For a given decision node h ∈ H and player i, we denote by Si(h) the set of player i strategies that do not avoid h. Then, it is clear that a profile (si)i∈I of strategies reaches a decision node h if and only if si ∈ Si(h) for all players i.

2.2. Types
We shall now formally model the players in the extensive form structure S as decision makers under uncertainty. Our primary assumption is that every player i holds a strict, complete and transitive preference relation Pi over the set of terminal nodes, and holds at the beginning of the game, as well as at every decision node hi ∈ Hi, a conditional belief about the opponents' strategy choices. Throughout this paper, whenever we write "preference relation over terminal nodes", we always assume that it is strict, complete and transitive. In order to keep the model as simple as possible, we assume that the conditional beliefs about the opponents' strategy choices assign at each instance of the game probability one to a single strategy choice for each of the opponents, that is, we restrict ourselves to point-beliefs.¹ On top of this we assume that every player, throughout the game, holds a conditional point-belief about the opponents' preference relations over the terminal nodes, and about the opponents' conditional beliefs about the other players' strategy choices. Moreover, each player also holds, at every instance, a conditional point-belief about the opponents' conditional beliefs concerning the other players' preferences and concerning the other players' conditional beliefs about their opponents' strategy choices, and so on. Repeating this argument inevitably leads to infinite hierarchies of conditional beliefs. Similarly to Ben-Porath (1997), Battigalli and Siniscalchi (1999) and Perea (2004), we model such hierarchies of conditional beliefs by means of epistemic types. Let h0 be the decision node that marks the beginning of the game, and let H̄i = Hi ∪ {h0}. By applying techniques from Battigalli and Siniscalchi (1999) and Perea (2004), one can construct type spaces Ti for every player i such that every type ti ∈ Ti can be identified with a vector

(Pi(ti), (sj(ti,hi), tj(ti,hi))hi∈H̄i, j≠i),   (2.1)

where Pi(ti) is a preference relation on the set of terminal nodes, sj(ti,hi) is a strategy in Sj(hi) and tj(ti,hi) is a type in Tj. The interpretation is that ti holds preference relation Pi(ti), and believes at every decision node hi that player j chooses the strategy sj(ti,hi) and is of type tj(ti,hi). Since such tj, in turn, holds a preference relation over the terminal nodes and a conditional belief about the other players' strategy choices, every type ti holds at every instance a conditional belief about player j's preference relation and about player j's conditional beliefs about the other players' strategy choices. In a similar fashion, one may derive from (2.1) conditional beliefs about conditional beliefs about ... about conditional beliefs, of arbitrary length. In the sequel of this paper, we often write s-i(ti,hi) = (sj(ti,hi))j≠i to denote ti's conditional belief at hi about the opponents' strategy choices, and denote by t-i(ti,hi) = (tj(ti,hi))j≠i the conditional belief of ti at hi about the opponents' types.

¹ This assumption may be justified by the following property of games with perfect information: if a strategy si is optimal for player i given a probabilistic belief µi over the opponents' strategies, then there is some single strategy profile within the support of µi against which si is optimal. (Ben-Porath (1997) shows this fact in his proof of Lemma 1.2.1.) Hence, every strategy choice in a game with perfect information that is justified by a probabilistic belief can also be justified by a point-belief.

2.3. Common Belief

By T = ∪i∈I Ti we denote the collection of all types for all players. Let E ⊆ T be some subset of types, and let ti be a specific type for player i. We say that ti believes E if tj(ti,hi) ∈ E for every opponent j and every hi ∈ H̄i. In words, ti believes at every instance of the game that the opponents' types belong to E. We recursively define

B1(E) = {t ∈ E | t believes E} and Bk(E) = {t ∈ Bk-1(E) | t believes Bk-1(E)}

for all k ≥ 2. Let B(E) = ∩k∈N Bk(E). We say that ti respects common belief in E if ti ∈ B(E). Hence, ti belongs to E, believes at every instance that all opponents' types belong to E, believes at every instance that all opponents' types believe at every instance that all other players' types belong to E, and so on.

3. Belief in Sequential Rationality and Minimal Belief Revision

In this section we formalize the following three conditions: (1) a type should believe, throughout the game, that his opponents choose optimal strategies, (2) a type should revise his belief about an opponent's characteristics in a minimal way, and (3) a type should initially believe that the opponents' preference relations are given by some profile P = (Pi)i∈I. We shall refer to these conditions as belief in sequential rationality, minimal belief revision and initial belief in P, respectively.

3.1. Belief in Sequential Rationality

Let (si,ti) be a pair consisting of a strategy and a type for player i. Recall that Hi(si) is the set of player i decision nodes that are not avoided by si. For a given hi ∈ Hi(si), let z(si,ti,hi) denote the terminal node that is reached if the game would start at hi, player i would choose according to si, and the opponents would choose according to the conditional belief s-i(ti,hi) that ti holds at hi about the opponents' strategy choices. We say that si is sequentially rational for ti if for every decision node hi ∈ Hi(si) there is no strategy s′i ∈ Si(hi) such that ti strictly prefers the terminal node z(s′i,ti,hi) over the terminal node z(si,ti,hi) with respect to his preference relation Pi(ti).

Definition 3.1. We say that type ti believes in sequential rationality if at every hi ∈ H̄i, and for every opponent j, the conditional belief (sj(ti,hi), tj(ti,hi)) about player j's strategy-type pair is such that sj(ti,hi) is sequentially rational for tj(ti,hi).

At this stage it is important to note that not every type has a sequentially rational strategy. Consider, for instance, the extensive form structure in Figure 2. Take a type t1 for player 1 with the preference relation ECDBA over the terminal nodes. Let player 1's decision nodes be denoted by h11 and h21, respectively. Suppose that t1 believes at h11 that player 2 chooses the strategy (d,g), but believes at h21 that player 2 chooses (d,h). The unique strategy that is optimal for t1 at h11 is (b,e). However, (b,e) is not optimal for t1 at h21, which implies that t1 has no sequentially rational strategy. In Section 4 we shall prove that minimal belief revision and belief in sequential rationality are sufficient to imply that a type has a sequentially rational strategy. Note also that a type ti can have at most one sequentially rational strategy.
[Figure 2 appears here: a perfect information game tree in which players 1 and 2 move alternately, with actions a, b, c, d, e, f, g, h and terminal nodes A, B, C, D, E.]

Figure 2
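The failure of sequential rationality in this example can be checked mechanically. Below is a minimal sketch in which we encode our reading of the Figure 2 tree as a centipede-like game; the encoding, the strategy representation and all function names are ours, not the paper's:

```python
# Our reading of Figure 2: player 1 chooses a (ending at A) or b; player 2
# chooses c (ending at B) or d; player 1, at h21, chooses e (ending at C)
# or f; player 2 then chooses g (ending at D) or h (ending at E).

def outcome(s1, s2):
    """Terminal node reached from the root under strategies s1, s2."""
    if s1[0] == 'a': return 'A'
    if s2[0] == 'c': return 'B'
    if s1[1] == 'e': return 'C'
    return 'D' if s2[1] == 'g' else 'E'

def outcome_from_h21(s1, s2):
    """Terminal node reached if the game were to start at h21."""
    if s1[1] == 'e': return 'C'
    return 'D' if s2[1] == 'g' else 'E'

def optimal(strategies, reach, belief, rank):
    """Strategies in `strategies` that are optimal against `belief`."""
    best = min(rank[reach(s, belief)] for s in strategies)
    return [s for s in strategies if rank[reach(s, belief)] == best]

# Type t1's preference relation E C D B A (most preferred first).
rank1 = {z: i for i, z in enumerate('ECDBA')}

S1 = [('a', '-'), ('b', 'e'), ('b', 'f')]   # player 1's strategies
S1_h21 = [s for s in S1 if s[0] == 'b']     # those not avoiding h21

# At h11, t1 believes player 2 plays (d,g): only (b,e) is optimal.
print(optimal(S1, outcome, ('d', 'g'), rank1))               # [('b', 'e')]
# At h21, t1 believes player 2 plays (d,h): (b,e) is no longer optimal.
print(optimal(S1_h21, outcome_from_h21, ('d', 'h'), rank1))  # [('b', 'f')]
```

The first check confirms that (b,e) is t1's unique optimal strategy at h11; the second shows that (b,e) is not optimal at h21, so t1 indeed has no sequentially rational strategy.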
3.2. Minimal Belief Revision

Suppose now that a type ti observes that decision node hi ∈ Hi has been reached, but cannot rationalize this event by means of his previous beliefs about player j. In this case, type ti may be led to revise his belief about (1) player j's preference relation, or (2) player j's conditional beliefs about the other players' strategy choices, or both. As we have argued already in the introduction, a belief revision about player j's conditional beliefs about the other players' strategy choices must always be justified by an additional belief revision about player j's conditional beliefs about the other players' preferences and/or conditional beliefs. On the other hand, a belief revision about player j's preference relation need not be rationalized by an additional belief change. For this reason, the principle of minimal belief revision requires players to explain unexpected moves solely by belief revisions about the opponents' preference relations. Formally, let h1i and h2i be two decision nodes for player i such that h2i follows h1i, and let tj(ti,h1i) and tj(ti,h2i) be ti's conditional beliefs at h1i and h2i about player j's type. For a player j type tj and preference relation Pj, we denote by (tj,Pj) the type that has preference relation Pj and holds the same conditional beliefs about the opponents' strategy-type pairs as tj. Minimal belief revision requires that tj(ti,h1i) and tj(ti,h2i) differ only by their preference relation, that is, tj(ti,h2i) = (tj(ti,h1i),Pj) for some preference relation Pj. This requirement is formalized as condition (1) in the definition of minimal belief revision below. The additional requirement we impose is that player i's belief revision about player j's preferences must be as small as possible.
More precisely, we shall introduce a distance measure between preference relations, and require that ti's new belief about player j's preference relation should be as close as possible to his old belief, given that the new belief should rationalize the event of reaching the decision node hi.

Definition 3.2. Let P1 and P2 be two preference relations on the set of terminal nodes Z. We define the distance d(P1,P2) as the number of unordered pairs {z1,z2} in Z at which P1 and P2 disagree.

Here, we say that P1 and P2 disagree at {z1,z2} if P1 ranks z1 strictly above z2 but P2 does not, or P1 ranks z1 strictly below z2 but P2 does not. The distance measure thus defined coincides with the measure adopted in Ha and Haddawy (1998), and may be interpreted as a Hamming distance between preference relations, when the latter are interpreted as collections of pairwise rankings.

For a decision node hi ∈ Hi and type tj ∈ Tj, we say that tj rationalizes the event of reaching hi if tj has a sequentially rational strategy belonging to Sj(hi). The conditions (2) and (3) in the definition of minimal belief revision below state that type ti, upon reaching decision node hi, should change his belief about player j's preference relation in a minimal way, provided that his new belief about player j's type rationalizes the event of reaching hi.

We are now ready to define minimal belief revision. Let ti be a type and let h1i,h2i ∈ Hi be such that h2i follows h1i and no other hi ∈ Hi lies between h1i and h2i.

Definition 3.3. We say that ti satisfies minimal belief revision at h2i if for every opponent j there is some preference relation P2j such that (1) tj(ti,h2i) = (tj(ti,h1i),P2j), (2) tj(ti,h2i) rationalizes the event of reaching h2i, and (3) there is no other preference relation P~j such that (tj(ti,h1i),P~j) rationalizes the event of reaching h2i, and d(Pj(tj(ti,h1i)),P~j) < d(Pj(tj(ti,h1i)),P2j).

We finally say that ti satisfies minimal belief revision if it does so at every decision node h2i.

3.3. Initial Belief in P
Let P = (Pi)i∈I be some profile of preference relations.
Definition 3.4. We say that type ti initially believes in P if Pj(tj(ti,h0)) = Pj for all opponents j.

Here, tj(ti,h0) is ti's initial belief about player j's type, and Pj(tj(ti,h0)) thus reflects ti's initial belief about player j's preference relation. Note, however, that ti may change his belief about j's preference relation if the game moves from h0 to some other decision node hi.

4. Properties of Minimal Belief Revision

As a preparatory step towards our backward induction theorem, we first derive some properties of minimal belief revision that will be applied in Section 5 for showing the announced relationship with backward induction.

4.1. Existence of Sequentially Rational Strategies

In the example of Figure 2 we have seen that not every type has a sequentially rational strategy. Namely, if the type t1 believes at his first decision node h11 that player 2 chooses (d,g), but
believes at his second decision node h21 that player 2 chooses (d,h), then t1 has no sequentially rational strategy. The reason for this is that t1's conditional beliefs at h21 contradict Bayesian updating: t1's beliefs at h11 about player 2's behavior are compatible with the event of reaching h21, and therefore Bayesian updating implies that t1's beliefs at h21 should coincide with his beliefs at h11. We shall now provide a formalization of the above mentioned Bayesian updating requirement and show in Lemma 4.2 that it guarantees the existence of a sequentially rational strategy. Let h1i and h2i be two decision nodes in Hi such that h2i follows h1i, and there is no player i decision node between h1i and h2i.

Definition 4.1. We say that ti satisfies Bayesian updating at h2i if for every opponent j for which sj(ti,h1i) ∈ Sj(h2i), it holds that sj(ti,h2i) = sj(ti,h1i).

In other words, if ti's belief at h1i about player j's strategy choice does not contradict the event of reaching h2i, then ti should maintain at h2i his previous belief about player j's strategy choice. We say that ti satisfies Bayesian updating if it does so at every decision node.

Lemma 4.2. Every type that satisfies Bayesian updating has a sequentially rational strategy.

The proof of this lemma is based on Theorem 3.1 in Perea (2002). The `if' part of this theorem states that if an "updating system" satisfies "updating consistency", then every "locally sequentially rational" strategy is sequentially rational. In order to state the `if' part of this theorem more precisely, we must first formally define the terms "updating system", "updating consistency" and "locally sequentially rational strategy". To simplify matters we shall define these objects directly within our special context of games with perfect information, and restrict ourselves to "updating systems" that always assign probability one to one particular strategy choice for every opponent.
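For the point beliefs used in this paper, Bayesian updating as in Definition 4.1 reduces to a simple consistency check. A minimal sketch for the Figure 2 example follows; the encoding of player 2's strategies and all names are ours:

```python
# Player 2's strategies in our encoding of Figure 2: ('c','-'), ('d','g'),
# ('d','h'). S2(h21) -- the player 2 strategies that do not avoid player
# 1's node h21 -- are exactly those choosing d at player 2's first node.

def in_S2_h21(s2):
    return s2[0] == 'd'

def bayesian_updating_at_h21(belief_at_h11, belief_at_h21):
    """Definition 4.1 at h21: a belief about player 2's strategy held at
    h11 that is compatible with reaching h21 must be maintained at h21."""
    if in_S2_h21(belief_at_h11):
        return belief_at_h21 == belief_at_h11
    return True   # the old belief was contradicted, so no restriction

# The type t1 of the example violates Bayesian updating:
print(bayesian_updating_at_h21(('d', 'g'), ('d', 'h')))   # False
```

The check returns False precisely because the belief (d,g) held at h11 already allows h21 to be reached, yet is replaced by (d,h) at h21.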
The reason for the latter is that the conditional belief vectors in our epistemic model always assign probability one to one particular strategy choice for each opponent.

An updating system for player i is a vector ci = (ci(hi))hi∈Hi where ci(hi) ∈ S-i(hi) for every decision node hi ∈ Hi. Here, S-i(hi) = ×j≠i Sj(hi), and ci(hi) represents player i's conditional belief at hi about the opponents' strategy choices. For a given decision node hi and conditional beliefs ci(hi), c0i(hi) ∈ S-i(hi), we say that ci(hi) and c0i(hi) are equivalent at hi if for every strategy si ∈ Si(hi) it holds that the strategy profiles (si,ci(hi)) and (si,c0i(hi)) lead to the same terminal node. Hence, ci(hi) and c0i(hi) only differ at decision nodes that do not precede nor follow hi. The updating system ci is called updating consistent if for every two decision nodes h1i and h2i where h1i precedes h2i and ci(h1i) ∈ S-i(h2i), it holds that ci(h2i) and ci(h1i) are equivalent at h2i.

An extended strategy for player i is a vector s~i = (s~i(hi))hi∈Hi where s~i(hi) ∈ A(hi) for every decision node hi. The difference with a strategy as defined in this paper is thus that an extended strategy also prescribes actions at decision nodes that are avoided by it, whereas a strategy does not. An extended strategy s~i is called locally sequentially rational with respect to an updating system ci and a preference relation Pi over the terminal nodes if at every decision node hi the action s~i(hi) is optimal against the actions prescribed by ci(hi) and s~i in the subgame that follows hi. We say that a (non-extended) strategy si is locally sequentially rational with respect to ci and Pi if there is an extended strategy s~i such that s~i is locally sequentially rational with respect to ci and Pi, and s~i coincides with si at decision nodes in Hi(si). Finally, a strategy si is called sequentially rational with respect to ci and Pi if at every decision node hi ∈ Hi(si)
there is no other strategy s0i ∈ Si(hi) such that the terminal node reached by s0i and ci(hi) is strictly preferred by Pi over the terminal node reached by si and ci(hi). The `if' part of Theorem 3.1 in Perea (2002), when applied to our specific context, can now be stated as follows.

Lemma 4.3. Let ci be an updating system that is updating consistent, and let Pi be a preference relation over the terminal nodes. Then, every strategy that is locally sequentially rational with respect to ci and Pi is also sequentially rational with respect to ci and Pi.

We are now ready to prove Lemma 4.2.

Proof of Lemma 4.2. Let ti be a type with preference relation Pi that satisfies Bayesian updating. We show that ti has a sequentially rational strategy. For every decision node hi ∈ Hi define ci(hi) = s-i(ti,hi), which is an element in S-i(hi). Hence, the vector ci = (ci(hi))hi∈Hi is an updating system. Since ti satisfies Bayesian updating, it immediately follows that the updating system ci is updating consistent. By Lemma 4.3 we then know that every locally sequentially rational strategy with respect to ci and Pi is sequentially rational with respect to ci and Pi. Since it is clear that every sequentially rational strategy with respect to ci and Pi is also sequentially rational for ti, it suffices to show that there is a locally sequentially rational strategy with respect to ci and Pi. By a simple backward induction procedure, one can define for every player i decision node hi ∈ Hi some action a(hi) such that every action a(hi) is optimal with respect to Pi against (1) the actions prescribed by ci(hi) at the opponents' decision nodes following hi, and (2) his own actions a(h0i) at decision nodes h0i ∈ Hi following hi. Then, by construction of the actions a(hi), the extended strategy s~i = (a(hi))hi∈Hi is locally sequentially rational with respect to ci and Pi. Let si be the unique strategy that coincides with s~i at all decision nodes in Hi(si). Hence, si is locally sequentially rational with respect to ci and Pi. As we have seen above, this implies that si is sequentially rational with respect to ci and Pi. But then, si is sequentially rational for ti. This completes the proof of this lemma. ■

We shall now prove that minimal belief revision and belief in sequential rationality lead to Bayesian updating.

Lemma 4.4. Let ti be a type that believes in sequential rationality and satisfies minimal belief revision. Then, ti satisfies Bayesian updating.
Proof. Choose a type ti that believes in sequential rationality and satisfies minimal belief revision. Let h1i,h2i be two decision nodes in Hi such that h2i follows h1i, and no player i decision node is between h1i and h2i. Let j be an opponent for which sj(ti,h1i) belongs to Sj(h2i). As ti believes in sequential rationality, it must be the case that sj(ti,h1i) is sequentially rational for type tj(ti,h1i). The fact that sj(ti,h1i) ∈ Sj(h2i) then implies that the type tj(ti,h1i) itself already rationalizes the event of reaching h2i. By minimal belief revision, it must therefore be the case that tj(ti,h2i) = tj(ti,h1i). Since ti believes in sequential rationality, and since sj(ti,h1i) is the unique sequentially rational strategy for tj(ti,h1i), it follows that sj(ti,h2i) = sj(ti,h1i), which implies that ti satisfies Bayesian updating. This completes the proof. ■

By combining Lemma 4.2 and Lemma 4.4, we obtain the following corollary.

Corollary 4.5. Let ti be a type that believes in sequential rationality and satisfies minimal belief revision. Then, ti has a sequentially rational strategy.

4.2. Relation with Proper Belief Revision

We next prove that minimal belief revision and belief in Bayesian updating lead to proper belief revision: a concept that has been put forward in Perea (2003a, 2003b and 2004). This result will prove to be crucial for establishing the announced relationship with backward induction. Informally, proper belief revision states that a player who wishes to revise his beliefs at decision node h about opponent j's preference relation, should not change his belief about the opponent's relative ranking of two strategies sj and s0j if both sj and s0j could have led to h. The intuition is that the player, upon arriving at h, cannot exclude any of the opponent's strategies sj and s0j, and therefore there is no reason for him to change his belief about the opponent's relative ranking of sj and s0j.
In order to introduce proper belief revision formally, we need some more notation and definitions. Let ti be a type for player i, and hi ∈ Hi some decision node. For a given strategy si ∈ Si(hi), recall that z(si,ti,hi) denotes the terminal node that would be reached if the game would start at hi, player i would choose according to si, and player i's opponents would choose according to s-i(ti,hi). For two strategies si,s0i ∈ Si(hi), we say that ti strictly prefers strategy si over strategy s0i at decision node hi if ti strictly prefers the terminal node z(si,ti,hi) over the terminal node z(s0i,ti,hi). Now, let ti be a type for player i, let j ≠ i be an opponent, let hi and hj be decision nodes for players i and j, respectively, and let sj,s0j be two player j strategies in Sj(hj).

Definition 4.6. We say that ti believes at hi that player j at hj strictly prefers strategy sj over strategy s0j if type tj(ti,hi) strictly prefers sj over s0j at hj.

Now, let ti be a type for player i, and let h1i,h2i be two decision nodes in Hi such that h2i follows h1i and no other player i decision node is between h1i and h2i.
Definition 4.7. We say that ti satisfies proper belief revision at h2i if for every opponent j, every decision node hj ∈ Hj, and every two strategies sj,s0j that belong to both Sj(hj) and Sj(h2i), the following holds: ti believes at h2i that player j at hj strictly prefers sj over s0j if and only if ti believes so at h1i.
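To make Definition 4.7 concrete, consider again the Figure 2 game as we read it: player 2's strategies (d,g) and (d,h) both belong to S2(h21), and from player 2's second node they lead to D and E respectively, regardless of any beliefs. A hedged sketch (names and encoding are ours):

```python
# Illustration of proper belief revision at h21 in our encoding of the
# Figure 2 game. Since ('d','g') and ('d','h') both allow h21 to be
# reached, and end in D and E respectively from player 2's second node,
# proper belief revision forbids player 1 from reversing, between h11 and
# h21, his belief about how player 2 ranks D against E.

def believes_prefers_D_over_E(believed_p2_ranking):
    r = {z: i for i, z in enumerate(believed_p2_ranking)}
    return r['D'] < r['E']

def proper_at_h21(ranking_believed_at_h11, ranking_believed_at_h21):
    return (believes_prefers_D_over_E(ranking_believed_at_h11)
            == believes_prefers_D_over_E(ranking_believed_at_h21))

# Reversing the ranking of D and E between h11 and h21 is improper:
print(proper_at_h21(('D','B','E','C','A'), ('E','D','B','C','A')))  # False
# Revisions that keep the D-versus-E ranking intact are allowed:
print(proper_at_h21(('D','B','E','C','A'), ('D','E','B','C','A')))  # True
```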
Note that sj,s0j ∈ Sj(h2i) implies that both sj and s0j could have led to h2i. We say that type ti satisfies proper belief revision if ti does so at each of his decision nodes.

Before showing that minimal belief revision and belief in Bayesian updating imply proper belief revision, we prove the following lemma. It states that the distance between two preference relations P1 and P2 can be reduced strictly by applying the following procedure: First, take an unordered pair {a,b} of terminal nodes on which P1 and P2 disagree, and then interchange the roles of a and b in P2 without changing the roles of the other nodes.

Lemma 4.8. Let P1 and P2 be two preference relations on the set Z of terminal nodes, and let {a,b} be an unordered pair of terminal nodes on which P1 and P2 disagree. Let u2 be an arbitrary utility representation of P2, and let the utility function u~2 be given by

    u~2(z) = u2(b), if z = a,
             u2(a), if z = b,
             u2(z), otherwise.

Let P~2 be the preference relation induced by u~2. Then, d(P1,P~2) < d(P1,P2).
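The distance of Definition 3.2 and the swap of Lemma 4.8 are easy to compute directly. A small sketch (function names are ours; preference relations are encoded as tuples, most preferred first):

```python
from itertools import combinations

def distance(p1, p2):
    """d(P1,P2): the number of unordered pairs of terminal nodes on which
    the two (strict) preference relations disagree -- a Hamming distance
    over pairwise rankings."""
    r1 = {z: i for i, z in enumerate(p1)}
    r2 = {z: i for i, z in enumerate(p2)}
    return sum((r1[a] < r1[b]) != (r2[a] < r2[b])
               for a, b in combinations(p1, 2))

def swap(p, a, b):
    """Interchange the roles of terminal nodes a and b in relation p."""
    return tuple(b if z == a else a if z == b else z for z in p)

P1 = ('E', 'C', 'D', 'B', 'A')   # the relation from the Figure 2 example
P2 = ('A', 'B', 'C', 'D', 'E')   # an illustrative second relation

print(distance(P1, P2))          # 9 (the relations agree only on {C,D})

# P1 and P2 disagree on {A,E}; swapping A and E in P2 reduces the distance,
# as Lemma 4.8 asserts:
P2_swapped = swap(P2, 'A', 'E')
print(distance(P1, P2_swapped))  # 2
```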
The proof can be found in the appendix. We are now able to prove the following result.

Theorem 4.9. Let ti be a type that satisfies minimal belief revision and believes that every opponent satisfies Bayesian updating. Then, ti satisfies proper belief revision.

Proof. For a given type ti ∈ Ti, decision node hi ∈ Hi, and strategy si ∈ Si(hi), recall that z(si,ti,hi) is the terminal node that is reached if the game would start at hi, player i chooses si, and i's opponents would act according to s-i(ti,hi). Let Z(ti,hi) = {z(si,ti,hi) | si ∈ Si(hi)} be the set of terminal nodes that can be reached if the game would start at hi and player i's opponents would act according to s-i(ti,hi).

Let ti be a type for player i that satisfies minimal belief revision and believes that every opponent satisfies Bayesian updating. We prove that ti satisfies proper belief revision. Suppose, contrary to what we want to prove, that ti does not satisfy proper belief revision. Then, there must be two decision nodes h1i,h2i ∈ Hi such that h2i follows h1i and no other player i decision node is between h1i and h2i, an opponent j, a decision node hj ∈ Hj and two strategies sj,s0j ∈ Sj(hj) ∩ Sj(h2i) such that: ti believes at h1i that player j strictly prefers sj over s0j at hj, but does not believe so at h2i. Let t1j = tj(ti,h1i) and t2j = tj(ti,h2i), and
let P1j and P2j denote the preference relations of t1j and t2j, respectively. Since ti satisfies minimal belief revision, it must be the case that t2j = (t1j,P2j). In particular, t1j and t2j hold the same conditional belief at hj about the opponents' strategy choices, that is, s-j(t1j,hj) = s-j(t2j,hj). Since ti believes at h1i that player j strictly prefers sj over s0j at hj, but does not believe so at h2i, we may conclude that P1j strictly prefers z(sj,t1j,hj) over z(s0j,t1j,hj), but P2j strictly prefers z(s0j,t1j,hj) over z(sj,t1j,hj). Let u2j be some arbitrary utility representation of P2j, and let the utility function u~2j be given by

    u~2j(z) = u2j(z(s0j,t1j,hj)), if z = z(sj,t1j,hj),
              u2j(z(sj,t1j,hj)), if z = z(s0j,t1j,hj),
              u2j(z), otherwise.    (4.1)

Let P~2j be the preference relation induced by u~2j. Since P1j and P2j disagree on {z(sj,t1j,hj), z(s0j,t1j,hj)}, we know by Lemma 4.8 that d(P1j,P~2j) < d(P1j,P2j). We now prove that the type t~2j = (t1j,P~2j) rationalizes the event of reaching h2i, which would contradict our assumption that ti satisfies minimal belief revision. Since t1j = tj(ti,h1i), and ti believes that player j satisfies Bayesian updating, it follows that t1j satisfies Bayesian updating. Since t2j = (t1j,P2j) and t~2j = (t1j,P~2j), we have that also t2j and t~2j satisfy Bayesian updating.
By Lemma 4.2 we know that t2j and t~2j have a sequentially rational strategy, which must then be unique. Let s2j and s~2j be the unique sequentially rational strategies for types t2j and t~2j, respectively. Recall that, by definition, t2j = tj(ti,h2i). As ti satisfies minimal belief revision, t2j must rationalize the event of reaching h2i and hence s2j ∈ Sj(h2i). In order to prove that t~2j rationalizes the event of reaching h2i, we must show that s~2j ∈ Sj(h2i). For every hj ∈ Hj preceding h2i, let a(hj,h2i) be the unique action at hj leading to h2i. In order to show that s~2j ∈ Sj(h2i), we prove that s~2j(h̄j) = a(h̄j,h2i) for all h̄j ∈ Hj(s~2j) preceding h2i. Choose some h̄j ∈ Hj(s~2j) preceding h2i. As s2j ∈ Sj(h2i), we have that s2j(h̄j) = a(h̄j,h2i). By assumption, s2j is sequentially rational for t2j = (t1j,P2j), which means in particular that s2j is optimal for t2j at h̄j. Hence, P2j strictly prefers z(s2j,t1j,h̄j) over all other nodes in Z(t1j,h̄j). We distinguish two cases.

Case 1. Suppose that z(s2j,t1j,h̄j) ≠ z(s0j,t1j,hj), where s0j is the strategy as discussed above.[2] Since P2j strictly prefers z(s2j,t1j,h̄j) over all other nodes in Z(t1j,h̄j), we have by (4.1) that P~2j also strictly prefers z(s2j,t1j,h̄j) over all other nodes in Z(t1j,h̄j). This implies that s2j is optimal for t~2j at h̄j. Since we know that s~2j is optimal for t~2j at h̄j, it follows that s~2j(h̄j) = s2j(h̄j) = a(h̄j,h2i), which was to show.

[2] Note that if ti believes at h1i that player j is indifferent at hj between sj and s0j, then necessarily z(sj,tj(ti,h1i),hj) = z(s0j,tj(ti,h1i),hj). By minimal belief revision of ti, we have that tj(ti,h1i) and tj(ti,h2i) hold the same conditional beliefs, and hence z(sj,tj(ti,h2i),hj) = z(s0j,tj(ti,h2i),hj), which implies that ti believes at h2i that player j is indifferent between sj and s0j.

Case 2. Suppose that z(s2j,t1j,h̄j) = z(s0j,t1j,hj). In this case, the terminal node z(s2j,t1j,h̄j) follows both h̄j and hj. Hence, it must be the case that hj precedes or follows h̄j. We distinguish two subcases.

Case 2.1. Suppose that h̄j precedes hj. Since z(s2j,t1j,h̄j) follows hj, it must be the case
that s-j(t1j,h̄j) ∈ S-j(hj). We have seen above that z(s2j,t1j,h̄j) = z(s0j,t1j,hj). As t1j satisfies Bayesian updating, this then implies that s-j(t1j,hj) = s-j(t1j,h̄j). Since sj ∈ Sj(hj), it follows that sj ∈ Sj(h̄j) and that z(sj,t1j,h̄j) = z(sj,t1j,hj). Since P2j strictly prefers z(s2j,t1j,h̄j) = z(s0j,t1j,hj) over all other nodes in Z(t1j,h̄j), it follows by (4.1) that P~2j strictly prefers z(sj,t1j,h̄j) = z(sj,t1j,hj) over all other nodes in Z(t1j,h̄j). Hence, sj is optimal for t~2j at h̄j. Since, by assumption, s~2j is optimal for t~2j at h̄j, it follows that s~2j(h̄j) = sj(h̄j). Since sj ∈ Sj(h2i), we have that sj(h̄j) = a(h̄j,h2i). Hence, s~2j(h̄j) = a(h̄j,h2i), which was to show.

Case 2.2. Suppose that hj precedes h̄j. As z(s0j,t1j,hj) = z(s2j,t1j,h̄j) follows h̄j, we must have
that s-j(t1j,hj) ∈ S-j(h̄j). By Bayesian updating of t1j, we may then conclude that s-j(t1j,h̄j) = s-j(t1j,hj). Since sj ∈ Sj(h2i) and h̄j precedes h2i, we have that sj ∈ Sj(h̄j) as well. Combined with the fact that s-j(t1j,h̄j) = s-j(t1j,hj), this implies that z(sj,t1j,h̄j) = z(sj,t1j,hj). Since P2j strictly prefers z(s2j,t1j,h̄j) = z(s0j,t1j,hj) over all other nodes in Z(t1j,h̄j), it follows by (4.1) that P~2j strictly prefers z(sj,t1j,h̄j) = z(sj,t1j,hj) over all other nodes in Z(t1j,h̄j). We may thus conclude that sj is optimal for t~2j at h̄j. As s~2j is optimal for t~2j at h̄j as well, it follows that s~2j(h̄j) = sj(h̄j). By assumption, sj ∈ Sj(h2i), implying that sj(h̄j) = a(h̄j,h2i). Hence, we may conclude that s~2j(h̄j) = a(h̄j,h2i), which was to show.

From Case 1 and 2 we may therefore conclude that s~2j(h̄j) = a(h̄j,h2i) for all decision nodes h̄j ∈ Hj(s~2j) preceding h2i. This, in turn, implies that s~2j ∈ Sj(h2i). As s~2j is the unique sequentially rational strategy for t~2j, this leads to the conclusion that t~2j = (t1j,P~2j) rationalizes the event of reaching h2i. Since we have seen that d(P1j,P~2j) < d(P1j,P2j), we have thus found a preference relation P~2j with the properties that (t1j,P~2j) rationalizes the event of reaching h2i, but d(P1j,P~2j) <
d(P1j,P2j). This, however, contradicts our assumption that ti satisfies minimal belief revision. Therefore, the assumption that ti does not satisfy proper belief revision cannot be true. Hence, ti must satisfy proper belief revision. This completes the proof of our theorem. ■

5. Relation with Backward Induction

In this section we show that common belief in the events that types (1) believe in sequential rationality, (2) satisfy minimal belief revision and (3) initially believe in some profile P = (Pi)i∈I of preference relations, leads to backward induction in the game induced by P. We divide this result into two parts. In the first part, Theorem 5.1, it is shown that for every player there is at least one type that respects common belief in the three events listed above. As such, common belief in these three events is always possible. The second part, Theorem 5.2, shows that every type ti that has preference relation Pi and satisfies common belief in the events that types believe in sequential rationality, satisfy minimal belief revision and initially believe in P, must choose his backward induction strategy in the game induced by P.

For the proof of Theorem 5.1 and the statement of Theorem 5.2, we need the following definitions. Let S be an extensive form structure with perfect information, and P = (Pi)i∈I a profile of preference relations on the set of terminal nodes. Then, the pair (S,P) may be interpreted as a game, and the backward induction procedure in the game (S,P) leads to a unique backward induction action a(hi) at every decision node hi. For every player i, let si be the unique strategy that chooses the backward induction action a(hi) at every hi ∈ Hi(si). We refer to si as the backward induction strategy for player i in (S,P).

Theorem 5.1. Let S be an extensive form structure with perfect information, and P = (Pi)i∈I a profile of preference relations on the set of terminal nodes. Then, for every player i there is a type ti that respects common belief in the events that types believe in sequential rationality, satisfy minimal belief revision, and initially believe in P.
Proof. For every player i, decision node hi ∈ Hi and opponent j ≠ i, let sj(hi) be the unique strategy for player j with the following properties: (1) at every decision node hj ∈ Hj(sj(hi)) preceding hi, the strategy sj(hi) prescribes the unique action that leads to hi, and (2) at every decision node hj ∈ Hj(sj(hi)) not preceding hi, it prescribes the backward induction action a(hj) in the game (S,P). Then, by construction, sj(h0) coincides with the backward induction strategy sj in (S,P). Moreover, sj(hi) ∈ Sj(hi).

For every player i, denote by ci the conditional belief vector about the opponents' strategy choices in which player i, at every decision node hi ∈ Hi, believes that each opponent j chooses the strategy sj(hi) ∈ Sj(hi). By construction, the unique strategy that is sequentially rational for player i with respect to the conditional belief vector ci and the preference relation Pi is his backward induction strategy si in (S,P).

Fix a player i and an opponent j ≠ i. For every decision node hi ∈ Hi we shall define a conditional belief Pj(hi) for player i about player j's preference relation. We proceed recursively, starting from h0. At h0, let Pj(h0) = Pj. Now, take a decision node h2i ∈ Hi and suppose that Pj(h1i) has already been defined for all h1i ∈ Hi that precede h2i. Let h1i be the unique decision node in Hi preceding h2i with the property that no other player i decision node lies between h1i and h2i. By assumption, Pj(h1i) has already been defined. We can now choose a preference relation Pj(h2i) with the following properties: (1) the conditional belief vector cj for player j and the preference relation Pj(h2i) together rationalize the event of reaching h2i, and (2) there is no preference relation P~j(h2i) that together with cj rationalizes the event of reaching h2i, and for which d(Pj(h1i),P~j(h2i)) < d(Pj(h1i),Pj(h2i)). In this way, a conditional belief Pj(hi) about player j's preference relation can be defined for every player i, every opponent j, and every decision node hi ∈ Hi.
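The backward induction actions a(hi) used in this construction can be computed by the standard recursion. The following sketch runs it on a hypothetical four-node tree shaped like our reading of Figure 2; the encoding is ours, and player 2's preference relation below is invented for illustration, since the paper does not specify it:

```python
# Each decision node maps to (player, {action: successor}); strings not in
# the dict are terminal nodes. Preferences are rankings, best first.

def backward_induction(tree, node, rank):
    """Return (terminal node induced at `node`, {decision node: a(node)})."""
    if node not in tree:                      # a terminal node was reached
        return node, {}
    player, moves = tree[node]
    plan, best_action, best_z = {}, None, None
    for action, successor in moves.items():
        z, sub = backward_induction(tree, successor, rank)
        plan.update(sub)
        if best_z is None or rank[player][z] < rank[player][best_z]:
            best_action, best_z = action, z   # the mover keeps the best action
    plan[node] = best_action
    return best_z, plan

tree = {'h11': (1, {'a': 'A', 'b': 'n2'}),
        'n2':  (2, {'c': 'B', 'd': 'h21'}),
        'h21': (1, {'e': 'C', 'f': 'n4'}),
        'n4':  (2, {'g': 'D', 'h': 'E'})}

rank = {1: {z: i for i, z in enumerate('ECDBA')},    # P1 from the example
        2: {z: i for i, z in enumerate('DBECA')}}    # a hypothetical P2

z, actions = backward_induction(tree, 'h11', rank)
print(z, actions)   # B {'n4': 'g', 'h21': 'e', 'n2': 'c', 'h11': 'b'}
```

The returned dictionary contains the unique action a(hi) at every decision node; restricting it to the nodes in Hi(si) gives each player's backward induction strategy si in (S,P).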
We may now construct a set of types T = {tj(hi) | i,j ∈ I, i ≠ j and hi ∈ Hi} with the following properties: (1) the preference relation for tj(hi) is equal to Pj(hi); (2) the conditional belief vector of tj(hi) about the opponents' strategy choices is given by cj, that is, sk(tj(hi),hj) = sk(hj) for all hj ∈ Hj and all opponents k ≠ j; and (3) the conditional belief of tj(hi) at decision node hj ∈ Hj about opponent k's type is equal to tk(hj).

We now prove that every type tj(hi) ∈ T respects common belief in the event that types believe in sequential rationality, satisfy minimal belief revision and initially believe in P. By construction, every type t ∈ T believes, at each of his decision nodes, that each of his opponents' types belongs to T. It is therefore sufficient to show that every type tj(hi) ∈ T believes in sequential rationality, satisfies minimal belief revision, and initially believes in P.

Initial belief in P. Choose an arbitrary type tj(hi) ∈ T. By definition, tj(hi) believes at h0 that every opponent k is of type tk(h0). Since tk(h0) has preference relation Pk(h0) and since, by construction, Pk(h0) = Pk, we have that tj(hi) believes at h0 that every opponent k has preference relation Pk. Hence, tj(hi) initially believes in P.

Minimal belief revision. Choose an arbitrary type tj(hi) ∈ T and some opponent k ≠ j. Take some decision nodes h1j and h2j such that h2j follows h1j and no other player j decision node is between h1j and h2j. By definition, tj(hi) believes at h1j that player k has type tk(h1j), and believes at h2j that player k has type tk(h2j). By construction of tk(h1j) and tk(h2j) we know that tk(h1j) has preference relation Pk(h1j), that tk(h2j) has preference relation Pk(h2j), and that tk(h1j) and tk(h2j) have identical conditional beliefs about the opponents' strategies and types. As such, tk(h2j) = (tk(h1j),Pk(h2j)). Moreover, by construction of the preference relation Pk(h2j), we know that (1) the conditional belief vector ck for player k and the preference relation Pk(h2j) together rationalize the event of reaching h2j, and (2) there is no preference relation P~k(h2j) that together with ck rationalizes the event of reaching h2j, and for which d(Pk(h1j),P~k(h2j)) < d(Pk(h1j),Pk(h2j)). As ck is the conditional belief vector for types tk(h1j) and tk(h2j) about the opponents' strategies, it follows from (1) and (2) above that tj(hi) satisfies minimal belief revision.

Belief in sequential rationality. Take some arbitrary tj(hi) ∈ T, a decision node hj ∈ Hj and some opponent k. By definition, tj(hi) believes at hj that opponent k has type tk(hj) and chooses strategy sk(hj). We prove that sk(hj) is sequentially rational for tk(hj). We do so by induction on the number of decision nodes in Hj that precede hj.
Assume first that hj is not preceded by any decision node in Hj, that is, hj = h0. In this case, tk(h0) has preference relation Pk(h0) which, by construction, is equal to Pk. Since tk(h0)'s conditional belief vector about the opponents' strategy choices is given by ck, it follows that tk(h0) has a unique sequentially rational strategy, namely his backward induction strategy in (S,P), which is sk = sk(h0). We thus have that sk(h0) is sequentially rational for tk(h0), which was to show.

Now, take some decision node h2j ∈ Hj\{h0} and assume that for every h1j ∈ Hj preceding h2j it holds that sk(h1j) is sequentially rational for tk(h1j). We prove that sk(h2j) is sequentially rational for tk(h2j). Hence, we must prove for every hk ∈ Hk(sk(h2j)) that sk(h2j) is optimal for tk(h2j) at hk. We distinguish two cases.

Case 1. Assume that hk ∈ Hk(sk(h2j)) and that hk does not precede h2j. Then, by definition of sk(h2j), we have that sk(h2j) prescribes the backward induction action a(h0k) at every player k decision node h0k weakly following hk. Suppose, contrary to what we want to prove, that sk(h2j) is not optimal for tk(h2j) at hk. Hence, there is some s̄k(h2j) ∈ Sk(hk) such that tk(h2j) strictly prefers s̄k(h2j) over sk(h2j) at hk. Now, let the strategy s~k(h2j) be such that (1) s~k(h2j) coincides with s̄k(h2j) at all decision nodes in Hk(s~k(h2j)) weakly following hk, and (2) s~k(h2j) coincides with sk(h2j) at all other decision nodes in Hk(s~k(h2j)). Since s̄k(h2j) ∈ Sk(hk), and s~k(h2j) coincides with s̄k(h2j) in the subgame starting at hk, we may conclude that tk(h2j) strictly prefers s~k(h2j) over sk(h2j) at hk. Moreover, since hk does not precede h2j, it follows that s~k(h2j) ∈ Sk(hk) ∩ Sk(h2j) as well. Since tj(hi) believes at h2j that player k is of type tk(h2j), the following holds:

    tj(hi) believes at h2j that player k, at hk, strictly prefers s~k(h2j) over sk(h2j),    (5.1)

where both s~k(h2j) and sk(h2j) are in Sk(hk) ∩ Sk(h2j). We have seen above that tj(hi) satisfies minimal belief revision. Moreover, since every t ∈ T satisfies Bayesian updating, we may conclude that tj(hi) believes that every opponent satisfies Bayesian updating. By Theorem 4.9 we may thus conclude that tj(hi) satisfies proper belief revision. Now, let h1j be the unique decision node in Hj that precedes h2j and for which no other player j decision node is between h1j and h2j. Since both s~k(h2j) and sk(h2j) are in Sk(hk) ∩ Sk(h2j), proper belief revision of tj(hi) together with (5.1) implies the following:

    tj(hi) believes at h1j that player k, at hk, strictly prefers s~k(h2j) over sk(h2j).    (5.2)

As h1j precedes h2j, and hk does not precede h2j, we must have that hk does not precede h1j. Hence, sk(h1j) prescribes at every player k decision node h0k weakly following hk the backward induction action a(h0k), just as sk(h2j) does. Together with (5.2), this yields:

    tj(hi) believes at h1j that player k, at hk, strictly prefers s~k(h2j) over sk(h1j).
Since tj(hi) believes at h1j that player k has type tk(h1j), it follows that sk(h1j) is not sequentially rational for tk(h1j), which contradicts our induction assumption that sk(h1j) is sequentially rationot preceding h2j. This completes Case 1. nal for tk(h1j). Hence, we may conclude that sk(h2j) is optimal for tk(h2j) at every hk Hk(sk(h2j)) sion, as we have seen above, it must be the case that the type tk(h2j) = tk(tj(hi),h2j) rationalizes Case 2. Assume that hk Hk(sk(h2j)) precedes h2j. Since tj(hi) satisfies minimal belief revithe event of reaching h2j. Hence, tk(h2j) has a sequentially rational strategy sk(h2j) in Sk(h2j). Suppose, contrary to what we want to prove, that sk(h2j) is not optimal for tk(h2j) at hk. Then, necessarily, tk(h2j) strictly prefers z(sk(h2j),tk(h2j),hk) over z(sk(h2j),tk(h2j),hk). (5.3) Since sk(h2j) and sk(h2j) are both in Sk(h2j), they coincide on all player k decision nodes preceding h2j. Hence, by (5.3), there must be some player k decision node h0k not preceding h2j such that reach h0k. By Bayesian updating of tk(h2j), we then have that s-k(tk(h2j),h0k) = s-k(tk(h2j),hk). (1) s-k(tk(h2j),hk) S-k(h0k), and (2) (sk(h2j),s-k(tk(h2j),hk)) and (sk(h2j),s-k(tk(h2j),hk)) both This implies that z(sk(h2j),tk(h2j),hk) = z(sk(h2j),tk(h2j),h0k) and z(sk(h2j),tk(h2j),hk) = z(sk(h2j),tk(h2j),h0k). Together with (5.3), we may conclude that tk(h2j) strictly prefers z(sk(h2j),tk(h2j),h0k) over z(sk(h2j),tk(h2j),h0k), which means that sk(h2j) is not optimal for tk(h2j) at h0k. However, this contradicts our findings in Case 1, as h0k does not precede h2j. Therefore, sk(h2j) must be optimal for tk(h2j) at hk. This completes Case 2. optimal for tk(h2j) at hk. As such, sk(h2j) is sequentially rational fortk(h2j). 
Since tj(hi) believes By combining the cases 1 and 2, we have shown for every hk Hk(sk(h2j)) that sk(h2j) is at h2j that player k is of type tk(h2j) and chooses strategy sk(h2j), and since this holds for every h2j and every opponent k, it follows that tj(hi) believes in sequential rationality, which was to show. satisfies min-
imal belief revision and initially believes in P. As every type t Tbelief types are in T, it holds that every type t T respects common
We may thus conclude that every type t T believes in sequential rationality,all
believes that opponents'
in the events that every
type believes in sequential rationality, satisfies minimal belief revision and initially believes in P. This completes the proof of this theorem. ¥ We now prove that common belief in the events that types believe in sequential rationality, satisfy minimal belief revision, and initially believe in a profile P of preference relations, leads to backward induction in the game with preference relations P. 22
Theorem 5.2. Let $S$ be an extensive form structure with perfect information, and $P = (P_i)_{i \in I}$ a profile of preference relations on the set of terminal nodes. Let $t_i$ be a type with preference relation $P_i$, respecting common belief in the events that types believe in sequential rationality, satisfy minimal belief revision, and initially believe in $P$. Then, there is a unique sequentially rational strategy for $t_i$, namely player $i$'s backward induction strategy in $(S,P)$.

Proof. For a given player $i$, decision node $h_i \in H_i$ and opponent $j$, let $S_j^*(h_i)$ be the set of player $j$ strategies $s_j$ such that (1) $s_j \in S_j(h_i)$, and (2) at every $h_j \in H_j(s_j)$ following $h_i$, the strategy $s_j$ prescribes the backward induction action $a(h_j)$ in $(S,P)$. We prove the following claim.

Claim. Let $t_i$ be a player $i$ type that respects common belief in the events that types believe in sequential rationality, satisfy minimal belief revision, and initially believe in $P$. Then, $s_j(t_i, h_i) \in S_j^*(h_i)$ for all $h_i \in H_i$ and all opponents $j$.

Proof of claim. We prove the claim by induction on the number of decision nodes following $h_i$. If $h_i$ is not followed by any decision node, the statement is trivial since $S_j^*(h_i) = S_j(h_i)$. Suppose now that the claim holds for all pairs $(i', j')$ of players and every decision node $h_{i'}$ followed by at most $K - 1$ decision nodes. Choose $h_i$ with the property that $h_i$ is followed by exactly $K$ decision nodes. We prove that $s_j(t_i, h_i) \in S_j^*(h_i)$ for all opponents $j$. Hence, we must show that for every decision node $h_j \in H_j(s_j(t_i,h_i))$ following $h_i$, the strategy $s_j(t_i,h_i)$ prescribes the backward induction action $a(h_j)$.

Let $t_j = t_j(t_i,h_i)$ and $s_j = s_j(t_i,h_i)$. Choose a decision node $h_j \in H_j(s_j)$ following $h_i$. We shall prove that $s_j(h_j) = a(h_j)$. As $t_i$ respects common belief in the events that types believe in sequential rationality, satisfy minimal belief revision, and initially believe in $P$, and since $t_i$ believes at $h_i$ that player $j$ is of type $t_j$, it follows that $t_j$ respects common belief in these events as well. Since $h_j$ is followed by at most $K - 1$ decision nodes, we thus know by the induction assumption that $s_k(t_j,h_j) \in S_k^*(h_j)$ for all opponents $k \neq j$. Consequently, $t_j$ believes at $h_j$ that all opponents choose their backward induction actions in $(S,P)$ at the decision nodes following $h_j$.

As $t_i$ initially believes in $P$, it follows that $t_j(t_i,h_0)$ has preference relation $P_j$. Moreover, since $t_i$ satisfies minimal belief revision, it must be the case that $t_j(t_i,h_0)$ has the same conditional belief vector as $t_j(t_i,h_i) = t_j$. We may thus conclude that $t_j(t_i,h_0)$ believes at $h_j$ that all opponents choose their backward induction actions in $(S,P)$ at the decision nodes following $h_j$. Together with the fact that $t_j(t_i,h_0)$ has preference relation $P_j$, it follows that $t_j(t_i,h_0)$'s optimal strategies at $h_j$ all prescribe the backward induction action $a(h_j)$ at $h_j$. More precisely, for every $\hat{s}_j \in S_j(h_j)$ not prescribing $a(h_j)$ at $h_j$ there is some $s_j' \in S_j(h_j)$ prescribing $a(h_j)$ at $h_j$ such that $t_j(t_i,h_0)$ strictly prefers $s_j'$ over $\hat{s}_j$ at $h_j$. This, in turn, means that $t_i$ believes at $h_0$ that for every $\hat{s}_j \in S_j(h_j)$ not prescribing $a(h_j)$ at $h_j$ there is some $s_j' \in S_j(h_j)$ prescribing $a(h_j)$ at $h_j$ such that player $j$, at $h_j$, strictly prefers $s_j'$ over $\hat{s}_j$.

Since $t_i$ believes that all opponents believe in sequential rationality and satisfy minimal belief revision, we know by Lemma 4.4 that $t_i$ believes that all opponents satisfy Bayesian updating. Together with the fact that $t_i$ satisfies minimal belief revision, we may conclude by Theorem 4.9 that $t_i$ satisfies proper belief revision. Therefore, $t_i$'s belief at $h_i$ about player $j$'s preference relation at $h_j$ over strategies in $S_j(h_j) \cap S_j(h_i)$ should coincide with $t_i$'s belief at $h_0$ about player $j$'s preference relation at $h_j$ over strategies in $S_j(h_j) \cap S_j(h_i)$. Since, by assumption, $h_j$ follows $h_i$, we have that $S_j(h_j) \subseteq S_j(h_i)$. Hence, $t_i$'s belief at $h_i$ about player $j$'s preference relation at $h_j$ over strategies in $S_j(h_j)$ should coincide with $t_i$'s belief at the beginning about player $j$'s preference relation at $h_j$ over strategies in $S_j(h_j)$. Since we have seen that $t_i$ believes at $h_0$ that for every $\hat{s}_j \in S_j(h_j)$ not prescribing $a(h_j)$ at $h_j$ there is some $s_j' \in S_j(h_j)$ prescribing $a(h_j)$ at $h_j$ such that player $j$ strictly prefers $s_j'$ over $\hat{s}_j$ at $h_j$, it follows that $t_i$ believes so at $h_i$. This implies, however, that $t_i$ believes at $h_i$ that player $j$'s optimal strategies at $h_j$ all prescribe the backward induction action $a(h_j)$ at $h_j$. Since $t_i$ believes in sequential rationality, and since $s_j = s_j(t_i,h_i)$, we must have that $s_j$ is optimal for $t_j(t_i,h_i)$ at $h_j$. By the above, it follows that $s_j$ must prescribe the backward induction action $a(h_j)$ at $h_j$, which was to show. This completes the proof of the claim.

Now, let $t_i$ be a type that has preference relation $P_i$, and that respects common belief in the events that types believe in sequential rationality, satisfy minimal belief revision and initially believe in $P$. By the claim, we know that $t_i$ believes at every decision node $h_i$ that his opponents will choose the backward induction actions in $(S,P)$ at every decision node following $h_i$. Since $t_i$ has preference relation $P_i$, the unique sequentially rational strategy for $t_i$ is his backward induction strategy in $(S,P)$. This completes the proof. ∎
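The backward induction strategies that Theorem 5.2 singles out can be computed by the usual leaves-upward recursion on the game tree. The following sketch is ours and lies outside the formal model of the paper: the tree encoding, node labels and payoffs are illustrative. At each decision node it records the action that is optimal for the deciding player, given backward induction play in all subtrees.

```python
# Backward induction in a finite perfect-information game tree (toy encoding).
# A decision node is ("decision", label, player, {action: subtree});
# a terminal node is ("terminal", {player: utility}).

def backward_induction(node):
    """Return (utilities, strategy), where strategy maps each decision-node
    label to the backward induction action chosen there."""
    if node[0] == "terminal":
        return node[1], {}
    _, label, player, actions = node
    strategy, best_action, best_utils = {}, None, None
    for action, child in sorted(actions.items()):
        utils, sub = backward_induction(child)
        strategy.update(sub)          # record choices made in every subtree
        if best_utils is None or utils[player] > best_utils[player]:
            best_action, best_utils = action, utils
    strategy[label] = best_action
    return best_utils, strategy

# A two-stage take-it-or-pass game (payoffs are ours, chosen for illustration).
game = ("decision", "h1", 1, {
    "take": ("terminal", {1: 1, 2: 0}),
    "pass": ("decision", "h2", 2, {
        "take": ("terminal", {1: 0, 2: 2}),
        "pass": ("terminal", {1: 3, 2: 1}),
    }),
})
utils, strategy = backward_induction(game)
# strategy == {"h1": "take", "h2": "take"}: both players take immediately.
```

Ties are broken by action name; in generic trees without payoff ties the resulting strategy profile is unique, matching the uniqueness claim of the theorem.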
6. Dropping Minimal Belief Revision

In the previous section we have seen that common belief in the events that types initially believe in $P$, believe in sequential rationality and satisfy minimal belief revision, singles out the backward induction strategy for player $i$ in the game $(S,P)$. In this section we investigate how crucial "minimal belief revision" is in establishing this relationship with backward induction. More precisely, we study the consequences of replacing the minimal belief revision requirement by the more basic condition of Bayesian updating. We shall prove that the resulting rationality concept allows for exactly those strategies that survive the Dekel-Fudenberg procedure (Dekel and Fudenberg (1990)), that is, one round of elimination of weakly dominated strategies followed by iterative elimination of strongly dominated strategies. As to formally state this result, we must first define the Dekel-Fudenberg procedure in some more detail.

Let $S$ be an extensive form structure with perfect information and $P = (P_i)_{i \in I}$ a profile of preference relations on the terminal nodes. For every player $i$, let $u_i$ be an arbitrary utility function on the terminal nodes that represents $P_i$, and let $u = (u_i)_{i \in I}$. Let $\Delta(S_i)$ be the set of probability distributions on the set $S_i$ of player $i$ strategies, and let $S_{-i} = \times_{j \neq i} S_j$ be the set of opponents' strategy profiles. For a pair $(\mu_i, s_{-i}) \in \Delta(S_i) \times S_{-i}$, we denote by

\[ u_i(\mu_i, s_{-i}) = \sum_{s_i \in S_i} \mu_i(s_i)\, u_i(z(s_i, s_{-i})) \]

the expected utility induced by $(\mu_i, s_{-i})$ and the utility function $u_i$. Here, $z(s_i, s_{-i})$ denotes the terminal node reached by the strategy profile $(s_i, s_{-i})$. We say that strategy $s_i$ is weakly dominated with respect to $u_i$ if there is some $\mu_i \in \Delta(S_i)$ such that (1) $u_i(\mu_i, s_{-i}) \geq u_i(s_i, s_{-i})$ for all $s_{-i} \in S_{-i}$, and (2) $u_i(\mu_i, s_{-i}) > u_i(s_i, s_{-i})$ for some $s_{-i} \in S_{-i}$. Now, fix some subset $\tilde{S}_{-i} \subseteq S_{-i}$ of opponents' strategy profiles. We say that $s_i$ is strongly dominated on $\tilde{S}_{-i}$ with respect to $u_i$ if there is some $\mu_i \in \Delta(S_i)$ such that $u_i(\mu_i, s_{-i}) > u_i(s_i, s_{-i})$ for all $s_{-i} \in \tilde{S}_{-i}$.

For every player $i$, let $DF_i^1(u)$ be the set of strategies that are not weakly dominated with respect to $u_i$. For every player $i$ and every $k \geq 2$, recursively define $DF_i^k(u)$ as the set of strategies in $DF_i^{k-1}(u)$ that are not strongly dominated on $\times_{j \neq i} DF_j^{k-1}(u)$ with respect to $u_i$. Finally, let $DF_i(u) = \bigcap_{k \in \mathbb{N}} DF_i^k(u)$ for every player $i$. We say that a strategy $s_i \in S_i$ survives the Dekel-Fudenberg procedure with respect to $u$ if and only if $s_i \in DF_i(u)$. The following theorem states that replacing minimal belief revision by Bayesian updating in the model leads to the Dekel-Fudenberg procedure.

Theorem 6.1. Let $S$ be an extensive form structure with perfect information, $P = (P_i)_{i \in I}$ a profile of preference relations on the terminal nodes, and $s_i$ a strategy for player $i$. Then, the following two statements are equivalent:
(1) $s_i$ is sequentially rational for some type $t_i$ having preference relation $P_i$ and respecting common belief in the events that types believe in sequential rationality, satisfy Bayesian updating, and initially believe in $P$;
(2) $s_i$ survives the Dekel-Fudenberg procedure with respect to every $u$ representing $P$.

Here, we say that $u$ represents $P$ if for every player $i$ it holds that $u_i$ represents $P_i$. In particular, the theorem implies that the Dekel-Fudenberg procedure for generic games with perfect information does not depend upon the particular utility functions that are chosen to represent the preference relations over terminal nodes. Since it is well-known that the Dekel-Fudenberg procedure may select strategies, and even outcomes, that are not compatible with backward induction, the minimal belief revision requirement may be seen as a property that closes the gap between the Dekel-Fudenberg procedure and the backward induction procedure.

In Ben-Porath (1997) it has been shown that also the concept of "common certainty of rationality at the beginning of the game" leads exactly to those strategies surviving the Dekel-Fudenberg procedure. The latter concept is, however, built upon fundamentally different principles than ours. Ben-Porath assumes, namely, that players believe throughout the game that the opponents hold preference relations as given by $P$, while we only assume players to believe so at the beginning of the game. On the other hand, Ben-Porath only requires players to believe at the beginning of the game that their opponents choose sequentially rational strategies, while our "belief in sequential rationality" condition requires players to believe so at each instance of the game. One could therefore argue that the concept of "common certainty of rationality at the beginning of the game" is in some sense dual to the concept studied in this section. Nevertheless, both concepts eventually make the same selection of strategies for each player.

As a preparatory step towards proving Theorem 6.1, we first characterize weakly dominated and strongly dominated strategies in games with perfect information. These characterizations are based upon results in Ben-Porath (1997) and Pearce (1984), and are stated in the following lemma.³ In this lemma, we say that a strategy $s_i$ is initially rational for type $t_i$ if $s_i$ is optimal for $t_i$ against the initial belief $s_{-i}(t_i, h_0)$.

Lemma 6.2. Let $S$ be an extensive form structure with perfect information, $P_i$ a preference relation on the terminal nodes, $u_i$ a utility function representing $P_i$, and $s_i$ a strategy for player $i$. Then the following is true:
(1) $s_i$ is not weakly dominated with respect to $u_i$ if and only if $s_i$ is sequentially rational for some type $t_i$ that has preference relation $P_i$ and satisfies Bayesian updating;
(2) for every subset $\tilde{S}_{-i} \subseteq S_{-i}$: $s_i$ is not strongly dominated on $\tilde{S}_{-i}$ with respect to $u_i$ if and only if $s_i$ is initially rational for some type $t_i$ with preference relation $P_i$ and initial belief $s_{-i}(t_i, h_0) \in \tilde{S}_{-i}$.

Proof. (1) Suppose that $s_i$ is not weakly dominated with respect to $u_i$. By Lemma 4 in Pearce (1984), there is some $\mu_{-i} \in \Delta(S_{-i})$ with full support such that $s_i$ is optimal against $\mu_{-i}$ with respect to $u_i$. Then, by Lemma 1.1 in Ben-Porath (1997), there exists a probabilistic updating system $(\mu_{-i}(h_i))_{h_i \in H_i}$ with $\mu_{-i}(h_0) = \mu_{-i}$ such that (a) this updating system satisfies Bayesian updating, and (b) for every $h_i \in H_i(s_i)$ the strategy $s_i$ is optimal against $\mu_{-i}(h_i)$ with respect to $u_i$. By Lemma 1.2.1 in Ben-Porath (1997), there then exists a deterministic updating system $(s_{-i}(h_i))_{h_i \in H_i}$ with $s_{-i}(h_i) \in S_{-i}(h_i)$ for every $h_i \in H_i$ such that (a) this updating system satisfies Bayesian updating, and (b) for every $h_i \in H_i(s_i)$ the strategy $s_i$ is optimal against $s_{-i}(h_i)$ with respect to $u_i$. One can then choose a type $t_i$ satisfying Bayesian updating, with preference relation $P_i$, and with $s_{-i}(t_i,h_i) = s_{-i}(h_i)$ for all $h_i \in H_i$. Hence, by construction, $s_i$ is sequentially rational for $t_i$.

Now, suppose that $s_i$ is sequentially rational for some type $t_i$ that has preference relation $P_i$ and satisfies Bayesian updating. Hence, for every $h_i \in H_i(s_i)$, the strategy $s_i$ is optimal against $s_{-i}(t_i,h_i) \in S_{-i}(h_i)$ with respect to $u_i$. By Lemma 1.2 in Ben-Porath (1997), it then follows that $s_i$ is optimal with respect to $u_i$ against some $\mu_{-i} \in \Delta(S_{-i})$ with full support. Lemma 4 in Pearce (1984) then implies that $s_i$ is not weakly dominated with respect to $u_i$.

(2) Suppose that $s_i$ is not strongly dominated on some $\tilde{S}_{-i} \subseteq S_{-i}$ with respect to $u_i$. By Lemma 3 in Pearce (1984), we then know that there is some $\mu_{-i} \in \Delta(\tilde{S}_{-i})$ such that $s_i$ is optimal against $\mu_{-i}$ with respect to $u_i$ (and hence with respect to $P_i$). By the proof of Lemma 1.2.1 in Ben-Porath (1997), we may conclude that there is some $s_{-i}$ in the support of $\mu_{-i}$, and hence in $\tilde{S}_{-i}$, such that $s_i$ is optimal against $s_{-i}$ with respect to $u_i$. Let $t_i$ be a type with preference relation $P_i$ and $s_{-i}(t_i,h_0) = s_{-i}$. Then, $s_{-i}(t_i,h_0) \in \tilde{S}_{-i}$ and $s_i$ is initially rational for $t_i$.

Finally, assume that $s_i$ is initially rational for some type $t_i$ with preference relation $P_i$ and with $s_{-i}(t_i,h_0) \in \tilde{S}_{-i}$. Then, $s_i$ is optimal against $s_{-i}(t_i,h_0) \in \tilde{S}_{-i}$ with respect to $u_i$. But then, $s_i$ is obviously not strongly dominated on $\tilde{S}_{-i}$ with respect to $u_i$. This completes the proof of this lemma. ∎

³ Formally, Lemmas 3 and 4 in Pearce (1984), which are the results we use here, are stated for two-player games only. However, if one allows for correlated probability distributions on the opponents' strategy spaces, as we do, then Pearce's results also apply to more than two players.

By means of this lemma, we are now able to provide an alternative characterization of strategies that survive the Dekel-Fudenberg procedure. Let $P = (P_i)_{i \in I}$ be a profile of preference relations. For every player $i$, let $S_i^1(P)$ be the set of strategies that are sequentially rational for some type $t_i$ with preference relation $P_i$ that satisfies Bayesian updating. For $k \geq 2$, recursively define $S_i^k(P)$ as the set of strategies in $S_i^{k-1}(P)$ that are sequentially rational for some type $t_i$ with preference relation $P_i$, satisfying Bayesian updating, and with initial belief $s_{-i}(t_i,h_0) \in \times_{j \neq i} S_j^{k-1}(P)$. Finally, let $S_i^\infty(P) = \bigcap_{k \in \mathbb{N}} S_i^k(P)$. We obtain the following characterization.

Lemma 6.3. Let $S$ be an extensive form structure with perfect information and $P = (P_i)_{i \in I}$ a profile of preference relations. Then, $S_i^\infty(P) = DF_i(u)$ for every profile $u$ of utility functions representing $P$.

Proof. Choose a profile $u$ of utility functions representing $P$. We show, by induction on $k$, that $S_i^k(P) = DF_i^k(u)$ for all players $i$. By Lemma 6.2, part (1), we know that $S_i^1(P) = DF_i^1(u)$.

Now, let $k \geq 2$ and assume that $S_j^{k-1}(P) = DF_j^{k-1}(u)$ for all players $j$. We first show that $S_i^k(P) \subseteq DF_i^k(u)$. Take some arbitrary $s_i \in S_i^k(P)$. Hence, $s_i$ is sequentially rational for some type $t_i$ with preference relation $P_i$, satisfying Bayesian updating, and with $s_{-i}(t_i,h_0) \in \times_{j \neq i} S_j^{k-1}(P)$. Since $t_i$ satisfies Bayesian updating, the fact that $s_i$ is sequentially rational for $t_i$ implies that $s_i$ is initially rational for $t_i$. By Lemma 6.2, part (2), it follows that $s_i$ is not strongly dominated on $\times_{j \neq i} S_j^{k-1}(P)$ with respect to $u_i$. Since, by induction assumption, $S_j^{k-1}(P) = DF_j^{k-1}(u)$ for all $j \neq i$, it follows that $s_i$ is not strongly dominated on $\times_{j \neq i} DF_j^{k-1}(u)$ with respect to $u_i$. On the other hand, we know that $s_i \in S_i^{k-1}(P)$ which, by induction assumption, is equal to $DF_i^{k-1}(u)$. Hence, $s_i \in DF_i^{k-1}(u)$ and $s_i$ is not strongly dominated on $\times_{j \neq i} DF_j^{k-1}(u)$ with respect to $u_i$, which implies that $s_i \in DF_i^k(u)$.

We finally show that $DF_i^k(u) \subseteq S_i^k(P)$. Take some arbitrary $s_i \in DF_i^k(u)$. Hence, $s_i$ is not strongly dominated on $\times_{j \neq i} DF_j^{k-1}(u)$ with respect to $u_i$. By Lemma 6.2, part (2), we then have that $s_i$ is initially rational for some type $t_i$ with preference relation $P_i$ and $s_{-i}(t_i,h_0) \in \times_{j \neq i} DF_j^{k-1}(u)$. Since, by induction assumption, $DF_j^{k-1}(u) = S_j^{k-1}(P)$ for all $j \neq i$, we thus have that $s_i$ is initially rational for some type $t_i$ with preference relation $P_i$ and $s_{-i}(t_i,h_0) \in \times_{j \neq i} S_j^{k-1}(P)$. As $s_i \in DF_i^1(u)$ and $DF_i^1(u) = S_i^1(P)$, there is some type $t_i'$ with preference relation $P_i$ and satisfying Bayesian updating, such that $s_i$ is sequentially rational for $t_i'$. Now, construct a type $t_i''$ with the following properties: (1) $t_i''$ has preference relation $P_i$, (2) $s_{-i}(t_i'',h_i) = s_{-i}(t_i,h_0)$ at all $h_i \in H_i$ for which $s_{-i}(t_i,h_0) \in S_{-i}(h_i)$, and (3) $s_{-i}(t_i'',h_i) = s_{-i}(t_i',h_i)$ at all other $h_i \in H_i$. Then, $t_i''$ has preference relation $P_i$, satisfies Bayesian updating and has initial belief $s_{-i}(t_i'',h_0) = s_{-i}(t_i,h_0) \in \times_{j \neq i} S_j^{k-1}(P)$. Since, moreover, $s_i$ is by construction sequentially rational for $t_i''$, we have that $s_i$ is sequentially rational for a type with preference relation $P_i$, satisfying Bayesian updating, and with initial belief in $\times_{j \neq i} S_j^{k-1}(P)$. On the other hand, $s_i \in DF_i^{k-1}(u)$ which, by induction assumption, is equal to $S_i^{k-1}(P)$. Hence, $s_i \in S_i^{k-1}(P)$. Together with the previous insight, we may thus conclude that $s_i \in S_i^k(P)$.

It thus follows that $S_i^k(P) = DF_i^k(u)$ for all players $i$ and all $k$, which implies that $S_i^\infty(P) = DF_i(u)$ for all players $i$. This completes the proof. ∎
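As a concrete illustration of the sets $DF_i^k(u)$, the following sketch (ours; the strategy names and payoffs are illustrative, not taken from the paper) runs the Dekel-Fudenberg procedure on a two-player strategic form: one simultaneous round of weak-dominance elimination, followed by iterated elimination of strongly dominated strategies. For simplicity it only checks domination by pure strategies, whereas the definition above allows mixed dominators $\mu_i$; on this small example the two notions select the same sets.

```python
# Dekel-Fudenberg procedure on a two-player strategic form (pure dominators only).

def dekel_fudenberg(game, S1, S2):
    """game maps (s1, s2) -> (u1, u2); returns the surviving strategy lists."""

    def dominates(player, s, t, opponents, strict):
        # Payoff differences of s over t for `player`, across opponent strategies.
        if player == 0:
            diffs = [game[(s, o)][0] - game[(t, o)][0] for o in opponents]
        else:
            diffs = [game[(o, s)][1] - game[(o, t)][1] for o in opponents]
        if strict:
            return all(d > 0 for d in diffs)
        return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)

    def eliminate(S1, S2, strict):
        # Simultaneous elimination for both players against the current sets.
        keep1 = [s for s in S1 if not any(dominates(0, t, s, S2, strict) for t in S1)]
        keep2 = [s for s in S2 if not any(dominates(1, t, s, S1, strict) for t in S2)]
        return keep1, keep2

    S1, S2 = eliminate(S1, S2, strict=False)   # one round of weak dominance
    while True:                                # then iterate strict dominance
        new1, new2 = eliminate(S1, S2, strict=True)
        if (new1, new2) == (S1, S2):
            return S1, S2
        S1, S2 = new1, new2

# A 2x2 strategic form of a short take-it-or-pass game (payoffs are ours).
game = {("T", "t"): (1, 0), ("T", "p"): (1, 0),
        ("P", "t"): (0, 2), ("P", "p"): (3, 1)}
S1, S2 = dekel_fudenberg(game, ["T", "P"], ["t", "p"])
# Here S1 == ["T"] and S2 == ["t"].
```

In this particular example the surviving profile coincides with the backward induction outcome; as noted above, in general the Dekel-Fudenberg procedure may retain strategies, and even outcomes, that backward induction rules out.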
We are now in a position to prove Theorem 6.1.

Proof of Theorem 6.1. For every player $i$, let $T_i(P)$ be the set of player $i$ types that have preference relation $P_i$ and respect common belief in the events that types believe in sequential rationality, satisfy Bayesian updating and initially believe in $P$. Let $S_i^*(P)$ be the set of player $i$ strategies that are sequentially rational for some type in $T_i(P)$.

We first show the implication from (1) to (2). Let $u$ be an arbitrary profile of utility functions that represents $P$. We show that $S_i^*(P) \subseteq DF_i(u)$. By Lemma 6.3 we know that $DF_i(u) = S_i^\infty(P)$, and hence it is sufficient to show that $S_i^*(P) \subseteq S_i^\infty(P)$, which in turn is equivalent to showing that $S_i^*(P) \subseteq S_i^k(P)$ for every $k$. We prove the latter claim by induction on $k$.

For $k = 1$, we must show that $S_i^*(P) \subseteq S_i^1(P)$ for every player $i$. Let $s_i \in S_i^*(P)$. Then, $s_i$ is sequentially rational for some type $t_i \in T_i(P)$ satisfying Bayesian updating and having preference relation $P_i$, and hence $s_i \in S_i^1(P)$.

Now, let $k \geq 2$, and assume that $S_j^*(P) \subseteq S_j^{k-1}(P)$ for every player $j$. Choose an arbitrary player $i$. We prove that $S_i^*(P) \subseteq S_i^k(P)$. Choose some $s_i \in S_i^*(P)$. Then, there is some type $t_i$ with preference relation $P_i$ and respecting common belief in the events that types believe in sequential rationality, satisfy Bayesian updating and initially believe in $P$, such that $s_i$ is sequentially rational for $t_i$. Fix an opponent $j$. Then, it follows that strategy $s_j(t_i,h_0)$ is sequentially rational for type $t_j(t_i,h_0)$, and, moreover, type $t_j(t_i,h_0)$ has preference relation $P_j$ and respects common belief in the events that types believe in sequential rationality, satisfy Bayesian updating and initially believe in $P$. Hence, $t_j(t_i,h_0) \in T_j(P)$. Since $s_j(t_i,h_0)$ is sequentially rational for $t_j(t_i,h_0) \in T_j(P)$, it follows that $s_j(t_i,h_0) \in S_j^*(P)$. We may thus conclude that $s_j(t_i,h_0) \in S_j^*(P)$ for every opponent $j$ and hence, by the induction assumption, $s_j(t_i,h_0) \in S_j^{k-1}(P)$ for all opponents $j$. Therefore, $s_i$ is sequentially rational for a type $t_i$ that has preference relation $P_i$, satisfies Bayesian updating, and has initial belief $s_{-i}(t_i,h_0)$ in $\times_{j \neq i} S_j^{k-1}(P)$. On the other hand, we know that $s_i \in S_i^*(P)$ which, by induction assumption, is a subset of $S_i^{k-1}(P)$. It thus follows that $s_i \in S_i^k(P)$. By induction, we may thus conclude that $S_i^*(P) \subseteq S_i^k(P)$ for every $k$ and every player $i$, and hence $S_i^*(P) \subseteq DF_i(u)$ for every player $i$. The implication from (1) to (2) thus follows.

We now show the implication from (2) to (1). Let $u$ be a profile of utility functions that represents $P$. We must show that $DF_i(u) \subseteq S_i^*(P)$ for all players $i$. By Lemma 6.3 we know that $DF_i(u) = S_i^\infty(P)$, and hence it is sufficient to show that $S_i^\infty(P) \subseteq S_i^*(P)$ for every player $i$.

By construction of $S_i^\infty(P)$, we may find for every $s_i \in S_i^\infty(P)$ some type $t_i$ with preference relation $P_i$, satisfying Bayesian updating and having initial belief $s_{-i}(t_i,h_0) \in \times_{j \neq i} S_j^\infty(P)$, such that $s_i$ is sequentially rational for $t_i$. Hence, for every $s_i \in S_i^\infty(P)$ there is an updating system $c_i(s_i) = (c_i(s_i)(h_i))_{h_i \in H_i}$, with $c_i(s_i)(h_i) \in S_{-i}(h_i)$ for every $h_i \in H_i$ (see Section 4.1) and $c_i(s_i)(h_0) \in \times_{j \neq i} S_j^\infty(P)$, such that the updating system satisfies Bayesian updating, and $s_i$ is sequentially rational with respect to $c_i(s_i)$ and $P_i$. More generally, for every player $i$ and every strategy $s_i \in S_i$ we may find some updating system $c_i(s_i)$ with $c_i(s_i)(h_0) \in \times_{j \neq i} S_j^\infty(P)$, and some preference relation $P_i(s_i)$, not necessarily equal to $P_i$, such that $c_i(s_i)$ satisfies Bayesian updating and $s_i$ is sequentially rational with respect to $c_i(s_i)$ and $P_i(s_i)$. For every $s_i \in S_i^\infty(P)$, simply set $P_i(s_i) = P_i$.

Now, suppose that these updating systems $c_i(s_i)$ and preference relations $P_i(s_i)$ have been defined for all players $i$ and strategies $s_i$. We may construct for every player $i$ and every strategy $s_i$ a type $t_i(s_i)$ with the following properties: (a) the preference relation of $t_i(s_i)$ is given by $P_i(s_i)$; (b) for every $h_i \in H_i$ and opponent $j$, the conditional belief $s_j(t_i(s_i),h_i)$ about player $j$'s strategy choice is given by $c_{ij}(s_i)(h_i)$, where $c_{ij}(s_i)(h_i)$ is the belief at $h_i$ about player $j$'s strategy choice in the updating system $c_i(s_i)$; and (c) for every $h_i \in H_i$ and opponent $j$, the conditional belief $t_j(t_i(s_i),h_i)$ about player $j$'s type is given by $t_j(s_j)$, where $s_j = c_{ij}(s_i)(h_i)$.

Claim. Every type $t_i(s_i)$ respects common belief in the events that types believe in sequential rationality, satisfy Bayesian updating and initially believe in $P$.

Proof of claim. Define $\tilde{T} = \{t_i(s_i) \mid i \in I$ and $s_i \in S_i\}$. By construction of the types $t_i(s_i)$, we have that $t_j(t_i(s_i),h_i) \in \tilde{T}$ for every player $i$, decision node $h_i \in H_i$ and opponent $j$. Hence, in order to show the claim, it is sufficient to show that every type in $\tilde{T}$ believes in sequential rationality, satisfies Bayesian updating and initially believes in $P$.

Belief in sequential rationality. Let $t_i(s_i)$ be a type in $\tilde{T}$. At every $h_i \in H_i$, the type $t_i(s_i)$ believes that opponent $j$ chooses strategy $s_j = c_{ij}(s_i)(h_i)$ and believes that opponent $j$ has type $t_j(s_j)$. By construction, type $t_j(s_j)$'s conditional beliefs about the opponents' strategy choices are given by $c_j(s_j)$. Since $s_j$ is sequentially rational with respect to $c_j(s_j)$ and $P_j(s_j)$, it follows that $s_j$ is sequentially rational for $t_j(s_j)$. It therefore follows that the strategy $s_j(t_i(s_i),h_i) = s_j$ is sequentially rational for $t_j(t_i(s_i),h_i) = t_j(s_j)$ for every $h_i \in H_i$ and every opponent $j$, which implies that $t_i(s_i)$ believes in sequential rationality.

Bayesian updating. Bayesian updating of $t_i(s_i) \in \tilde{T}$ follows immediately from the fact that $t_i(s_i)$'s conditional beliefs about the opponents' strategy choices are given by $c_i(s_i)$, and the assumption that $c_i(s_i)$ satisfies Bayesian updating.

Initial belief in $P$. Let $t_i(s_i)$ be a type in $\tilde{T}$. Fix an opponent $j$. By definition, we have that $s_j(t_i(s_i),h_0) = c_{ij}(s_i)(h_0)$. Since, by construction, $c_i(s_i)(h_0) \in \times_{j \neq i} S_j^\infty(P)$, we have that $s_j = c_{ij}(s_i)(h_0) \in S_j^\infty(P)$. As such, $t_i(s_i)$ initially believes that player $j$ chooses a strategy $s_j \in S_j^\infty(P)$ and has type $t_j(s_j)$, which has preference relation $P_j(s_j) = P_j$ since $s_j \in S_j^\infty(P)$. We may thus conclude that $t_i(s_i)$ initially believes that player $j$ has preference relation $P_j$. Since this holds for all opponents $j$, it follows that $t_i(s_i)$ initially believes in $P$.

We have thus shown that every type $t_i(s_i)$ in $\tilde{T}$ believes in sequential rationality, satisfies Bayesian updating, and initially believes in $P$. This implies the statement in the claim.

Recall that it is our objective to show that $S_i^\infty(P) \subseteq S_i^*(P)$. Take some arbitrary $s_i \in S_i^\infty(P)$. Then, $s_i$ is sequentially rational for the type $t_i(s_i)$, and $t_i(s_i)$ has preference relation $P_i(s_i) = P_i$. By the claim above, it follows that $t_i(s_i)$ has preference relation $P_i$ and respects common belief in the events that types believe in sequential rationality, satisfy Bayesian updating and initially believe in $P$. Hence, $t_i(s_i) \in T_i(P)$. Since $s_i$ is sequentially rational for $t_i(s_i)$, we have that $s_i \in S_i^*(P)$. We thus have shown that $S_i^\infty(P) \subseteq S_i^*(P)$, which implies that $DF_i(u) \subseteq S_i^*(P)$ for all players $i$. This establishes the implication from (2) to (1), and completes the proof of this theorem. ∎

7. Appendix

Proof of Lemma 4.8. Let $u_1$ be an arbitrary utility representation of $P_1$, and let the utility functions $u_2$ and $\tilde{u}_2$ be as stated in the lemma. Let $D(P_1,P_2)$ be the set of unordered pairs of terminal nodes on which $P_1$ and $P_2$ disagree. Similarly, we define $D(P_1,\tilde{P}_2)$. Without loss of generality, let $a$ and $b$ in the lemma be chosen such that $u_1(a) > u_1(b)$. Then, by construction, $u_2(a) < u_2(b)$ and $\tilde{u}_2(a) > \tilde{u}_2(b)$; recall that $\tilde{u}_2$ coincides with $u_2$ except that the values of $a$ and $b$ are interchanged, that is, $\tilde{u}_2(a) = u_2(b)$ and $\tilde{u}_2(b) = u_2(a)$. We prove our result through a series of smaller facts. The proof for each of these facts is given in the lines immediately following the statement of the fact.

Fact 1. It holds that $\{a,b\} \in D(P_1,P_2)$, but $\{a,b\} \notin D(P_1,\tilde{P}_2)$.
This follows directly from the observation that $u_1(a) > u_1(b)$ and $\tilde{u}_2(a) > \tilde{u}_2(b)$, but $u_2(a) < u_2(b)$.

Fact 2. Let $\{x,y\}$ be such that $x,y \notin \{a,b\}$. Then, $\{x,y\} \in D(P_1,\tilde{P}_2)$ if and only if $\{x,y\} \in D(P_1,P_2)$.
This follows directly from the observation that $\tilde{u}_2(x) = u_2(x)$ and $\tilde{u}_2(y) = u_2(y)$.

Fact 3. Let $\{a,y\} \in D(P_1,\tilde{P}_2)$ be such that $\tilde{u}_2(y) > \tilde{u}_2(a)$. Then, $\{a,y\} \in D(P_1,P_2)$.
Since $\{a,y\} \in D(P_1,\tilde{P}_2)$ and $\tilde{u}_2(y) > \tilde{u}_2(a)$, we must have that $u_1(a) > u_1(y)$. By construction, $\tilde{u}_2(a) = u_2(b)$ and $\tilde{u}_2(y) = u_2(y)$. Since $u_2(a) < u_2(b)$, it follows that $u_2(a) < u_2(b) = \tilde{u}_2(a) < \tilde{u}_2(y) = u_2(y)$, which implies that $\{a,y\} \in D(P_1,P_2)$.

Fact 4. Let $\{a,y\} \in D(P_1,\tilde{P}_2)$ be such that $\tilde{u}_2(a) > \tilde{u}_2(y) > \tilde{u}_2(b)$. Then, $\{b,y\} \in D(P_1,P_2)$.
Since $\{a,y\} \in D(P_1,\tilde{P}_2)$ and $\tilde{u}_2(a) > \tilde{u}_2(y)$, we must have that $u_1(a) < u_1(y)$. As $u_1(a) > u_1(b)$, it follows that $u_1(b) < u_1(y)$. By definition of $\tilde{u}_2$, we have that $\tilde{u}_2(a) = u_2(b)$ and $\tilde{u}_2(y) = u_2(y)$. Since $\tilde{u}_2(a) > \tilde{u}_2(y)$, we have that $u_2(b) > u_2(y)$, which implies that $\{b,y\} \in D(P_1,P_2)$.

Fact 5. Let $\{a,y\} \in D(P_1,\tilde{P}_2)$ be such that $\tilde{u}_2(y) < \tilde{u}_2(b)$. Then, $\{a,y\} \in D(P_1,P_2)$.
As $\tilde{u}_2(y) < \tilde{u}_2(b)$ and $\tilde{u}_2(a) > \tilde{u}_2(b)$, we may conclude that $\tilde{u}_2(a) > \tilde{u}_2(y)$. Since $\{a,y\} \in D(P_1,\tilde{P}_2)$, we must have that $u_1(a) < u_1(y)$. By definition of $\tilde{u}_2$, it is seen that $\tilde{u}_2(y) = u_2(y)$ and $\tilde{u}_2(b) = u_2(a)$. As $\tilde{u}_2(y) < \tilde{u}_2(b)$, it follows that $u_2(a) > u_2(y)$, and hence $\{a,y\} \in D(P_1,P_2)$.

Fact 6. Let $\{b,y\} \in D(P_1,\tilde{P}_2)$ be such that $\tilde{u}_2(y) > \tilde{u}_2(a)$. Then, $\{b,y\} \in D(P_1,P_2)$.
As $\tilde{u}_2(b) < \tilde{u}_2(a) < \tilde{u}_2(y)$ and $\{b,y\} \in D(P_1,\tilde{P}_2)$, we must have that $u_1(b) > u_1(y)$. By definition of $\tilde{u}_2$, it holds that $\tilde{u}_2(a) = u_2(b)$ and $\tilde{u}_2(y) = u_2(y)$. Since $\tilde{u}_2(y) > \tilde{u}_2(a)$, we know that $u_2(y) > u_2(b)$, and hence $\{b,y\} \in D(P_1,P_2)$.

Fact 7. Let $\{b,y\} \in D(P_1,\tilde{P}_2)$ be such that $\tilde{u}_2(a) > \tilde{u}_2(y) > \tilde{u}_2(b)$. Then, $\{a,y\} \in D(P_1,P_2)$.
As $\tilde{u}_2(b) < \tilde{u}_2(y)$ and $\{b,y\} \in D(P_1,\tilde{P}_2)$, we may conclude that $u_1(b) > u_1(y)$. Since $u_1(a) > u_1(b)$, it follows that $u_1(a) > u_1(y)$. On the other hand, we know by definition of $\tilde{u}_2$ that $\tilde{u}_2(b) = u_2(a)$ and $\tilde{u}_2(y) = u_2(y)$. As $\tilde{u}_2(b) < \tilde{u}_2(y)$, it follows that $u_2(a) < u_2(y)$, and hence $\{a,y\} \in D(P_1,P_2)$.

Fact 8. Let $\{b,y\} \in D(P_1,\tilde{P}_2)$ be such that $\tilde{u}_2(y) < \tilde{u}_2(b)$. Then, $\{b,y\} \in D(P_1,P_2)$.
Since $\tilde{u}_2(b) > \tilde{u}_2(y)$ and $\{b,y\} \in D(P_1,\tilde{P}_2)$, it must be the case that $u_1(b) < u_1(y)$. By construction of $\tilde{u}_2$, it holds that $\tilde{u}_2(b) = u_2(a)$ and $\tilde{u}_2(y) = u_2(y)$. As $u_2(b) > u_2(a) > u_2(y)$, we have that $u_2(b) > u_2(y)$, and hence $\{b,y\} \in D(P_1,P_2)$.

From Facts 1 to 8, it follows that $D(P_1,\tilde{P}_2)$ contains strictly fewer pairs than $D(P_1,P_2)$, and hence $d(P_1,\tilde{P}_2) < d(P_1,P_2)$. This completes the proof. ∎
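The counting argument above can be checked numerically. The sketch below is ours (the outcome names and utilities are illustrative): it computes the set $D(P_1,P_2)$ of unordered pairs on which two utility functions disagree, taken here as strictly opposite rankings, and verifies on a small example that swapping the values of a disagreement pair $\{a,b\}$ in $u_2$ strictly lowers the count, in line with Lemma 4.8.

```python
from itertools import combinations

def disagreements(u1, u2):
    """Unordered pairs {x, y} on which u1 and u2 rank x and y in strictly
    opposite ways (the set D(P1, P2) of the appendix)."""
    return {frozenset((x, y))
            for x, y in combinations(sorted(u1), 2)
            if (u1[x] - u1[y]) * (u2[x] - u2[y]) < 0}

def swap_values(u, a, b):
    """The revised utility function: u with the values of a and b swapped."""
    v = dict(u)
    v[a], v[b] = u[b], u[a]
    return v

# Illustrative outcomes and utilities (ours): u2 fully reverses u1's ranking.
u1 = {"x": 3, "y": 2, "z": 1}
u2 = {"x": 1, "y": 2, "z": 3}
d_before = len(disagreements(u1, u2))         # all 3 pairs disagree
u2_swapped = swap_values(u2, "x", "z")        # swap a disagreement pair
d_after = len(disagreements(u1, u2_swapped))  # strictly fewer: 0 here
```

The lemma only guarantees a strict decrease; in this extreme example a single swap already removes every disagreement, since the revised function coincides with u1.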
References

[1] Asheim, G.B. (2002), On the epistemic foundation for backward induction, Mathematical Social Sciences 44, 121-144.
[2] Asheim, G.B. and A. Perea (2004), Sequential and quasi-perfect rationalizability in extensive games, forthcoming in Games and Economic Behavior.
[3] Aumann, R. (1995), Backward induction and common knowledge of rationality, Games and Economic Behavior 8, 6-19.
[4] Balkenborg, D. and E. Winter (1997), A necessary and sufficient epistemic condition for playing backward induction, Journal of Mathematical Economics 27, 325-345.
[5] Battigalli, P. and M. Siniscalchi (1999), Hierarchies of conditional beliefs, and interactive epistemology in dynamic games, Journal of Economic Theory 88, 188-230.
[6] Ben-Porath, E. (1997), Rationality, Nash equilibrium and backwards induction in perfect-information games, Review of Economic Studies 64, 23-46.
[7] Dekel, E. and D. Fudenberg (1990), Rational behavior with payoff uncertainty, Journal of Economic Theory 52, 243-267.
[8] Ha, V. and P. Haddawy (1998), Towards case-based preference elicitation: similarity measures on preference structures, Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, 193-201.
[9] Pearce, D.G. (1984), Rationalizable strategic behavior and the problem of perfection, Econometrica 52, 1029-1050.
[10] Perea, A. (2002), A note on the one-deviation property in extensive form games, Games and Economic Behavior 40, 322-338.
[11] Perea, A. (2003a), Forward induction and the minimum revision principle, Maastricht University.
[12] Perea, A. (2003b), Rationalizability and minimal complexity in dynamic games, Maastricht University.
[13] Perea, A. (2004), Proper rationalizability and belief revision in dynamic games, Maastricht University.
[14] Rubinstein, A. (1991), Comments on the interpretation of game theory, Econometrica 59, 909-924.
[15] Samet, D. (1996), Hypothetical knowledge and games with perfect information, Games and Economic Behavior 17, 230-251.
[16] Schulte, O. (2002), Minimal belief change, Pareto-optimality and logical consequence, Economic Theory 19, 105-144.
[17] Stalnaker, R. (1998), Belief revision in games: forward and backward induction, Mathematical Social Sciences 36, 31-56.