Rational learning
Scribed in part: ?? , Jingu Kim
Notes for CS 8803 - Game Theory and Computer Science. Spring 2008
Bayesian Repeated Games
We consider a repeated game with the following characteristics:
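- $N$: Player set.
- $A = \times_{i \in N} A_i$: Action set.
- $H = \bigcup_t H^t$: History set, where $H^t = A \times \cdots \times A$ ($t$ times).
- $u_i : A \to \mathbb{R}$: Utility function.
- $\lambda = (\lambda_i)_{i \in N}$: Vector of discounting factors.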
-
: Player i's prior distribution for player j,
.
- $s_i : H \to A_i$: Pure strategy.
- $\alpha_i : H \to \Delta(A_i)$: Mixed strategy.
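The situation has two caveats:
i) Player $i$ knows only $A$ and $u_i$.
ii) Perfect monitoring: For each period $t$, player $i$ picks $a_i^t \in A_i$ based (only) on the history $h^t$, $A$ and $u_i$. He then gets total utility $\sum_t \lambda_i^t \, u_i(a^t)$, where $a^t$ is the action profile played in period $t$.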
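As a small illustration of the discounted total utility, here is a minimal Python sketch. The function name `total_utility`, the string action labels, the stage utility `u1`, and the convention that period t = 1, 2, ... is weighted by the discount factor raised to the power t are all assumptions made for this example, not part of the notes.

```python
from typing import Callable, Sequence, Tuple

ActionProfile = Tuple[str, ...]


def total_utility(path: Sequence[ActionProfile],
                  u_i: Callable[[ActionProfile], float],
                  lam_i: float) -> float:
    """Discounted sum of stage utilities along a finite play path.

    Assumed convention: periods are numbered t = 1, 2, ... and the
    utility earned in period t is weighted by lam_i ** t.
    """
    return sum(lam_i ** t * u_i(a) for t, a in enumerate(path, start=1))


# Hypothetical stage utility for player 1: 1 if both cooperate, else 0.
def u1(a: ActionProfile) -> float:
    return 1.0 if a == ("C", "C") else 0.0


value = total_utility([("C", "C"), ("C", "C"), ("D", "C")], u1, 0.5)
```

A finite path only approximates the infinite sum, but with a discount factor below 1 the neglected tail shrinks geometrically.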
Example: Repeated Prisoner's Dilemma
In the repeated prisoner's dilemma, for some
we define the grim trigger strategy, given by
.
Then (gd,gd) is a Nash equilibrium for the repeated game, for any
and λi close enough to 1.
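The role of the discount factor can be checked numerically. The sketch below assumes the standard grim trigger (cooperate until any defection is observed, then defect forever), hypothetical stage payoffs (3 for mutual cooperation, 5 and 0 for unilateral defection, 1 for mutual defection), and a truncated horizon. It only tests one family of deviations, "follow grim trigger, then defect from period k on", so it is a sanity check rather than a proof.

```python
from typing import Callable, List, Tuple

C, D = "C", "D"
History = List[Tuple[str, str]]
Strategy = Callable[[History], str]

# Hypothetical stage payoffs: (player 1's payoff, player 2's payoff).
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


def grim_trigger(history: History) -> str:
    """Cooperate until any defection appears in the history, then defect."""
    return C if all(a == (C, C) for a in history) else D


def defect_from(k: int) -> Strategy:
    """Deviation: play grim trigger for the first k periods, then always defect."""
    def strategy(history: History) -> str:
        return D if len(history) >= k else grim_trigger(history)
    return strategy


def payoff_player1(s1: Strategy, s2: Strategy, lam: float,
                   horizon: int = 200) -> float:
    """Truncated discounted payoff of player 1 (period t weighted by lam ** t)."""
    history: History = []
    total = 0.0
    for t in range(1, horizon + 1):
        a = (s1(history), s2(history))
        total += lam ** t * PAYOFF[a][0]
        history.append(a)
    return total


def best_deviation_gain(lam: float, max_k: int = 20) -> float:
    """Largest gain over (grim, grim) among the deviations defect_from(k)."""
    on_path = payoff_player1(grim_trigger, grim_trigger, lam)
    return max(payoff_player1(defect_from(k), grim_trigger, lam) - on_path
               for k in range(max_k + 1))
```

With these payoffs, `best_deviation_gain(0.9)` is negative (no tested deviation pays), while `best_deviation_gain(0.3)` is positive, which is consistent with the requirement that the discount factor be close enough to 1.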
Theorem. Let α be a Bayesian equilibrium. Then for every ε > 0 there is a time τ0 such that after τ0, play under α is ε-close to an ε-Nash equilibrium of the repeated game with known types.
Intuition
Consider two players ($|N| = 2$), a pure-strategy Bayesian equilibrium $s = (s_1, s_2)$, and finite type sets $T_i$.
Subjective equilibrium
Let $s_{ij}$ be player $i$'s belief about player $j$'s strategy.
-
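- $s_{ii}$ is a best response given $s_{i(-i)}$
- $s_{ii}(h^\tau) = s_{ji}(h^\tau)$ on the play path $h^\tau$, which is defined recursively by
$h^1 = \emptyset$
$h^{\tau+1} = (h^\tau, (s_{11}(h^\tau), s_{22}(h^\tau)))$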
- sij (
) is a subjective equilibrium
is a Nash equilibrium of repeated game.
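The play-path recursion and the "beliefs are correct on the play path" condition can be sketched in Python as follows. The names `play_path` and `beliefs_correct_on_path`, the dictionary keyed by (i, j) holding each belief, and the example strategies are assumptions for illustration. The best-response condition is not checked here.

```python
from typing import Callable, Dict, List, Tuple

History = Tuple[Tuple[str, str], ...]
Strategy = Callable[[History], str]


def play_path(s11: Strategy, s22: Strategy, periods: int) -> List[History]:
    """Return [h^1, ..., h^periods]: h^1 is empty and each later history
    appends the action pair (s11(h), s22(h)) to the previous history h."""
    h: History = ()
    path = [h]
    for _ in range(periods - 1):
        h = h + ((s11(h), s22(h)),)
        path.append(h)
    return path


def beliefs_correct_on_path(beliefs: Dict[Tuple[int, int], Strategy],
                            periods: int) -> bool:
    """beliefs[(i, j)] is player i's belief about player j's strategy.
    Check that s_ii(h) == s_ji(h) at every history h on the play path."""
    path = play_path(beliefs[(1, 1)], beliefs[(2, 2)], periods)
    return all(beliefs[(i, i)](h) == beliefs[(j, i)](h)
               for h in path
               for i, j in ((1, 2), (2, 1)))


# Hypothetical strategies for a repeated prisoner's dilemma.
def grim(h: History) -> str:
    return "C" if all(a == ("C", "C") for a in h) else "D"


def tit_for_tat_p2(h: History) -> str:
    # Player 2 cooperates first, then copies player 1's previous action.
    return "C" if not h else h[-1][0]


def always_defect(h: History) -> str:
    return "D"


# Player 1 wrongly believes player 2 plays tit-for-tat, but the two
# strategies agree on the (all-cooperate) play path.
consistent = {(1, 1): grim, (2, 2): grim, (1, 2): tit_for_tat_p2, (2, 1): grim}
# Player 1 believes player 2 always defects, which fails in period 1.
inconsistent = {(1, 1): grim, (2, 2): grim, (1, 2): always_defect, (2, 1): grim}
```

The first example shows why the condition is weaker than knowing the opponent's strategy: the belief may differ from the true strategy off the play path and still pass the check.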
Bayesian equilibrium
such that after 