Price of anarchy

Notes for CS 8803 - Game Theory and Computer Science. Spring 2008

Scribe: Fabio Cunial

Given a game <math>G=(N,A,u)</math>, it is natural to consider the social welfare. There are many possible social welfare functions. We will consider both the utilitarian function,

<math>w(a) = \sum_{i=1}^N u_i(a)</math>,

and the egalitarian function,

<math>w(a) = \min_i u_i(a)</math>.

We will later come to other functions, when we discuss the bargaining problem. The social optimum maximizes <math>w(a)</math> over all possible action profiles, <math>a \in A</math>.

The social optimum is a nice notion, but it is generally not (individually) rational for people to play it. For example, in the prisoner's dilemma, the socially optimal play is for both players to cooperate, yet defecting is a dominant strategy.
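As a small illustration, the sketch below evaluates both welfare functions on a prisoner's dilemma. The payoff numbers and the helper names (`PAYOFFS`, `social_optimum`, `is_strictly_dominant`) are assumptions chosen for this example; the notes do not fix a particular payoff matrix.

```python
# Hypothetical prisoner's dilemma payoffs (illustrative numbers, not from the notes).
# Actions: "C" = cooperate, "D" = defect. Each entry is (u_1, u_2).
PAYOFFS = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-3, 0),
    ("D", "C"): (0, -3),
    ("D", "D"): (-2, -2),
}
ACTIONS = ("C", "D")


def utilitarian(profile):
    """w(a) = sum of all players' utilities."""
    return sum(PAYOFFS[profile])


def egalitarian(profile):
    """w(a) = utility of the worst-off player."""
    return min(PAYOFFS[profile])


def social_optimum(welfare):
    """Action profile that maximizes the given welfare function."""
    return max(PAYOFFS, key=welfare)


def is_strictly_dominant(player, action):
    """True if `action` beats every alternative against every opponent action."""
    def payoff(own, other):
        profile = (own, other) if player == 0 else (other, own)
        return PAYOFFS[profile][player]

    return all(
        payoff(action, other) > payoff(alt, other)
        for other in ACTIONS
        for alt in ACTIONS
        if alt != action
    )
```

With these numbers, both welfare functions are maximized by mutual cooperation, while defecting is strictly dominant for each player.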

The price of anarchy is a measure of how well people do when they play selfishly (Nash equilibrium) instead of according to a central authority (social optimum). It is defined to be the ratio of the social optimum welfare to the welfare of the worst Nash equilibrium. That is, it is the ratio of the largest social welfare achievable by any action profile to the smallest social welfare achieved at any Nash equilibrium. In many cases, we will consider costs instead of utilities.

The pure price of anarchy considers only pure Nash equilibria, i.e., Nash equilibria in which all players play pure strategies (no randomization). Such equilibria may not exist, in which case the pure price of anarchy is not defined. The mixed price of anarchy, defined above, considers mixed strategies. The price of anarchy also depends on our choice of social welfare function.

The term price of anarchy (PoA) was invented by Papadimitriou (yes, the PPAD guy) and has been recognized by computer scientists as an important game theoretic notion. Although it is not inherently a computer science term, it is one of the recent contributions of computer science to game theory. While game theorists certainly considered games where PoA=1, they did not have a metric for equilibrium quality. We will also refer to the PoA of an equilibrium, meaning the ratio of the social optimum welfare to the social welfare achieved at that equilibrium.

PoA in Job Scheduling

Our first example will be job scheduling. In this setting, there are <math>N</math> players (jobs) and <math>M</math> machines. Each machine has a speed <math>s_1,\ldots,s_M>0</math>. Each job has a weight <math>w_1,\ldots,w_N>0</math>. Each player chooses a machine to run her job on. Hence, the actions of each player are <math>A_i=\{1,2,\ldots,M\}</math>.

Define the load on machine <math>j</math> to be:

<math>L_j(a)=\frac{\sum_{i:a_i=j} w_i}{s_j}</math>.

The utility for player <math>i</math> is <math>u_i(a)=-L_{a_i}(a)</math>, i.e., the negative load of the machine she chose. For notational ease, we will define costs <math>c_i=-u_i</math> so we can talk about positive costs instead of negative utilities. So <math>c_i(a)=L_{a_i}(a)</math>. In this case, we will consider the egalitarian welfare, i.e., try to minimize the maximum load. This quantity <math>\mbox{MS}(a)=\max_j L_j(a)</math> is called the makespan.
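The definitions above translate directly into code. The following is a minimal sketch, assuming machines are indexed from 0 (the notes index them from 1) and using a made-up instance; the function names are not from the notes.

```python
def loads(assignment, weights, speeds):
    """L_j(a) for every machine j; assignment[i] is the machine chosen by job i."""
    total_weight = [0.0] * len(speeds)
    for job, machine in enumerate(assignment):
        total_weight[machine] += weights[job]
    return [w / s for w, s in zip(total_weight, speeds)]


def cost(job, assignment, weights, speeds):
    """c_i(a): the load of the machine that job i chose."""
    return loads(assignment, weights, speeds)[assignment[job]]


def makespan(assignment, weights, speeds):
    """MS(a): the largest load over all machines."""
    return max(loads(assignment, weights, speeds))


# Made-up instance: three jobs, two machines, machine 0 is twice as fast.
example_loads = loads((0, 0, 1), weights=[2, 1, 1], speeds=[2, 1])
```

Here jobs 0 and 1 share the fast machine, so its load is (2 + 1) / 2, while the slow machine carries a single job of weight 1.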

For this problem, which involves costs, we will consider the PoA to be the ratio of the largest MS at any Nash equilibrium to the smallest possible MS. It should be clear that mixed PoA <math>\geq</math> pure PoA, because any pure Nash equilibrium is also a mixed Nash equilibrium. This inequality can be strict: e.g., when <math>N=2</math>, <math>w_1=w_2=1</math>, <math>M=2</math>, and <math>s_1=s_2=1</math>, the mixed strategies <math>\sigma_1=\sigma_2=(1/2,1/2)</math> form a Nash equilibrium with an expected makespan of 1.5 against an optimal makespan of 1, so the mixed PoA is at least 1.5, while the pure PoA in this setting is <math>\leq 4/3</math>. First we need to argue that there exist pure Nash equilibria.
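The expected makespan in the two-job example can be checked by enumerating all pure profiles and weighting each by its probability. This is a brute-force sketch (exponential in the number of jobs); the function name and the 0-based machine indices are assumptions of the example.

```python
from itertools import product


def expected_makespan(strategies, weights, speeds):
    """Expected makespan when job i picks machine j with probability strategies[i][j]."""
    num_machines = len(speeds)
    total = 0.0
    for profile in product(range(num_machines), repeat=len(weights)):
        # Probability that the independent random choices produce this profile.
        prob = 1.0
        for job, machine in enumerate(profile):
            prob *= strategies[job][machine]
        # Loads and makespan of this pure profile.
        load = [0.0] * num_machines
        for job, machine in enumerate(profile):
            load[machine] += weights[job] / speeds[machine]
        total += prob * max(load)
    return total


uniform = [(0.5, 0.5), (0.5, 0.5)]
split = [(1.0, 0.0), (0.0, 1.0)]  # pure profile: one job per machine
```

With two unit jobs on two unit-speed machines, the uniform strategies collide half of the time (makespan 2) and separate half of the time (makespan 1).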

Claim. For each job scheduling game, there exists at least one pure-strategy Nash equilibrium.

Proof. We would like to take a socially optimal action profile <math>a^*</math>. This would mean simply an action profile whose makespan is minimum. However, this will not be enough. There may be several such action profiles leading to a variety of different load distributions (all having the same maximum load). Among these, we further restrict ourselves to one that has a minimum second-largest load. Again, this results in a set of possible load distributions, and we repeat until the <math>M</math>th-largest (i.e., smallest) load, where there can only be one distribution of loads (unique up to permutation). This is also called the lexicographically smallest sorted load vector.

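We claim that <math>a^*</math> is a pure-strategy Nash equilibrium. Suppose not. Suppose that some player <math>i</math> could strictly improve by moving from machine <math>j</math> to machine <math>k</math>. After the move, the load of <math>j</math> goes down, and the new load of <math>k</math> is strictly smaller than the old load of <math>j</math> (otherwise <math>i</math> would not improve). Hence the larger of the two loads strictly decreases while all other loads are unchanged, so the sorted load vector becomes lexicographically smaller. But this violates the lexicographic minimality of <math>a^*</math>. Q.E.D.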
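The construction in the proof can be tried out on small instances. The sketch below finds a profile with the lexicographically smallest sorted load vector by brute force and checks that no job can strictly improve by moving. It assumes integer weights and speeds (so that exact `Fraction` arithmetic applies) and 0-based machine indices; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def loads(assignment, weights, speeds):
    """Exact load of every machine; assignment[i] is the machine of job i."""
    load = [Fraction(0)] * len(speeds)
    for job, machine in enumerate(assignment):
        load[machine] += Fraction(weights[job], speeds[machine])
    return load


def lex_min_profile(weights, speeds):
    """Profile whose loads, sorted from largest to smallest, are lexicographically minimal."""
    profiles = product(range(len(speeds)), repeat=len(weights))
    return min(
        profiles,
        key=lambda a: sorted(loads(a, weights, speeds), reverse=True),
    )


def is_pure_nash(assignment, weights, speeds):
    """True if no single job can strictly lower its own load by switching machines."""
    load = loads(assignment, weights, speeds)
    for job, current in enumerate(assignment):
        for other in range(len(speeds)):
            if other == current:
                continue
            load_after_move = load[other] + Fraction(weights[job], speeds[other])
            if load_after_move < load[current]:
                return False
    return True
```

For example, `is_pure_nash(lex_min_profile([3, 2, 2, 1], [2, 1]), [3, 2, 2, 1], [2, 1])` evaluates the claim on a made-up instance with four jobs and two machines.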

Claim. For each job scheduling game, the pure PoA is at most <math>M</math>.

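Proof. In this proof the social cost of a profile is its makespan <math>\mbox{MS}</math>. We first upper-bound the makespan of any pure-strategy Nash equilibrium <math>a</math> by

<math>\mbox{MS}(a) \leq \frac{\sum_i{w_i}}{\max_j{s_j}}</math>.

Indeed, any job on a most-loaded machine could move to a fastest machine, where it would see a load of at most <math>\sum_i w_i / \max_j s_j</math>; at equilibrium this move cannot be a strict improvement. Next, consider any pure-strategy action profile <math>a'</math>. Since <math>\sum_j s_j \cdot L_j(a') = \sum_i w_i</math>, the largest load satisfies

<math>\mbox{MS}(a') \geq \frac{\sum_i{w_i}}{\sum_j{s_j}} \geq \frac{\sum_i{w_i}}{M \cdot \max_j{s_j}}</math>.

Since the above holds for the social optimum as well, dividing the first bound by the second proves the claim. Q.E.D.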
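On small instances the pure PoA can be computed exhaustively and compared with the bound <math>M</math>. This is only a brute-force sketch, assuming integer weights and speeds and 0-based machine indices; the instance in the usage note is made up.

```python
from fractions import Fraction
from itertools import product


def machine_loads(assignment, weights, speeds):
    """Exact load of every machine under the given assignment."""
    load = [Fraction(0)] * len(speeds)
    for job, machine in enumerate(assignment):
        load[machine] += Fraction(weights[job], speeds[machine])
    return load


def is_pure_nash(assignment, weights, speeds):
    """No job can strictly reduce its load by moving to another machine."""
    load = machine_loads(assignment, weights, speeds)
    return all(
        load[current] <= load[other] + Fraction(weights[job], speeds[other])
        for job, current in enumerate(assignment)
        for other in range(len(speeds))
        if other != current
    )


def pure_poa(weights, speeds):
    """Worst equilibrium makespan divided by the optimal makespan."""
    profiles = list(product(range(len(speeds)), repeat=len(weights)))
    optimum = min(max(machine_loads(a, weights, speeds)) for a in profiles)
    worst_equilibrium = max(
        max(machine_loads(a, weights, speeds))
        for a in profiles
        if is_pure_nash(a, weights, speeds)
    )
    return worst_equilibrium / optimum
```

For weights `[2, 2, 1, 1]` on two unit-speed machines, the worst equilibrium puts both heavy jobs on the same machine (makespan 4) while the optimum balances the loads (makespan 3), so `pure_poa` returns `Fraction(4, 3)`, well below <math>M=2</math>.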

PoA in Routing

Braess' paradox

Consider a road network in which a fixed number of drivers need to move from a common source to a common destination; assume that each driver chooses her route selfishly, and that the time to traverse a road depends linearly on the number of drivers choosing that road. We can formalize this setting as a routing problem in a directed, connected graph <math>G=(V, E)</math>, in which we want to send one unit of flow from a source node <math>s \in V</math> to a destination node <math>t \in V</math> (imagine the flow to be composed of the travel decisions of the different drivers). In particular, let the flow be a function <math>f: E \to \Re</math> assigning to each edge <math>e</math> a non-negative real number <math>f_e</math>, and consider the set of linear functions <math>L = \{ l_e(f_e) = a_e \cdot f_e + b_e \; | \; e \in E, \; a_e \geq 0, \; b_e \geq 0\}</math> that map the flow traversing each edge to the latency of that edge. Let's also define the social welfare of a flow <math>f</math> as <math>w(f) = \sum_e{f_e \cdot l_e(f_e)}</math>; this is a total travel time, so smaller values are better.


Consider the example in the figure: if the dashed road is not available, the mixed-strategy Nash equilibrium happens when each player chooses the top route and the bottom route with the same probability: this equilibrium has social welfare 1.5, and each driver takes 1.5 units of time to go from <math>s</math> to <math>t</math>. Hoping to improve the performance of the network, a legislator could decide to make the dashed, low-latency edge available to the drivers: in this case, the only Nash equilibrium would happen when every driver uses the new road, therefore the social welfare (total travel time) would worsen to 2 and each driver would now take 2 units of time to go from <math>s</math> to <math>t</math>.
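The numbers in this example can be reproduced with a few lines of code. Since the figure is not reproduced here, the sketch assumes the classic Braess network: edges <math>s \to v</math> and <math>w \to t</math> with latency equal to their flow, edges <math>v \to t</math> and <math>s \to w</math> with constant latency 1, and a dashed edge <math>v \to w</math> with latency 0. The node names <math>v, w</math> and the function names are assumptions of this sketch.

```python
# Assumed topology (classic Braess network, not taken from the missing figure):
#   s->v latency = flow, v->t latency = 1,
#   s->w latency = 1,    w->t latency = flow,
#   dashed edge v->w with latency 0.

def path_latencies(top, bottom, zigzag):
    """Latency of each s-t path, given the flow on
    top (s-v-t), bottom (s-w-t) and zigzag (s-v-w-t)."""
    flow_sv = top + zigzag      # edge s->v is shared by top and zigzag
    flow_wt = bottom + zigzag   # edge w->t is shared by bottom and zigzag
    return {
        "top": flow_sv + 1,
        "bottom": 1 + flow_wt,
        "zigzag": flow_sv + 0 + flow_wt,
    }


def total_cost(top, bottom, zigzag):
    """Social welfare w(f): flow-weighted sum of path latencies."""
    latency = path_latencies(top, bottom, zigzag)
    return (
        top * latency["top"]
        + bottom * latency["bottom"]
        + zigzag * latency["zigzag"]
    )
```

Under these assumptions, `total_cost(0.5, 0.5, 0)` is 1.5 and `total_cost(0, 0, 1)` is 2. Once the dashed edge exists, the even split is no longer an equilibrium, because `path_latencies(0.5, 0.5, 0)` reports a zigzag latency of 1, below the 1.5 of the other two routes.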

Generalized routing problem

The routing problem introduced with Braess' paradox can be generalized to many different flows traversing the same graph at the same time.

Definition (Generalized flow). Let <math>G=(V, E)</math>, <math>L</math> and <math>w</math> be as defined above, and suppose that we want to route the quantities <math>R = \{ r_1, r_2, \dots, r_k \; | \; r_i > 0\}</math> between the corresponding pairs of nodes in <math>\Gamma = \{(s_1,t_1), (s_2,t_2), \dots, (s_k,t_k) \} \subseteq (V \times V)</math>, i.e., <math>r_i</math> units from <math>s_i</math> to <math>t_i</math>. A flow <math>f_{\Gamma, R}</math> is defined as an assignment <math>p \mapsto f_p</math> of a real, nonnegative number to each path <math>p</math> going from <math>s_i</math> to <math>t_i</math>, for each <math>(s_i,t_i) \in \Gamma</math>, with the constraint that

<math>\sum_{p: \, s_i \rightarrow t_i}{f_p} = r_i \; \; \forall (s_i,t_i) \in \Gamma</math>.

The flow traversing a specific edge of <math>G</math> is defined as

<math>f_{e,\Gamma, R}=\sum_{p: \, e \in p}{f_p}</math>.

For succinctness, we write <math>f_e</math> when <math>\Gamma,R</math> are clear from context.

Definition (Nash-equilibrium flow). A flow <math>f_{\Gamma, R}</math> is a Nash-equilibrium flow iff <math>\forall (s_i, t_i) \in \Gamma</math> and <math>\forall p, q</math> from <math>s_i</math> to <math>t_i</math>,

<math>f_{p}>0 \Rightarrow \sum_{e \in p}{l_e(f_e)} \leq \sum_{e \in q}{l_e(f_e)}</math>.

This definition is closely related to what we said about the support of mixed-strategy Nash equilibria in normal-form games.
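The equilibrium condition can be checked mechanically for a given path flow. The sketch below represents each path as a tuple of edge names and each latency as a Python function; this representation, the tolerance `tol`, and the two-link test network are assumptions of the example.

```python
def edge_flows(path_flows):
    """f_e: sum of f_p over all paths p that contain edge e."""
    flows = {}
    for path, amount in path_flows.items():
        for edge in path:
            flows[edge] = flows.get(edge, 0.0) + amount
    return flows


def is_nash_flow(paths_by_pair, path_flows, latency, tol=1e-9):
    """Check that every used path is a cheapest path for its (s_i, t_i) pair.

    paths_by_pair: one list of candidate paths per source-destination pair.
    path_flows:    dict mapping a path (tuple of edge names) to its flow f_p.
    latency:       dict mapping an edge name to its latency function l_e.
    """
    f = edge_flows(path_flows)

    def path_latency(path):
        return sum(latency[e](f.get(e, 0.0)) for e in path)

    for paths in paths_by_pair:
        cheapest = min(path_latency(p) for p in paths)
        for p in paths:
            if path_flows.get(p, 0.0) > 0 and path_latency(p) > cheapest + tol:
                return False
    return True


# Hypothetical network: two parallel s-t links, one with latency x, one with latency 1.
latency = {"upper": lambda x: x, "lower": lambda x: 1.0}
paths = [[("upper",), ("lower",)]]
```

On this network, sending the whole unit of flow on `upper` is an equilibrium (both links then cost 1), whereas an even split is not, since the `lower` link is used although `upper` is strictly cheaper.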

Definition (Conditional welfare of a flow). Let <math>f_{\Gamma, R}</math> and <math>f_{\Gamma, R}^{*}</math> be two flows in <math>G</math> associated with the same sets <math>\Gamma</math> and <math>R</math>. In what follows, we will drop the subscript to make the notation clearer. Fix the latencies induced by <math>f</math> on the graph: the conditional welfare of <math>f^{*}</math> with respect to <math>f</math> is defined as

<math>w^{f}(f^{*}) = \sum_{e \in E}{f^{*}_e \cdot l_{e}(f_{e})}</math>.

Fact 1. Given a Nash-equilibrium flow <math>f</math> and any other flow <math>f^{*}</math>, <math>w(f) = w^{f}(f) \leq w^{f}(f^{*})</math>.

Proof (By contradiction). Assume that <math>w^{f}(f^{*}) < w^{f}(f)</math>. By definition,

<math>\sum_{i=1}^{k} \sum_{p: s_i \rightarrow t_i} f_p^{*} \cdot \sum_{e \in p} l_e(f_e) < \sum_{i=1}^{k} \sum_{p: s_i \rightarrow t_i} f_p \cdot \sum_{e \in p} l_e(f_e)</math>.

Since <math>f</math> and <math>f^{*}</math> are associated with the same sets <math>\Gamma, R</math>, we know that

<math>\sum_{p: s_i \rightarrow t_i}f_p = \sum_{p: s_i \rightarrow t_i} f_p^{*} = r_i \; \; \forall i</math>.

Therefore, there must be a pair <math>(s_i, t_i)</math> and two paths <math>p, q</math> from <math>s_i</math> to <math>t_i</math> such that <math>f_p^{*} > f_p</math>, <math>f_q^{*} < f_q</math>, and

<math>\sum_{e \in p}l_e(f_e) < \sum_{e \in q}l_e(f_e)</math>.

In other words, the flow <math>f^{*}</math> can achieve a lower conditional welfare than <math>f</math> only if there are two paths from <math>s_i</math> to <math>t_i</math> having different costs under the latencies induced by <math>f</math>, and if <math>f^{*}</math> reroutes some flow of <math>f</math> from the higher-cost path <math>q</math> to the lower-cost path <math>p</math>. But <math>f_q > f_q^{*} \geq 0</math> means that <math>f</math> uses <math>q</math> although <math>p</math> is strictly cheaper, which is incompatible with the assumption that <math>f</math> is a Nash-equilibrium flow. Q.E.D.

Note that Fact 1 does not assume any particular structure on the set <math>L</math>.

Fact 2. Given any two real numbers <math>x</math> and <math>y</math>, <math>x \cdot y \leq x^2 + y^{2}/4</math>.

Proof. This is another way to express the true inequality <math>(x-y/2)^2 \geq 0</math>. Q.E.D.

Theorem. The pure PoA of any generalized routing problem <math>(G, L)</math> with linear latencies is <math>\leq 4/3</math>.

Proof. Note that this theorem is equivalent to saying that for each Nash-equilibrium flow <math>f</math>, <math>w(f) \leq (4/3) \cdot \min_{f^{*}} \{ w(f^{*}) \}</math>, where the minimum ranges over all flows <math>f^{*}</math> associated with the same <math>\Gamma</math> and <math>R</math>. By definition,

<math>w^{f}(f^{*}) = \sum_{e \in E} f_e^{*}(a_e \cdot f_e + b_e)</math>
<math>= \sum_{e}(a_{e}f_{e}f_{e}^{*}) + \sum_{e \in E}f_e^{*}b_e</math>.

By using Fact 2, we have that

<math>w^{f}(f^{*}) \leq \sum_{e \in E} a_e \cdot \left( (f_e^{*})^2 + (f_e)^{2}/4 \right) + \sum_{e \in E} f_e^{*} \cdot b_e</math>
<math>= \sum_{e \in E} \left( a_e(f_e^{*})^2 + f_e^{*}b_e \right) + \sum_{e \in E} a_{e}(f_e)^{2}/4</math>
<math>\leq w(f^{*}) + \frac{w(f)}{4}</math>,

where the last step holds because the first sum is exactly <math>w(f^{*})</math> and

<math>(1/4) \cdot w(f) = (1/4) \cdot \sum_{e \in E}f_e(a_{e}f_{e}+b_{e})</math>
<math>= (1/4) \cdot \sum_{e \in E}a_{e}(f_{e})^2 + \underbrace{(1/4) \cdot \sum_{e \in E}f_{e}b_{e}}_{\geq 0}</math>.

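We can conclude that <math>w^{f}(f^{*}) \leq w(f^{*}) + w(f)/4</math>. Combined with Fact 1, this gives <math>w(f) \leq w^{f}(f^{*}) \leq w(f^{*}) + w(f)/4</math>, i.e., <math>w(f) \leq (4/3) \cdot w(f^{*})</math>, which proves the theorem. Q.E.D.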
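The chain of inequalities in the proof can be checked exactly on a tiny instance. The sketch assumes a hypothetical network with two parallel links from <math>s</math> to <math>t</math>, with latencies <math>x</math> and 1, and one unit of flow; on this network the bound 4/3 is attained. The names `EDGES`, `nash` and `opt` are illustrative.

```python
from fractions import Fraction

# Hypothetical two-link network: edge name -> (a_e, b_e), latency a_e * x + b_e.
EDGES = {"upper": (1, 0), "lower": (0, 1)}


def conditional_cost(f_star, f):
    """w^f(f*): cost of f* when latencies are frozen at the values induced by f."""
    return sum(f_star[e] * (a * f[e] + b) for e, (a, b) in EDGES.items())


def cost(f):
    """w(f) = w^f(f)."""
    return conditional_cost(f, f)


# Equilibrium: all flow on the upper link (both links then have latency 1).
nash = {"upper": Fraction(1), "lower": Fraction(0)}
# Even split, which minimizes x^2 + (1 - x) over x in [0, 1].
opt = {"upper": Fraction(1, 2), "lower": Fraction(1, 2)}

poa_on_this_instance = cost(nash) / cost(opt)
```

Here `cost(nash)` is 1, `conditional_cost(opt, nash)` is 1 and `cost(opt)` is 3/4, so both inequalities of the proof hold with equality and the ratio is exactly 4/3.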

Note that in the proof we have made extensive use of the assumption that the functions in <math>L</math> are linear. Actually, a more general fact holds.

Theorem. Given a generalized routing problem with graph <math>G</math> and nonnegative, nondecreasing, polynomial latency functions of degree <math>d</math>, the pure PoA is <math>\leq d+1</math>.

Note that the PoA can grow with <math>d</math>. Consider the example shown in the following figure, where we assume unit flow: the Nash-equilibrium flows have social welfare 1; however, a much better welfare is achieved when a fraction <math>x=1-1/{\sqrt{d+1}}</math> of the flow uses the edge with latency <math>x^d</math> and the remaining <math>1/\sqrt{d+1}</math> uses the edge with latency 1, in which case

<math>w = \left( 1-\frac{1}{\sqrt{d+1}} \right)^d \cdot \left( 1-\frac{1}{\sqrt{d+1}} \right) + 1 \cdot \frac{1}{\sqrt{d+1}}</math>
<math>=\left(\left( 1-\frac{1}{\sqrt{d+1}} \right)^{\sqrt{d+1}}\right)^{\sqrt{d+1}}+\frac{1}{\sqrt{d+1}}</math>
<math>\leq e^{-\sqrt{d+1}} + \frac{1}{\sqrt{d+1}}</math>.

This quantity tends to zero when <math>d</math> tends to infinity, so the ratio between the equilibrium welfare and the optimal welfare grows without bound as <math>d</math> increases.
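A quick numerical check of this computation is sketched below. It assumes the network suggested by the formula, namely two parallel links from <math>s</math> to <math>t</math> with latencies <math>x^d</math> and 1; the function names are illustrative.

```python
import math


def cost(x, d):
    """Total cost when x units use the link with latency x**d
    and the remaining 1 - x units use the link with latency 1."""
    return x * x**d + (1 - x) * 1


def split_from_notes(d):
    """The split x = 1 - 1/sqrt(d + 1) used in the text."""
    return 1 - 1 / math.sqrt(d + 1)


def upper_bound(d):
    """The closed-form bound exp(-sqrt(d + 1)) + 1/sqrt(d + 1)."""
    return math.exp(-math.sqrt(d + 1)) + 1 / math.sqrt(d + 1)


def ratio(d):
    """Equilibrium cost (all flow on the x**d link, cost 1) divided by the
    cost of the split above; a lower bound on the PoA of this instance."""
    return cost(1.0, d) / cost(split_from_notes(d), d)
```

Evaluating `ratio(d)` for increasing values such as 3, 15 and 99 shows the gap widening, roughly like <math>\sqrt{d+1}</math>, while staying below the <math>d+1</math> bound of the theorem.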


The production of this material was supported in part by NSF award SES-0734780.