Mean payoff game
In game theory, a mean payoff game is a zero-sum game played on the vertices of a weighted directed graph. The game is played as follows: at the start of the game, a token is placed on one of the vertices of the graph. Each vertex is assigned to either the Maximizer of the Minimizer. The player that controls the current vertex the token is on may choose one outgoing edge along which the token moves next. In doing so, the Minimizer pays the Maximizer the number that is on the edge. Then, again, the player controlling the next vertex the token gets can choose where it goes, and this continues indefinitely. The objective for the Maximizer is to maximize their long term average payoff, and the Minimizer has the opposite objective.
Formal definition
A mean payoff game consists of a graph, and a function where is the set of vertices, which are partitioned between the players, and where is the weight of an edge. Often, the graph is assumed to be sinkless, which means that every vertex has at least one outgoing edge. A play is a possible outcome of the game, which is an infinite walk on the graph, we could write this as a sequence of edges: where the head of equals the tail of. The objective value of the game can then be written as follows:A strategy for the Maximizer is a function, where is the set of finite walks that start at the initial vertex and end at some vertex, which returns an outgoing edge of the end vertex. A strategy for the Minimizer can be defined analogously. If both players fix a strategy, say they pick strategies and, then the outcome of the game is fixed, and the resulting play is the path.
One of the fundamental results for mean payoff games is that they are positionally determined. This means in our case that the game has a unique value, and that each player has a strategy that can attain the value, and that strategy is positional, e.g. it only depends on the current vertex the token is on. In formulas, the following equation holds for the value :
Solving mean payoff games
Solving a mean payoff game can mean several things, although in practice finding one often also yields the other:- Determining the value of one or all starting vertices
- Determining the optimal strategy for one or both players
- Determining the set of starting vertices from which the Maximizer can guarantee a value of at least 0. Doing so is called finding the zero-mean partition
Three of the most well-known algorithms for solving mean payoff games are the following :
- The GKK algorithm. Its main idea is to add a potential to every vertex in the graph, and slowly increase the potential on some nodes until we find the zero-mean partition. This also gives us the energy values in an energy game.
- Value iteration. Its main idea is to give each vertex a value, and update the values via fixed point iteration. When the fixed point is reached, the zero-mean partition and energy values can be found.
- Strategy improvement. Its main idea is to start with an arbitrary Maximizer strategy, and assign it a valuation. Then, by repeatedly making changes to the strategy that improve its valuation, we end up with a strategy for the Maximizer than guarantees nonzero payoff wherever it is possible.