## Graduate Theses and Dissertations

#### Title

Algorithms for simple stochastic games

2009

Thesis

M.S.C.S.

Computer Science

#### Major Professor

Rahul Tripathi, Ph.D.

#### Keywords

Game theory, Optimal strategies, Algorithms, Computational complexity, Computational equilibrium

#### Abstract

A simple stochastic game (SSG) is a game defined on a directed multigraph and played between players MAX and MIN. Both players have control over disjoint subsets of vertices: player MAX controls a subset V[subfield MAX] and player MIN controls a subset V[subfield MIN] of vertices. The remaining vertices fall into either V[subfield AVE], a subset of vertices that support stochastic transitions, or SINK, a subset of vertices that have zero outdegree and are associated with a payo in the range [0; 1]. The game starts by placing a token on a designated start vertex. The token is moved from its current vertex position to a neighboring one according to certain rules. A fixed strategy s of player MAX determines where to place the token when the token is at a vertex of V[subfield MAX]. Likewise, a t strategy of player MIN determines where to place the token when the token is at a vertex of V[subfield MIN] .

When the token is at a vertex of V[subfield AVE, the token is moved to a uniformly at random chosen neighbor. The game stops when the token arrives on a SINK vertex; at this point, player MAX gets the payo associated with the SINK vertex. A fundamental question related to SSGs is the SSG value problem: Given a SSG G, is there a strategy of player MAX that gives him an expected payoff at least 1/2 regardless of the strategy of player MIN? This problem is among the rare natural combinatorial problems that belong to the class NP ? coNP but for which there is no known polynomial-time algorithm. In this thesis, we survey known algorithms for the SSG value problem and characterize them into four groups of algorithms: iterative approximation, strategy improvement, mathematical programming, and randomized algorithms.

We obtain two new algorithmic results: Our first result is an improved worst-case, upper bound on the number of iterations required by the Homan-Karp strategy improvement algorithm. Our second result is a randomized Las Vegas strategy improvement algorithm whose expected running time is O(2 [superscript 0:78n]).

COinS