Decentralized partially observable Markov decision process
The decentralized partially observable Markov decision process (Dec-POMDP) is a model for coordination and decision-making among multiple agents. It is a probabilistic model that can account for uncertainty in outcomes, sensors, and communication.
It generalizes the Markov decision process (MDP) and the partially observable Markov decision process (POMDP) to settings with multiple decentralized agents.
Definition
Formal definition
A Dec-POMDP is a 7-tuple (S, {Aᵢ}, T, R, {Ωᵢ}, O, γ), where
- S is a set of states,
- Aᵢ is a set of actions for agent i, with A = ×ᵢ Aᵢ the set of joint actions,
- T is a set of conditional transition probabilities between states, T(s′ | s, a),
- R : S × A → ℝ is the reward function,
- Ωᵢ is a set of observations for agent i, with Ω = ×ᵢ Ωᵢ the set of joint observations,
- O is a set of conditional observation probabilities, O(o | s′, a), and
- γ ∈ [0, 1] is the discount factor.
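The 7-tuple above can be represented directly as a data structure. The following is a minimal sketch in Python; the class name, field layout, and the toy two-agent problem (states, 'push'/'wait' actions, 'bright'/'dark' observations) are illustrative assumptions, not part of the formal model.

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class DecPOMDP:
    """Illustrative container for the 7-tuple (S, {A_i}, T, R, {Omega_i}, O, gamma)."""
    states: tuple          # S: set of states
    actions: tuple         # (A_1, ..., A_n): one action set per agent
    transition: dict       # T: (s, joint_action) -> {s': P(s' | s, a)}
    reward: dict           # R: (s, joint_action) -> shared scalar reward
    observations: tuple    # (Omega_1, ..., Omega_n): one observation set per agent
    observation_fn: dict   # O: (joint_action, s') -> {joint_obs: P(o | s', a)}
    gamma: float           # discount factor in [0, 1]

    def joint_actions(self):
        # A = A_1 x ... x A_n
        return list(product(*self.actions))

    def joint_observations(self):
        # Omega = Omega_1 x ... x Omega_n
        return list(product(*self.observations))

# Toy instance (hypothetical): two agents, two states.
# The state flips from s0 to s1 only when both agents choose 'push'.
states = ('s0', 's1')
acts = (('push', 'wait'), ('push', 'wait'))
joint = list(product(*acts))

transition = {
    (s, a): ({'s1' if s == 's0' else 's0': 1.0}
             if a == ('push', 'push') else {s: 1.0})
    for s in states for a in joint
}
reward = {(s, a): (1.0 if s == 's1' and a == ('push', 'push') else 0.0)
          for s in states for a in joint}
obs = (('bright', 'dark'), ('bright', 'dark'))
# Both agents deterministically observe 'bright' in s1 and 'dark' in s0.
observation_fn = {
    (a, s2): {(('bright', 'bright') if s2 == 's1' else ('dark', 'dark')): 1.0}
    for a in joint for s2 in states
}

model = DecPOMDP(states, acts, transition, reward, obs, observation_fn, gamma=0.9)
```

Note that, unlike a POMDP, each agent only receives its own component of the joint observation, so policies must map individual observation histories to individual actions.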