Over the last few decades, Markov decision processes (MDPs) have served as the standard semantics for optimal planning by decision-theoretic agents in stochastic environments. Factored MDPs represent large MDPs compactly by exploiting their internal structure. However, even when a large MDP admits a compact factored representation, solving it exactly remains intractable. In this project, we propose a semi-infinite programming method for solving factored MDPs. A central element of our method is a novel linearization technique, which reformulates an integer program (IP) into a provably equivalent mixed-integer linear program (MILP) of small size. Our numerical experiments so far indicate that we can solve problems several times larger than those handled by the state of the art. In addition, we account for uncertainty in the transition kernel of factored MDPs and apply the same method to the resulting robust problem. Numerical results show that our proposed method remains effective for robust factored MDPs.
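The abstract does not specify the linearization technique itself. As a generic illustration of the kind of exact reformulation such IP-to-MILP conversions rely on, the sketch below checks the standard big-M linearization of a bilinear term w = x·z between a binary variable x and a bounded continuous variable z; the function name and test values are illustrative assumptions, not the speaker's actual formulation.

```python
# Generic illustration: exact big-M linearization of w = x * z,
# where x is binary and 0 <= z <= M. The four linear constraints
#     w <= M * x,   w <= z,   w >= z - M * (1 - x),   w >= 0
# force w = x * z for every feasible (x, z), so the bilinear term can be
# replaced by linear constraints with no approximation error.

def feasible_w_interval(x: int, z: float, M: float) -> tuple[float, float]:
    """Return the interval of w values satisfying the four constraints."""
    assert x in (0, 1) and 0.0 <= z <= M
    lower = max(0.0, z - M * (1 - x))
    upper = min(M * x, z)
    return lower, upper

if __name__ == "__main__":
    M = 10.0
    for x in (0, 1):
        for z in (0.0, 2.5, 7.0, 10.0):
            lo, hi = feasible_w_interval(x, z, M)
            # The feasible interval collapses to the single point x * z.
            assert lo == hi == x * z, (x, z, lo, hi)
    print("big-M linearization is exact on all test points")
```

Because the interval of feasible w values collapses to a point for every binary x, the reformulation is equivalent rather than a relaxation, which is the property an "exact, small-size MILP" claim hinges on.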
May 12th, 2021
9:00 ~ 11:00
Huikang Liu, Imperial College London
Tencent Meeting ID: 915 827 775