[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Some references on factored MDP approaches
- To: Rao Kambhampati <rao@asu.edu>
- Subject: Some references on factored MDP approaches
- From: Subbarao Kambhampati <rao@asu.edu>
- Date: Mon, 1 Oct 2012 20:25:03 -0700
- Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:from:date:x-google-sender-auth:message-id :subject:to:content-type; bh=J1zOk5BV71pRDhhQ6MV8HyWquWtFqrcld5N9tppn/TY=; b=UZuK2yNh+QOP/muIutmKKN7sddIeOIH6dTkaQzI0nzc+XvCqdTbgkeNGTefca4SVDf pF3EboeRqTG1GiiUbttu1n/2/hk+HF2BHFkVCmuxju+CL2eXqzxIfVOUs/6kbrjmXBXE A+kzs8cAJBi9c1+1rABJXMCBLlTuasMG/UEV1FeV0wmqH4W+0wwxGx1X0mWv2vxdIolY jCxuaKAj62xM51qDZ5epXgRXfXeLWhOxsy7+hbf2TWTTlKs9FmU/E9K/uM+jsNu8ytTh m0q2JCkuwMzLEaLtUhmGOtRLznNxQYkinPnKI0Y08PeOc2fegkL5SRif1ou3WHsS+pdc 3u/g==
- Sender: subbarao2z2@gmail.com
Folks:
As promised, here are references on factored MDP approaches:
1. ADDs: Scott Sanner has done a lot of recent work on using ADDs. Here is a link to a 2011 tutorial he gave on the state of the art in ADD-based approaches to MDPs
This is probably the best place to start if you are interested in getting the background on decision-diagram manipulation routines. He also talks about several variants of ADDs (including Affine ADDs)
The specific system that I mentioned in the class--SPUDD- is available at the following link. The page gives both the
1999 paper which describes how to directly do value iteration over ADDs, and their code for running SPUDD.
2. Additive representations: Here is a paper (that won the JAIR 5-year best-paper award) that describes a additive feature value function representation (they use the term "basis functions" for features) for MDPs (and approximate linear programming approaches for doing policy computation):
==Extension of 1 and 2:
Extension of 1: First order representations: Here is a paper that uses first-order logic representations for states and value functions (and does Bellman updates in terms of regression functions)
and if you like the above, then consider the first-order decision diagrams (which combines 1 and 2 above):
Extention of 2: Finally, here is another quixotic effort called "protovalue functions" that aims to automatically learn basis functions (read "Features") that are tailored to the given problem (as against going with one high level representation across all problems:
cheers
Rao