Lecture Notes for CSE571 (F12)
Lecture notes:

 L1 Lecture of [Aug 27, 2012]
Introduction
 L2 Lecture of [Aug 29, 2012]
A bit more introduction. Start of MDPs.

 L3 Lecture of [Sep 5, 2012]
Solving finite horizon MDPs
 L4 Lecture of [Sep 10, 2012]
Infinite Horizon MDPs and Value Iteration
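As a concrete illustration of value iteration, here is a minimal Python sketch on a tiny made-up MDP (the states `s0`/`s1`, transition probabilities, and rewards below are invented for illustration, not taken from the lecture):

```python
# Value-iteration sketch. P[s][a] is a list of (probability, next_state, reward)
# triples; the MDP itself is made up for illustration.
P = {
    's0': {'a': [(0.8, 's1', 0.0), (0.2, 's0', 0.0)],
           'b': [(1.0, 's0', 0.1)]},
    's1': {'a': [(1.0, 's1', 1.0)]},
}
gamma = 0.9  # discount factor

V = {s: 0.0 for s in P}
for _ in range(200):  # repeated Bellman backups converge to the fixed point
    V = {s: max(sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in P[s].values())
         for s in P}
```

Since s1 earns reward 1.0 forever, its value converges to 1/(1 - gamma) = 10; s0's value reflects the discounted chance of reaching it.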
 L5 Lecture of [Sep 14, 2012] (makeup for Sep 12)
Policy iteration for Infinite Horizon MDPs, why it works. Indefinite
Horizon MDPs. Stochastic shortest path MDPs, understanding them as
general cases of infinite horizon and finite horizon MDPs.
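A minimal sketch of policy iteration, on a made-up two-state MDP (names and numbers invented for illustration): evaluate the current policy, then greedily improve it, stopping when the policy is stable.

```python
# Policy-iteration sketch; the tiny MDP is made up for illustration.
# P[s][a] = list of (probability, next_state, reward) triples.
P = {
    's0': {'stay': [(1.0, 's0', 0.0)], 'go': [(1.0, 's1', 0.0)]},
    's1': {'stay': [(1.0, 's1', 1.0)]},
}
gamma = 0.9
pi = {'s0': 'stay', 's1': 'stay'}  # deliberately bad initial policy

while True:
    # Policy evaluation (iterative, instead of solving the linear system).
    V = {s: 0.0 for s in P}
    for _ in range(500):
        V = {s: sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][pi[s]])
             for s in P}
    # Policy improvement: act greedily with respect to V.
    new_pi = {s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2])
                                             for p, s2, r in P[s][a]))
              for s in P}
    if new_pi == pi:  # policy stable => optimal
        break
    pi = new_pi
```

One improvement step already flips s0's action to 'go', and the second evaluation confirms the policy is stable.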

 L6 Lecture of [Sep 17, 2012]
Start of Reinforcement Learning. Dimensions of variation. Passive
RL. Active RL. The exploration/exploitation tradeoff.
 L7 Lecture of [Sep 19, 2012]
RL contd. Exploration functions. Discussion on model-free vs. model-based
learning. Passive model-free learning. Temporal difference
learning. Comparing TD to Monte Carlo. Understanding the
spectrum of DP vs. sample-based and deep vs. shallow
dimensions of RL backups. A long discussion on how RL
methods can also be seen as methods for approximately
solving MDPs with *known* models. Active model-free
learning, and the importance of Q-values. Q-learning.
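The tabular Q-learning update can be sketched in a few lines of Python; the four-state chain environment below is made up for illustration (state 3 is the goal and yields reward 1):

```python
import random

random.seed(0)
ACTIONS = ('left', 'right')

def step(s, a):
    """Made-up 4-state chain: 'right' moves toward goal state 3, 'left' back."""
    s2 = min(s + 1, 3) if a == 'right' else max(s - 1, 0)
    return s2, (1.0 if s2 == 3 else 0.0), s2 == 3  # next state, reward, done

alpha, gamma, eps = 0.5, 0.9, 0.2
Q = {(s, a): 0.0 for s in range(4) for a in ACTIONS}
for _ in range(500):  # episodes
    s, done = 0, False
    while not done:
        if random.random() < eps:  # epsilon-greedy exploration
            a = random.choice(ACTIONS)
        else:  # greedy action, ties broken at random
            a = max(ACTIONS, key=lambda act: (Q[(s, act)], random.random()))
        s2, r, done = step(s, a)
        # TD update toward the bootstrapped target r + gamma * max_b Q(s', b)
        target = r if done else r + gamma * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2
```

Each update moves Q(s, a) toward a sampled one-step backup, which is exactly the TD idea applied to Q-values rather than state values.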

 L8 Lecture of [Sep 24, 2012]
Overview of directions for efficiently solving large-scale MDPs (for
30 min). LRTDP.
 L9 Lecture of [Sep 26, 2012]
Discussion on the "all-outcome determinization" heuristic for
LRTDP. The UCT approach and where and when it
works. Determinization-based approaches and FF-Replan.
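The selection rule at the heart of UCT is UCB1: pick the child maximizing average reward plus an exploration bonus. A minimal bandit-only sketch of that rule (the two-armed bandit and its payout probabilities are made up for illustration):

```python
import math
import random

random.seed(1)

def pull(arm):
    """Made-up Bernoulli bandit: arm 0 pays off 70% of the time, arm 1 40%."""
    return 1.0 if random.random() < (0.7 if arm == 0 else 0.4) else 0.0

counts = [0, 0]    # times each arm was played
sums = [0.0, 0.0]  # total reward collected per arm
for t in range(1, 2001):
    if 0 in counts:  # play every arm once before applying the rule
        arm = counts.index(0)
    else:  # UCB1: empirical mean + sqrt(2 ln t / n) exploration bonus
        arm = max((0, 1), key=lambda a: sums[a] / counts[a]
                  + math.sqrt(2 * math.log(t) / counts[a]))
    sums[arm] += pull(arm)
    counts[arm] += 1
```

UCT applies this same rule at every node of a sampled lookahead tree, so play concentrates on the better arm while the bonus keeps the worse one from being abandoned too early.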
 L10 Lecture of [Oct 1st, 2012]
Discussion of FF-HOP; discussion of factored MDP approaches.

 L11 Lecture of [Oct 3rd, 2012]
Discussion of the dynamics, goal-type, and observability
dimensions. Discussion that decision-making is hardest with
partial knowledge of dynamics and/or observability. The belief
state representation. The
observation model and how it filters the belief state during
execution, and *partitions* it during search. How plans are
sequences with no observability and conditional with partial
observability or stochastic actions. How partial observability
brings to the fore the problem of state estimation (and the
related problems of prediction and smoothing).
 L12 Lecture of [Oct 8th, 2012]
Recap of the discussion on handling nondeterministic actions and
partial observability in terms of belief states. Formal start of
POMDP model.
 L13 Lecture of [Oct 10th, 2012]
POMDP model. Acting with it. Deriving its transfer
function. Approximate methods (based on online lookup) for
POMDPs. POMDP relaxations. A quick overview of the POMDP value
iteration algorithm and where its horrendous complexity
comes from.
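The belief-state update behind all of this (predict through the transition model, then weight by the observation likelihood and renormalize) can be sketched directly; the two-state transition and observation tables below are made up for illustration:

```python
# Belief-update (Bayes filtering) sketch on a made-up two-state POMDP,
# for one fixed action. T[s][s2] = transition probability, O[s2][o] =
# observation likelihood.
T = {'s0': {'s0': 0.3, 's1': 0.7},
     's1': {'s0': 0.0, 's1': 1.0}}
O = {'s0': {'beep': 0.1, 'quiet': 0.9},
     's1': {'beep': 0.8, 'quiet': 0.2}}

def update(belief, obs):
    # Prediction step: push the belief through the transition model.
    predicted = {s2: sum(belief[s] * T[s][s2] for s in belief) for s2 in T}
    # Filtering step: weight by the observation likelihood, renormalize.
    unnorm = {s2: O[s2][obs] * predicted[s2] for s2 in T}
    z = sum(unnorm.values())
    return {s2: v / z for s2, v in unnorm.items()}

b = update({'s0': 0.5, 's1': 0.5}, 'beep')
```

Starting from a uniform belief, a 'beep' observation pushes nearly all of the probability mass onto s1.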
 L14 Lecture of [Oct 17th, 2012]
Discussion of finite-horizon POMDP Value Iteration. Factored methods
for belief-space planning.

L15 Lecture of [Oct 22nd, 2012]
Short discussion of nondeterministic propositional conditional
planning (15 min), followed by the start of constraint
satisfaction problems.

L16 Lecture of [Oct 24th, 2012]
Quick overview of CSP search algorithms. Backtracking search. Variable
and value ordering heuristics. Forward
checking. Explanation-based learning (of no-goods) and
intelligent backtracking. Iterative (local search)
algorithms for CSPs.
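Backtracking search with forward checking can be sketched compactly; the three-variable map-coloring CSP below is made up for illustration:

```python
# Backtracking + forward-checking sketch on a made-up map-coloring CSP:
# every pair of the three regions is adjacent, so all must differ.
neighbors = {'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B']}
colors = ['red', 'green', 'blue']

def solve(assignment, domains):
    if len(assignment) == len(neighbors):
        return assignment
    var = next(v for v in neighbors if v not in assignment)  # static order
    for val in domains[var]:
        # Forward checking: prune val from each unassigned neighbor's domain.
        pruned = {n: [c for c in domains[n] if c != val]
                  for n in neighbors[var] if n not in assignment}
        if all(pruned[n] for n in pruned):  # no neighbor's domain wiped out
            new_domains = {**domains, **pruned, var: [val]}
            result = solve({**assignment, var: val}, new_domains)
            if result:
                return result
    return None  # backtrack

sol = solve({}, {v: colors[:] for v in neighbors})
```

The pruning after each assignment is what lets a wipe-out be detected before the search ever descends into the doomed subtree.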

L17 Lecture of [Oct 29th, 2012]
Preprocessing of CSP problems. Consistency enforcement. K-consistency;
strong K-consistency; directional
K-consistency. Connection between induced tree
width and directional K-consistency.
 L18 Lecture of [Oct 31st, 2012]
A bit more on directional consistency and induced width. Main topic:
compiling bounded-length planning problems into SAT
problems.


L19 Lecture of [Nov 7, 2012]
Motivating LDA in terms of the need to keep up with Zhou Dynasty
elementary school kids. A short, fast-paced introduction to
probabilistic graphical models, with specific emphasis on
Bayes nets.

L20 Lecture of [Nov 14, 2012]
Inference in Bayes Nets; exact vs. approximate inference. Approximate
inference based on sampling from the network. Rejection
sampling vs. likelihood weighting. Markov Chain Monte
Carlo and Gibbs sampling.
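Likelihood weighting, for instance, fixes the evidence variables and weights each sample by the probability of the evidence given the sampled values. A minimal sketch on a made-up two-node network (Rain → WetGrass, with invented CPT numbers):

```python
import random

random.seed(0)
# Made-up network: P(Rain) = 0.3, P(WetGrass | Rain) = 0.9, P(WetGrass | ~Rain) = 0.1.
# We estimate P(Rain | WetGrass = true) by likelihood weighting.
p_rain = 0.3
p_wet_given_rain = {True: 0.9, False: 0.1}

num = den = 0.0
for _ in range(20000):
    rain = random.random() < p_rain   # sample the non-evidence variable
    w = p_wet_given_rain[rain]        # weight = P(evidence | sampled parents)
    den += w
    if rain:
        num += w
estimate = num / den  # weighted fraction of samples with Rain = true
```

The weighted estimate converges to the exact posterior 0.27/0.34 ≈ 0.79, without discarding any samples the way rejection sampling would.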

L21 Lecture of [Nov 19, 2012]
Learning in Bayes Nets. Bayesian learning as just inference of the
posterior over the hypothesis distribution. MAP and ML as
approximations of Bayes learning. Application of ML to
Bayes Net learning with known topology and complete data.
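With known topology and complete data, the ML estimate of each CPT entry reduces to a conditional frequency count. A minimal sketch on a made-up two-variable dataset:

```python
# ML estimation sketch for one CPT of a made-up Cloudy -> Rain network,
# from complete data: the MLE is just a conditional frequency count.
data = [
    (True, True), (True, True), (True, False),
    (False, False), (False, False), (False, True),
]  # made-up (cloudy, rain) observations

def mle_p_rain_given(cloudy):
    rows = [rain for c, rain in data if c == cloudy]
    return sum(rows) / len(rows)  # count(rain, cloudy) / count(cloudy)
```

No search or optimization is needed; the counting is the whole algorithm, which is why the incomplete-data case (next lecture) is so much harder.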

L22 Lecture of [Nov 21, 2012]
Bayes Net learning in the case of unknown topology and in the case of
incomplete data.

L23 Lecture of [Nov 26, 2012]
Back to Bayes learning. Understanding the challenges of Bayes learning
when the hypothesis space is large (or
continuous). Understanding the need for conjugate
distributions. Understanding the Beta distribution as a 2D
version of the Dirichlet (or the Dirichlet as the K-dimensional
version of the Beta). Understanding how the Beta prior looks for
different hyperparameters, and what sort of a constraint
it puts on the priors you can express. Computing the Bayes
posterior and understanding the graphical-model version of
Bayes learning.
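The conjugacy point can be made concrete: with a Beta prior on a coin's bias and Bernoulli data, the posterior is again a Beta, obtained by just adding the observed counts to the hyperparameters. A minimal sketch (the prior and the data are invented for illustration):

```python
# Beta-Bernoulli conjugacy sketch: a Beta(a, b) prior updated with h heads
# and t tails gives the posterior Beta(a + h, b + t) -- no integration needed.
def beta_posterior(a, b, heads, tails):
    return a + heads, b + tails

def beta_mean(a, b):
    return a / (a + b)  # mean of a Beta(a, b) distribution

a2, b2 = beta_posterior(2, 2, heads=8, tails=2)  # Beta(2,2) prior, 8H/2T data
```

Here the posterior is Beta(10, 4), whose mean 10/14 ≈ 0.71 sits between the prior mean 0.5 and the empirical frequency 0.8, with the data pulling harder as the counts grow.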


L24 Lecture of [Nov 28, 2012]
Bayes prediction; Bayes learning on multi-parameter
networks. Statistical models for text. The positional model
and why it is infeasible. The unigram model and its plate
model. Brief discussion of the bigram model as an
extension. The topic model as another extension. The single-topic
model. Viewing topics as points in the word simplex. The plate
model for the single-topic model. Seeing the complete-data
learning case as the Naive Bayes classifier. Seeing
incomplete-data learning as fitting single topics for the
documents (and how either MLE or Bayes learning can be
used). Motivating LDA as an extension from single to
multiple topics per document.

L25 Lecture of [Dec 3, 2012]
Bayes learning of unigram and single-topic models. Discussion on the naive
Bayes classifier and why it works well as a classifier even
though it is lousy as a generative model. Documents as
distributions over topics: the progression of ideas from
LSA to PLSA to LDA. LSA and matrix factorization
methods. PLSA as providing a statistical version of LSA. LDA
as Bayesian learning of PLSA. Along the way, the importance
of having documents with more than one topic but not too
many topics, and how the Dirichlet prior can be set to ensure
this sparsity.

L26 Lecture of [Dec 5, 2012]
More discussion of LDA and its connections to LSA and PLSA; discussion
of generative vs. discriminative models, and how LDA,
being a generative model, can be embedded in other
generative models. Discussion of Yuheng's ET-LDA framework
for aligning events and tweets as an example of this
flexibility. Inference with LDA and collapsed Gibbs
sampling.

L27 Lecture of [Dec 10, 2012]
End of LDA. More discussion on generative vs. discriminative
models. Discussion on Markov Nets and how they compare to
Bayes Nets.
Subbarao Kambhampati
Last modified: Wed Nov 7 15:55:06 MST 2012