Some hints on the LSI question (HW 1.4)
- To: cse494-s01@asu.edu
- Subject: Some hints on the LSI question (HW 1.4)
- From: Subbarao Kambhampati <rao@asu.edu>
- Date: Sat, 17 Feb 2001 21:40:37 -0700
In doing the LSI question, remember that the singular values are the square roots of the eigenvalues of the term-term and doc-doc correlation matrices MM' and M'M respectively (where M' is the transpose of the term-doc matrix M).
Given a non-square M (which is most likely the case), the correlation matrices MM' and M'M will be of different sizes, but they will still have the same number of non-zero eigenvalues.
You can use this to your advantage by computing the eigenvalues of the smaller matrix (typically, if there are fewer terms than documents, then MM', the term correlation matrix, will be the smaller one).
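To make this concrete, here is a small numpy sketch of the idea (the toy term-doc matrix below is made up purely for illustration; it is not from the homework):

import numpy as np

# Toy term-doc matrix M: 3 terms (rows) x 5 documents (columns).
M = np.array([[1, 0, 1, 0, 1],
              [0, 1, 1, 1, 0],
              [1, 1, 0, 1, 1]], dtype=float)

# Fewer terms than documents here, so MM' (3x3) is smaller than M'M (5x5).
term_corr = M @ M.T

# Eigenvalues of the symmetric matrix MM' (returned in ascending order);
# the singular values of M are their square roots.
eigvals = np.linalg.eigvalsh(term_corr)
singular_values = np.sqrt(np.clip(eigvals, 0, None))[::-1]

# Cross-check against the singular values numpy computes directly from M.
print(singular_values)
print(np.linalg.svd(M, compute_uv=False))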
Also, while eigenvalues can in general be positive or negative (or even come in complex conjugate pairs), the eigenvalues of MM' and M'M are guaranteed to be >= 0 because these matrices are positive semi-definite (thus you don't have to worry about taking square roots of negative numbers).
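For the curious, the reason is a one-liner (written here in LaTeX notation, using ' for transpose as above): if x is an eigenvector of MM' with eigenvalue \lambda, then

\lambda\, x'x \;=\; x'MM'x \;=\; (M'x)'(M'x) \;=\; \|M'x\|^2 \;\ge\; 0,

and since x'x > 0, it follows that \lambda \ge 0.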
Rao
==========
Ps: for those of you curious folks...
notice that if M were a square matrix (num terms = num docs), then M can
have two types of decompositions:
M = K L K-
(where L is the diagonal matrix of eigenvalues of M, K is the matrix of
eigenvectors of M, and K- is the inverse of K)
as well as
M = U S V'
(which is the familiar singular value decomposition)
The first is called the spectral decomposition and the second the singular value decomposition. The SVD has the property that S will always be a diagonal matrix with non-negative diagonal entries, while the spectral decomposition may or may not have this property.
The SVD and the spectral decomposition of M will be the same only when M itself happens to be a symmetric positive semi-definite matrix (in fact, the SVD was developed precisely to handle matrices that are not positive semi-definite, or not even square).
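To see this numerically, here is a small numpy sketch (again with made-up example matrices): for a symmetric positive semi-definite matrix the eigenvalues and the singular values coincide, while for a general square matrix they need not.

import numpy as np

# Build a symmetric positive semi-definite matrix as A A' (PSD by construction).
A = np.array([[2., 0., 1.],
              [1., 1., 0.],
              [0., 1., 1.]])
P = A @ A.T

# For P, the eigenvalues and the singular values agree (up to ordering/round-off).
print(np.sort(np.linalg.eigvalsh(P))[::-1])
print(np.linalg.svd(P, compute_uv=False))

# For a general square matrix the two decompositions differ: this rotation matrix
# has complex eigenvalues, yet its singular values are both 1 (always >= 0).
B = np.array([[0., -1.],
              [1.,  0.]])
print(np.linalg.eigvals(B))
print(np.linalg.svd(B, compute_uv=False))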