
Things you were asked to do...(and will pay dearly if you don't :-)



Here are some of the things I said you should do at home before next class:

Readings:
 --Read the refresher reading(s) on eigenvalues and eigenvectors
 --Read the "fishy" informal introduction to Principal Components
    Analysis
  [Both of the above are from the refreshers]

 -- Start reading the paper on Latent Semantic Indexing (in the
readings list)


Things to do/questions to answer:

 --Does Google use non-obvious stopwords (i.e., words other than "the"
etc)?

 -- Does Google allow the user to supply weights to keywords (e.g. by
repeating a certain keyword more times in the query)?


--A related question, that is a bit harder to check right away, is--does
Google really take into account the frequency of a specific word in
the document? [Or do you think all those folks who want to increase
the hit rate for their pages by having fifteen million instances of
the word "Sex" (and its variations) in their pages are deluding
themselves?]


(--What should you ask "THE GOD (TM)" if it were to come by and offer
you a tricky boon, such that whatever you get will be given to you as
well as to all the other folks?)


Rao

ps: A minor error: When mentioning the Ln norms, where the nth norm of a
vector is defined as the nth root of the sum of the nth powers of the
absolute values of its coefficients, with the normal Euclidean norm
being the L2 norm, I said that the sum of absolute values norm is
called L0. Actually, it is called L1 (it is the n=1 case). The one with
the odd name is "Linfinity", the largest absolute value among the
coefficients (presumably the name looks odd because people have no
idea what the infinitieth root of x raised to infinity is... although
you can actually show that this makes sense in a limit-theoretic way:
the nth norm tends to the largest absolute coefficient as n grows).
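
If you want to check the norm definitions numerically, here is a minimal
Python sketch using the standard definitions (L1 is the sum of absolute
values, L2 is the Euclidean norm, Linfinity is the largest absolute
coefficient). The function names lp_norm and linf_norm and the sample
vector v are just illustrative choices of mine, not anything from the
readings.

```python
import math


def lp_norm(v, p):
    """Return the p-th root of the sum of p-th powers of |coefficients|."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def linf_norm(v):
    """Return the largest absolute coefficient (the 'Linfinity' norm)."""
    return max(abs(x) for x in v)


# Illustrative vector (an assumption, chosen only for easy arithmetic).
v = [3.0, -4.0, 1.0]

l1 = lp_norm(v, 1)  # sum of absolute values
l2 = lp_norm(v, 2)  # usual Euclidean length

# Distance between the p-norm and the max norm for growing p.
gaps = [abs(lp_norm(v, p) - linf_norm(v)) for p in (1, 2, 10, 100)]

print("L1 =", l1)
print("L2 =", l2)
print("Linfinity =", linf_norm(v))
print("gap to Linfinity for p = 1, 2, 10, 100:", gaps)
```

The gaps should shrink as p grows, which is the limit-theoretic sense in
which the "infinitieth" norm is just the largest absolute coefficient.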