
Re: HITS normalization




b> What does the normalization step do for me when I am doing the 
b> authority and hub scoring?  Can I do without that step?
b> 


Good question. 

If you don't do the normalization step, then the basic problem is that
the authority/hub values will depend on the initial values and will
grow without limit as the iterations continue.

If you recall the geometric interpretation of what is happening, each
iteration makes the initial vector's projection in the principal
eigenvector's direction longer and longer. So, even after the
resulting vector is virtually in the same direction as the principal
eigenvector, its length can (and will) continue to increase as we
continue iterating.

What this means is that the vectors from two different iterations will
be very different from each other. The only thing that converges is
their direction.
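
To make the growth concrete, here is a minimal sketch in plain Python.
The 4-page graph, the all-ones starting vector, and every function
name are assumptions made up for illustration; none of them come from
the course material. It runs the authority/hub updates with no
normalization and records, per iteration, the length of the authority
vector and its cosine with the previous iterate.

```python
import math

# Hypothetical 4-page web graph (an assumption, not from the original post):
# EDGES[i] lists the pages that page i links to.
EDGES = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
N = 4

def hits_step(hub):
    """One authority/hub update with no normalization."""
    auth = [0.0] * N
    for src, targets in EDGES.items():
        for dst in targets:
            auth[dst] += hub[src]  # authority = sum of hub scores linking in
    # hub = sum of the authority scores of the pages it links to
    new_hub = [sum(auth[dst] for dst in EDGES[src]) for src in range(N)]
    return auth, new_hub

def length(v):
    return math.sqrt(sum(x * x for x in v))

def cosine(u, v):
    return sum(a * b for a, b in zip(u, v)) / (length(u) * length(v))

def run_unnormalized(steps):
    """Return (length, cosine with previous iterate) of the authority vector per step."""
    hub = [1.0] * N
    prev_auth = [1.0] * N
    history = []
    for _ in range(steps):
        auth, hub = hits_step(hub)
        history.append((length(auth), cosine(prev_auth, auth)))
        prev_auth = auth
    return history

history = run_unnormalized(20)
for step in (0, 9, 19):
    vec_len, cos = history[step]
    print(f"iteration {step + 1}: length = {vec_len:.3e}, cosine = {cos:.12f}")
```

On this particular graph the length keeps growing from one iteration
to the next while the cosine settles toward 1, which is the behavior
described above: the direction converges, the vector itself does not.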

So to decide when to stop, you have to either (a) take the cosine of
the angle between the vectors from successive iterations and see when
it becomes close enough to 1, or (b) compare normalized versions of
the vectors (i.e., unit vectors).

The latter is a better idea because we also get the added benefit of
not being dependent on the initial vector.
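
Option (b) can be sketched as follows. Again, the graph, the tolerance
`tol`, the two starting vectors, and the function names are
hypothetical choices for illustration. Each iteration rescales the
authority and hub vectors to unit length and stops when successive
authority vectors agree to within `tol`. The sketch assumes the
starting hub vector has all-positive entries, so it is not orthogonal
to the principal eigenvector.

```python
import math

# Hypothetical 4-page web graph (an assumption, not from the original post):
# EDGES[i] lists the pages that page i links to.
EDGES = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
N = 4

def unit(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def hits_normalized(initial_hub, tol=1e-10, max_iter=1000):
    """Authority/hub iteration with normalization after every update."""
    hub = unit(initial_hub)
    auth = [0.0] * N
    for iteration in range(1, max_iter + 1):
        new_auth = [0.0] * N
        for src, targets in EDGES.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        new_auth = unit(new_auth)
        new_hub = unit([sum(new_auth[d] for d in EDGES[s]) for s in range(N)])
        # Unit vectors can be compared entry by entry to test convergence.
        change = max(abs(a - b) for a, b in zip(new_auth, auth))
        auth, hub = new_auth, new_hub
        if change < tol:
            return auth, hub, iteration
    return auth, hub, max_iter

# Two very different (all-positive) starting vectors.
auth_a, hub_a, iters_a = hits_normalized([1.0, 1.0, 1.0, 1.0])
auth_b, hub_b, iters_b = hits_normalized([5.0, 0.1, 2.0, 9.0])
print("from all-ones start:", [round(x, 6) for x in auth_a], "in", iters_a, "iterations")
print("from uneven start:  ", [round(x, 6) for x in auth_b], "in", iters_b, "iterations")
```

On this graph both runs end at essentially the same authority vector,
so the scores no longer carry any trace of the starting values, and
the simple entry-by-entry comparison works as a stopping test.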

rao

ps: here is a question--would the normalization step be important for
the PageRank algorithm? The answer is no. But why?