On using similarities to compute distances in Qn 5 [Re: Question About Question on Final]
At 05:12 PM 5/8/2001 -0700, you wrote:
>Heya,
> I've a question about question five, the k-means
>question. You say to show the cluster dissimilarity measure
>for each iteration, and define that as:
>"the sum of the similarities of docs from their respective
>cluster centers".
>This number increases if the documents in the cluster are
>all very similar to each other, and decreases if they are
>very dissimilar. This does not seem like a dissimilarity
>measure?
Since we are using similarities to represent distances, the general
idea is that distance is inversely proportional to similarity (the higher
the similarity, the lower the distance).
For this specific question, you can give either the "aggregate similarity
measure", which should keep increasing,
or an aggregate dissimilarity measure, where you define dissimilarity to
be, say, 1/(similarity).
Either one will be enough for our purpose.
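
As a rough illustration of the two options, here is a minimal sketch. It assumes cosine similarity between document vectors and cluster centers; the function names and the toy vectors are made up for illustration and are not part of the exam question.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def aggregate_similarity(docs, centers, assignment):
    """Sum of each doc's similarity to the center of its assigned cluster."""
    return sum(cosine(d, centers[c]) for d, c in zip(docs, assignment))

def aggregate_dissimilarity(docs, centers, assignment):
    """Sum of 1/(similarity) over all docs.

    Undefined when a doc has zero similarity to its center; a real
    implementation would need to handle that case.
    """
    return sum(1.0 / cosine(d, centers[c]) for d, c in zip(docs, assignment))

# Hypothetical toy data: two clusters, three docs.
docs = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
centers = [[1.0, 0.0], [0.0, 1.0]]
assignment = [0, 0, 1]  # index of the cluster each doc belongs to

print(aggregate_similarity(docs, centers, assignment))
print(aggregate_dissimilarity(docs, centers, assignment))
```

Reporting either number once per k-means iteration gives the per-iteration measure the question asks for.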
Rao