Correlation coefficient viewed as cosine theta measure..
- To: Rao Kambhampati <rao@asu.edu>
- Subject: Correlation coefficient viewed as cosine theta measure..
- From: Subbarao Kambhampati <rao@asu.edu>
- Date: Sat, 17 Apr 2010 07:23:21 -0700
- Sender: subbarao2z2@gmail.com
Although I didn't mention it explicitly in the class, the Pearson correlation coefficient can be seen as the vector (cosine) similarity between "centered rating vectors"
Suppose the two rating vectors are
[r11 r12 r13 r14]
and
[r21 r22 r23 r24]
Centering means subtracting the mean of the vector from the vector
let r1 be the mean of r11..r14 and r2 be the mean of r21 ..r24
then centered vectors are
[r11-r1 r12-r1 r13-r1 r14-r1]
[r21-r2 r22-r2 r23-r2 r24-r2]
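As a quick illustration of the centering step, here is a minimal Python sketch. The function name `center` and the sample ratings are made up for this example; they are not from the class.

```python
def center(ratings):
    """Subtract the mean of the vector from every entry of the vector."""
    mean = sum(ratings) / len(ratings)
    return [r - mean for r in ratings]

# Hypothetical rating vector with mean 3.25
first = [5, 3, 4, 1]
print(center(first))  # [1.75, -0.25, 0.75, -2.25]
```

Note that a centered vector always sums to zero, since the mean has been taken out of every entry.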
now if you take the cosine theta metric between these two vectors, you get
the dot product divided by the product of the norms of the two vectors.
dot product will be of the form [r11-r1]*[r21-r2]+ ...+[r14-r1]*[r24-r2]
this is the numerator of pearson correlation coefficient.
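the norm of the first vector is
sqrt[(r11-r1)^2 + ... + (r14-r1)^2]
which is sqrt(n) times the standard deviation of the first vector (n = 4 here), i.e. the square root of n times its variance.
the norm of the second vector has the same form, so the product of the two norms is the denominator of the pearson correlation coefficient (covariance divided by the product of the standard deviations, with the 1/n factors cancelling).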
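If you want to convince yourself numerically, the sketch below compares the cosine of the centered vectors with the Pearson coefficient computed the textbook way (covariance over the product of standard deviations). The helper names and the two rating vectors are assumptions made for this illustration only.

```python
import math
from statistics import fmean, pstdev

def centered(v):
    """Subtract the mean of v from each of its entries."""
    m = fmean(v)
    return [x - m for x in v]

def cosine(u, v):
    """Dot product divided by the product of the two norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pearson(u, v):
    """Population covariance divided by the product of population std devs."""
    mu, mv = fmean(u), fmean(v)
    cov = fmean([(a - mu) * (b - mv) for a, b in zip(u, v)])
    return cov / (pstdev(u) * pstdev(v))

# Hypothetical rating vectors
r1 = [5, 3, 4, 1]
r2 = [4, 2, 5, 1]

cos_centered = cosine(centered(r1), centered(r2))
rho = pearson(r1, r2)
print(cos_centered, rho)  # the two values agree up to floating-point rounding

# The norm of a centered vector is sqrt(n) times the standard deviation
norm_r1 = math.sqrt(sum(x * x for x in centered(r1)))
print(norm_r1, math.sqrt(len(r1)) * pstdev(r1))
```

Note that the plain cosine of the uncentered vectors will in general give a different number; it is the centering that turns cosine similarity into correlation.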
QED
Rao