[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

the (in)famous illustration of grading using K-means

To: cse494-f05@parichaalak.eas.asu.edu
Subject: the (in)famous illustration of grading using K-means
From: Subbarao Kambhampati <rao@asu.edu>
Date: Wed, 21 Sep 2005 16:11:42 -0700

Folks

In the class today, I mentioned about how I demonstrated K-means using the mid-term score the last time around. Here is the mail I sent to the mailing list at that time. Although the impact is not as great on you since it is not _your_ marks that are being clustered, you can still, I am sure, feel sympathetic pangs of understanding

(You may also note, from the example below, that the distribution of marks on my exams tend to have very high variance...)

yours in questionable taste
Rao

===============================

The midterm has been graded. The stats are: avg: 29  standard
deviation:14.5
33 people took the exam which was for 63 points (not 65, as i seem to
have miscounted)
There are  1 above 60
one between 50-60
4 between 40-50
8 between 40-30
8 between 30-20
7 between 20-10
4 below 10

I will return the exams at the end of the class on Monday. don't ask
me for marks before that.

Now I am sure you all want to know what would be "A", "B" etc cut offs
just for this exam.

I thought we can use the k-means algorithm to do clustering. After
all, the usual problem with k-means, that it requires us to
pre-specify number of clusters is not really a problem here since we are
going to have a fixed number of letter grades anyway.

So, I wrote up a little (lisp) program for k-means and ran it on the
list of 33 individual scores. Here, for your edification are the
results:

Case 1: 5 clusters (A,B,C,D,E)

case 1
USER(218): (k-means mlist 5  :key #'mark-val)

>>>>((61.5) (38) (32) (26) (17.5)) >>>>((61.5 55) (48 47.5 47.5 47.5 38 37 35) (34.5 32.5 32.5 32 30 29) (28 27 27 26 25.5 22.5) (20.5 19 18 17.5 17 13.5 13 11.5 9.5 8.5 7 4)) --Dissimilarity Measure:113.07143 >>>>((61.5 55) (48 47.5 47.5 47.5 38) (37 35 34.5 32.5 32.5 32 30 29) (28 27 27 26 25.5 22.5 20.5) (19 18 17.5 17 13.5 13 11.5 9.5 8.5 7 4)) --Dissimilarity Measure:97.791214 >>>>((61.5 55) (48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32 30) (29 28 27 27 26 25.5 22.5 20.5 19) (18 17.5 17 13.5 13 11.5 9.5 8.5 7 4)) --Dissimilarity Measure:88.91668 >>>>((61.5 55) (48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32 30) (29 28 27 27 26 25.5 22.5 20.5 19) (18 17.5 17 13.5 13 11.5 9.5 8.5 7 4)) --Dissimilarity Measure:88.91668


;;Notice that the k-means starts with some default cluster centers and
;;iterates until clusters stablize. So the last ">>>" above is the
;;cluster we get.
;;(if we stopped with the first iteration people with 34.5 would have
;;gotten A ;-)

 That dissimilairty measure is reduced from iteration to iteration
    in each run

[[For the rest of the examples--except Case 1', I don't show cluster dissimilarity measure]]


Case 2: 4 clusters (A,B,C,D) (assume I decided not to give Es)
USER(162): (k-means mlist 4 :key #'mark-val)

>>>>((61.5) (35) (27) (17.5)) >>>>((61.5 55) (48 47.5 47.5 47.5 38 37 35 34.5 32.5 32.5 32) (30 29 28 27 27 26 25.5 22.5) (20.5 19 18 17.5 17 13.5 13 11.5 9.5 8.5 7 4)) >>>>((61.5 55) (48 47.5 47.5 47.5 38 37 35 34.5) (32.5 32.5 32 30 29 28 27 27 26 25.5 22.5 20.5) (19 18 17.5 17 13.5 13 11.5 9.5 8.5 7 4)) >>>>((61.5 55) (48 47.5 47.5 47.5 38 37 35) (34.5 32.5 32.5 32 30 29 28 27 27 26 25.5 22.5 20.5) (19 18 17.5 17 13.5 13 11.5 9.5 8.5 7 4)) >>>>((61.5 55) (48 47.5 47.5 47.5 38 37) (35 34.5 32.5 32.5 32 30 29 28 27 27 26 25.5 22.5 20.5) (19 18 17.5 17 13.5 13 11.5 9.5 8.5 7 4)) >>>>((61.5 55) (48 47.5 47.5 47.5 38 37) (35 34.5 32.5 32.5 32 30 29 28 27 27 26 25.5 22.5) (20.5 19 18 17.5 17 13.5 13 11.5 9.5 8.5 7 4)) >>>>((61.5 55) (48 47.5 47.5 47.5 38 37) (35 34.5 32.5 32.5 32 30 29 28 27 27 26 25.5 22.5) (20.5 19 18 17.5 17 13.5 13 11.5 9.5 8.5 7 4))


Case 3: 4 clusters (A,B,C,D) but we remove the highest (61.5) from
consideration (assume that we give that person an A+ or just physically push
that person out of the class for the sake of collective happiness).

SER(208): (k-means (cdr mlist) 4 :key #'mark-val)

>>>>((55) (34.5) (27) (17)) >>>>((55 48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32) (30 29 28 27 27 26 25.5 22.5) (20.5 19 18 17.5 17 13.5 13 11.5 9.5 8.5 7 4)) >>>>((55 48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32) (30 29 28 27 27 26 25.5 22.5 20.5) (19 18 17.5 17 13.5 13 11.5 9.5 8.5 7 4)) >>>>((55 48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32) (30 29 28 27 27 26 25.5 22.5 20.5) (19 18 17.5 17 13.5 13 11.5 9.5 8.5 7 4))

As expected, with 61.5 tossed out of the class, a lot more people get
As ;-)


Case 1': Clustering with a different set of initial centers
 we will repeat case 3, except we start with different centers

case 1'
USER(219): (k-means mlist-r 5  :key #'mark-val)

>>>>((35) (32) (26) (18) (17)) >>>>((61.5 55 48 47.5 47.5 47.5 38 37 35 34.5) (32.5 32.5 32 30 29) (28 27 27 26 25.5 22.5) (20.5 19 18 17.5) (17 13.5 13 11.5 9.5 8.5 7 4)) --Dissimilarity Measure:117.0 >>>>((61.5 55 48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32 30 29) (28 27 27 26 25.5 22.5) (20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) --Dissimilarity Measure:82.19365 >>>>((61.5 55 48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32 30) (29 28 27 27 26 25.5 22.5) (20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) --Dissimilarity Measure:80.37619 >>>>((61.5 55 48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32) (30 29 28 27 27 26 25.5 22.5) (20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) --Dissimilarity Measure:78.55476 >>>>((61.5 55 48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32) (30 29 28 27 27 26 25.5) (22.5 20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) --Dissimilarity Measure:78.571434 >>>>((61.5 55 48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32) (30 29 28 27 27 26 25.5) (22.5 20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) --Dissimilarity Measure:78.571434

Compare this to case 1--see that we now converged to an entirely
different cluster just because we started from new centers

Note
* that the lowest dissimilarity attained depends on the original
    cluster centers. This is a consequence of the fact that K-means is
    a greedy algorithm and is not finding clusters with globally
    lowest cluster dissimilarities.

* It is nice to see that the clusters found in case 1' are better
    (according to the dissimilarity metric) than those found in case 1
   (because this means that giving more As is in fact a better
     idea according to k-means ;-)

case 2': repeat case 2 with different centers

USER(209): (k-means mlist-r 4 :key #'mark-val)

>>>>((35) (22.5) (18) (9.5)) >>>>((61.5 55 48 47.5 47.5 47.5 38 37 35 34.5 32.5 32.5 32 30 29) (28 27 27 26 25.5 22.5 20.5) (19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) >>>>((61.5 55 48 47.5 47.5 47.5 38 37 35 34.5) (32.5 32.5 32 30 29 28 27 27 26 25.5 22.5) (20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) >>>>((61.5 55 48 47.5 47.5 47.5 38 37) (35 34.5 32.5 32.5 32 30 29 28 27 27 26 25.5) (22.5 20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) >>>>((61.5 55 48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32 30 29 28 27 27 26 25.5) (22.5 20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) >>>>((61.5 55 48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32 30 29 28 27 27 26 25.5) (22.5 20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4))

;;interesting how going from 5 to four just merges old B and C...

case 3': repeat case 3 with different centers (we just removed 61.5)

USER(211): (k-means mlist-r-2 4 :key #'mark-val)

>>>>((35) (30) (22.5) (9.5)) >>>>((55 48 47.5 47.5 47.5 38 37 35 34.5 32.5 32.5) (32 30 29 28 27 27) (26 25.5 22.5 20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) >>>>((55 48 47.5 47.5 47.5 38 37) (35 34.5 32.5 32.5 32 30 29 28 27 27 26 25.5) (22.5 20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) >>>>((55 48 47.5 47.5 47.5 38) (37 35 34.5 32.5 32.5 32 30 29 28 27 27 26 25.5) (22.5 20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) >>>>((55 48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32 30 29 28 27 27 26 25.5) (22.5 20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4)) >>>>((55 48 47.5 47.5 47.5) (38 37 35 34.5 32.5 32.5 32 30 29 28 27 27 26 25.5) (22.5 20.5 19 18 17.5 17) (13.5 13 11.5 9.5 8.5 7 4))

Notice the non-local and somewhat non-intuitive effect on the
clusters.. A cutoff still remains 47.5, but B now got chopped up..

You would have thought more people will get better grades if the
highest guy gets thrown out... and yet...

In summary, you wouldn't trust your grades to K-means algorithm based
clustering because:

1. its clusters are sensitive to initial centers

2. its clusters are sensitive to outliers

3. deletion of elements might have non local effects on its clusters

I know I know. You see this whole thing as a joke in bad taste....

Rao
=============

Prev by Date: Homework 2 socket closed; Due October 3rd
Next by Date: Self-organizing maps vs. K-Means (optional)
Previous by thread: Homework 2 socket closed; Due October 3rd
Next by thread: Self-organizing maps vs. K-Means (optional)
Index(es):
- Date
- Thread