[Thinking Cap--Easter Resurrection] on Classification/clustering
- To: Rao Kambhampati <rao@asu.edu>
- Subject: [Thinking Cap--Easter Resurrection] on Classification/clustering
- From: Subbarao Kambhampati <rao@asu.edu>
- Date: Tue, 6 Apr 2010 16:46:54 -0700
- Cc: cse494-s10@rakaposhi.eas.asu.edu
- Sender: subbarao2z2@gmail.com
At the end of today's class, we saw that classification is, in some sense, a pretty easy extension of clustering: the training data with each label can be seen as forming one cluster. When a test point arrives, we just need to figure out which cluster it is closest to and assign it the label of that cluster.
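To make the approach above concrete, here is a minimal sketch. It assumes each cluster is summarized by the mean (centroid) of its training points and that "closest" means Euclidean distance; neither choice was fixed in class, and the names fit_centroids and classify are just illustrative.

```python
import math
from collections import defaultdict


def fit_centroids(points, labels):
    """Treat each label's training points as one cluster; return its centroid."""
    sums = {}
    counts = defaultdict(int)
    for point, label in zip(points, labels):
        if label not in sums:
            sums[label] = [0.0] * len(point)
        for i, value in enumerate(point):
            sums[label][i] += value
        counts[label] += 1
    # Assumption: a cluster is represented by the mean of its members.
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(centroids, x):
    """Assign x the label of the nearest centroid (Euclidean distance assumed)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


# Toy data, made up purely for illustration.
train = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
labels = ["a", "a", "b", "b"]
centroids = fit_centroids(train, labels)
print(classify(centroids, (2.0, 1.0)))  # nearest centroid is the one for "a"
```

Representing a whole cluster by a single centroid is only one way to read "closest cluster"; comparing against individual training points instead would be another.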
1. If classification is so darned straightforward, how come the entire field of machine learning is obsessed with more and more approaches for classification? What could possibly be wrong with the straightforward one we outlined? Can you list any problems our simple approach can run into? (Alternatively, it is fine to just decide that Jieping Ye and Huan Liu cannot leave well enough alone... :-)
2. If you listed some problems in 1 (as opposed to casting aspersions on Ye and Liu), can you comment on the ramifications of those problems for clustering itself? Or is clustering still pretty fine as it is?
rao