
If you are wondering whatever happened to the "mining" part of the course




The course title has the word "mining" in it, and in my introductory
lecture, I suggested that I would likely spend a third of the time on
mining. 

Mining, or learning interesting patterns from data, is a general
topic. "Data" mining assumes that you are trying to learn patterns
from structured data. One can also mine patterns from unstructured
data such as text or hyper-text, and the latter typically goes under
the name "web-mining". The foundations for much of mining come from
machine learning and pattern recognition. 

I admit that I was being wildly optimistic, and that one-third of the
course on mining never happened,  as the retrieval and integration
parts grew to displace it. 

While I did not spend time on datamining per se, we did consider
mining in several contexts as related to text and hyper-text. 

1. The work on A/H computation and page-rank computation can be seen
   as mining of the link structure of the web pages to discover
   interesting properties of the pages.

2. We talked about clustering techniques--especially as related to the
   idea of post-processing the results of a search engine. The k-means
   and agglomerative clustering techniques we discussed in the class
   are also part of unsupervised datamining techniques.

3. We talked, mostly in passing, about how classification techniques
   are used in topic-specific crawlers. (Class of 3/5). 
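
To make item 1 a bit more concrete, here is a minimal sketch of the
authorities/hubs iteration (reading "A/H" as authorities and hubs) on a
toy link graph. The function names, the dictionary representation of
the links, and the fixed iteration count are illustrative assumptions,
not the exact formulation used in class.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hubs_and_authorities(links, iterations=50):
    """Return (hub, auth) score dicts for a link graph.

    links maps each page to the list of pages it points to
    (a hypothetical representation chosen for this sketch).
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority of a page: sum of hub scores of pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        # Hub score of a page: sum of authority scores of pages it links to.
        new_hub = {p: sum(new_auth[d] for d in links.get(p, [])) for p in pages}
        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth
```

Calling hubs_and_authorities({"a": ["c"], "b": ["c"], "c": []}) gives
page "c" all of the authority weight and splits the hub weight evenly
between "a" and "b", which is the kind of "interesting property"
discovered purely from link structure.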

If you want to learn more about mining, you might consider the
following courses offered at ASU:

CSE 471 Intro to AI covers some basic classification learning algorithms.

Huan Liu offers a special topics course on Data Mining in CSE.

Kari Tarkkola offers a course on neural networks in EEE--which
basically covers pattern recognition methods. 

Kari is slated to offer the neural networks course and I recommend it
highly to anyone with a bit of a mathematical bent. I am teaching 471
next fall (unfortunately for all concerned ;-) 

Rao
[Apr 29, 2001]