CS 33510: Data Mining
Winter 2007
Course Description
Data mining, an emerging field at the intersection of machine
learning, statistics, and databases, is broadly defined as finding
novel and interesting patterns in large amounts of data. In this
research-oriented course, we survey data-mining techniques and
applications, emphasizing the database perspective. Major themes
include association rules, graph search and mining, information
extraction, and bioinformatics applications. The course involves an
independent data-mining project.
Prerequisites
- CS 235 or CSPP 53001 or equivalent (databases)
- Strong programming skills
- CS 270 or CSPP 55001 or equivalent (algorithms)
Course Staff
Svetlozar Nestorov is the instructor for this course.
Contact Info
- Office: Ry275-A
- Email: evtimov at cs.uchicago.edu
- Phone: 2-3497
- Office hours: by appointment
Textbook
There is no required textbook. You may consider getting some of the
following books on data mining:
- Principles of Data Mining by David J. Hand, Heikki Mannila, Padhraic Smyth
- Data Mining: Concepts and Techniques by Jiawei Han, Micheline Kamber
Project
Project proposals are due in class on Thursday, Feb 1, 2007. Project
presentations will be held during the last week of class (Mar 6-8,
2007).
Grading Policy
Grades will be based on class participation,
presentations, two quizzes, and projects. The first quiz will be on
Jan 25, 2007. The second quiz will be on Feb 22, 2007. Both quizzes
will be held in class and will last 20 minutes.
Data Mining Conferences
An exhaustive list.
Some (Old) News
API and Data
Background Papers
Papers
Association Rules
Two Applications of Association Rules in Bioinformatics
More Association Rules
An exhaustive list from Google Scholar (about 20,000 articles).
The Web Graph