CS 33510: Data Mining

Winter 2007

Course Description

Data mining, an emerging field at the intersection of machine learning, statistics, and databases, is broadly defined as finding novel and interesting patterns in large amounts of data. In this research-oriented course, we survey data-mining techniques and applications, emphasizing the database perspective. Major themes include association rules, graph search and mining, information extraction, and bioinformatics applications. The course involves an independent data-mining project.

Prerequisites

Course Staff

Svetlozar Nestorov is the instructor for this course.

Contact Info

Textbook

There is no required textbook. You may consider getting some of the following books on data mining:

Project

Project proposals are due in class on Thursday, Feb 1, 2007. Project presentations will be held during the last week of class (Mar 6-8, 2007).

Grading Policy

Grades will be based on class participation, presentations, two quizzes, and projects. The first quiz will be on Jan 25, 2007. The second quiz will be on Feb 22, 2007. Both quizzes will be held in class and will last 20 minutes.

Data Mining Conferences

An exhaustive list.

Some (Old) News

API and Data

Background Papers

Papers

Association Rules

Two Applications of Association Rules in Bioinformatics

More Association Rules

An exhaustive list from Google Scholar (about 20,000 articles).

The Web Graph

Information Extraction

Protein Networks

Classification