Data Mining II
Advanced data mining techniques: temporal pattern mining, network mining, boosting, discriminative models, generative models, data warehouses, and choosing mining algorithms.

IST (STAT) 558 Data Mining II (3)

This course is the second in a two-course sequence on data mining. It emphasizes advanced data mining concepts and techniques and their application to large-scale data warehouses. Building on the statistical foundations of data mining introduced in Data Mining I, the course covers advanced topics: mining association rules from large-scale data warehouses, hierarchical clustering, mining patterns from temporal data, semi-supervised learning, active learning, and boosting. In addition to the computational aspects of algorithm implementation, the course covers the architecture and implementation of data warehouses, data preprocessing (including data cleansing), and the choice of mining algorithms for particular applications. Alongside discriminative models such as CRFs and SVMs, the course introduces generative models such as Bayesian networks and LDA.

Each student will develop a term project that applies an advanced data mining algorithm to a multi-dimensional data set. Classes will include lectures, paper discussions, and project presentations. Paper discussions give students the opportunity to discuss state-of-the-art data mining literature. Project presentations enable students to share and compare project ideas with one another and to receive feedback from the instructor.