Description

Over the past decade, the amount of data has increased exponentially. This has led to the development of techniques for discovering useful and interesting information in large collections of data. This course aims to provide an overview of key data mining methods and techniques such as classification, clustering, and association rule mining. The course will also present application examples of data mining, especially in the fields of social media analysis, text analysis, and learning analytics.

General Information

Lecture Location
Innovation Hall 134
Lecture Time
Mondays 4:30-7:10 pm
Prerequisites
Programming experience in Python strongly preferred. Java or C will work as well. Students should be familiar with basic probability and statistics concepts, and linear algebra. Please expect programming in all the assignments and class projects.
Course Format
Lectures will be given by the instructor. Besides material from the textbook, topics not discussed in the book may also be covered. Research papers and handouts of material not covered in the book will be made available. Grading will be based on homework assignments, exams, and a project. Homework assignments will require intensive programming. Exams and homework assignments must be done on an individual basis. Any deviation from this policy will be considered a violation of the GMU Honor Code.
Learning Outcomes
As an outcome of taking this class, a student will be able to
- Understand the various classification, clustering, and association rule mining algorithms.
- Apply the data mining techniques learned to real world scientific and/or industrial applications.
Textbook
Pang-Ning Tan, Michael Steinbach, and Vipin Kumar. Introduction to Data Mining. Addison Wesley, 2006. Book Website: http://www-users.cs.umn.edu/~kumar/dmbook/index.php

Announcements

Presentations for Dec 12
12/06/16 10:03 AM

https://www.dropbox.com/request/7OTZCGDp67safc1Yjr34

Dear Presenters for Dec 12,

Please upload your presentations in PDF or PPTX/PPT via the link above by Dec 12, 10 am EST. 

Remember: Each presentation should be timed to be 10 minutes plus 2 minutes for questions.

Thanks for all the wonderful presentations last night. Don't forget to vote @91.

Huzefa

Project Presentation/Report Format & Guidelines
11/17/16 12:46 PM

Here is the grading rubric for the presentation: https://cs.gmu.edu/~hrangwal/node/366

Here is the grading rubric for the report: https://cs.gmu.edu/~hrangwal/node/365

Here is a sample suggestion of what should go in an 8-page (max) report: https://cs.gmu.edu/~hrangwal/files/project.pdf

Presentations on Dec 5 and Dec 12 + Ordering
11/16/16 11:44 AM

Here is the allocation for projects. Please let me know if I missed anyone. 

Thanks,

Huzefa

December 5, 2016 (10 projects)

  • Sneha Nagpaul
  • Veda Gaddam, Kalyani Rachakonda, Zain Ibrahim
  • Dan Robinson, Benjamin Lafko
  • Joseph Carneiro
  • Nitya Nair, Devika Ashok
  • Yujing Chen, H. Zhang
  • Yang Zhou
  • Li Zhang
  • Patcharaporn Munchupaiboon, Voravan Charnsawat
  • Ben Brumback
  • Rob Jarvis and Jeff Schneider

December 12, 2016 (11 projects)

  • Darron Fuller, S. Armstrong, N. Christian
  • Niharika Bitra
  • Bhavika Tekwani
  • Timothy Norton
  • Abhishek Kamath
  • Kyle Jackson, R. Troung
  • Ananya Sain Dhawan
  • Adithya Velayudham
  • Nathan Obert
  • Christian Contreras
  • Ting Zhang

Miner Survey (Please Fill)
11/15/16 1:42 PM

Survey Link for Evaluation of Miner: 

https://goo.gl/forms/YueGSMdQ6XDIXiuq2

Dear All,

For CS 584 Data Mining, I designed and developed miner, an analytics competition hosting framework that allowed submissions of prediction and clustering models for blind evaluation. My goals were to encourage you to learn the class topics through a "learning-by-doing" philosophy and to implement your own creative solutions for each of the four homework assignments.

I want to assess how this framework and the four homework assignments helped you learn about data mining and analytics tasks within the context of real-world applications.

I would like to hear from you: positives, negatives, or anything else.

Thank you for your time and for providing feedback as I develop and improve this service for future classes.


Huzefa

PS: This is an anonymous survey.

Mid-Term Reminder
10/17/16 9:39 AM

Dear All, 

Good luck on the mid-term exam today.

Huzefa

Project Proposal Guidelines
10/08/16 8:23 PM

Dear All, 

Please find attached the project proposal requirements. Proposals are due on 10/31/2016.

project_proposal_F2016.pdf

Thanks,

Huzefa

Readings for Exam on 10/17
10/04/16 1:30 PM

Exam on 10/17 will be closed*. Calculators OK. 

Readings from the textbook "Introduction to Data Mining" by Tan et al.

  • Chapters 1, 2, 4, 5 and 8

Sample Exam Questions from Prior Years: 

Suggested Problems to Solve from your textbook

Chapter 2: 13, 14, 16, 19

Chapter 4: 2, 3, 7

Chapter 5: 7, 8, 10


Although these are good practice problems to prepare for the exam, there is no guarantee that the same kinds of questions will be on the exam. Other topics may also be covered on the exam.

HW2 logs and labels
10/03/16 10:16 PM

Staff Office Hours

Name
Huzefa Rangwala
Office Hours
When? Where?

Name
Monjura Afrin Rumi
Office Hours
When? Where?

Homework

Lecture Notes
