Description

Humans have an uncanny ability to learn from their mistakes and adapt to new environments by relying on their past experience. Machine learning focuses on "How to write a computer program that can improve performance through experience?" Machine learning has a huge number of practical applications, especially in the present era of Big Data, where staggering volumes of diverse data in almost every facet of society, science, engineering, and commerce present opportunities for valuable discoveries. For example, machine learning is being used to understand financial markets, the impact of climate change on society, protein-protein interactions, and diseases. Its applications range from self-driving cars to never-ending language learning systems.

This course will focus on understanding the mathematical and statistical foundations of machine learning. We will also cover the core set of techniques and algorithms needed to understand the practical applications of machine learning. The course offers an integrated view of machine learning, statistics (classical and Bayesian), data mining, and information theory. A basic understanding of probability, statistics, algorithms, and linear algebra is expected. Familiarity with Python is required for homework assignments and for understanding in-class demonstrations.

The following topics will be covered:
Concept learning: Hypothesis Space, Version Space
Computational learning theory: VC Dimension, PAC Learning
Supervised Learning: Instance based learning, margin-based classifiers, linear classification models, neural networks
Kernel Methods: Support Vector Machines
Unsupervised Learning: Clustering, Mixture of Gaussians

General Information

Prerequisites
CSE 250 and any of EAS 305/308, STA 401/421, MTH 309; or permission of instructor.
Course Website
https://piazza.com/buffalo/spring2018/cse474/home

http://www.cse.buffalo.edu/~chandola/machinelearning.html

All announcements, course material, and related information will be communicated through the course website. Enrollment information will be emailed to your UBIT email address before the start of the class.
Contact Information
For the fastest response to any questions related to course content, assignments, or exams, post them on the Piazza forum. When emailing the instructor or a TA, please add CSE574 or CSE474 (as appropriate) at the beginning of the subject line.
Textbook
There is no required text for this course. Notes will be posted periodically on the course website. The following books are recommended as optional reading:
- Tom Mitchell, Machine Learning, McGraw-Hill, 1997.
- David MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press, 2003.
- Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012.
- Trevor Hastie, Robert Tibshirani, and Jerome Friedman, The Elements of Statistical Learning, Springer, 2009.
- Chris Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
- Richard Duda, Peter Hart, and David Stork, Pattern Classification, 2nd ed., John Wiley & Sons, 2001.
Grading Information
Course grades will be computed based on the following factors (subject to change):
1. Short weekly quizzes using Gradiance (12) -- 20%
2. Programming Assignments (3) -- 30%
3. Mid-term exam (in-class, open book/notes) on 03/16/2018 -- 20%
4. Final exam (in-class, open book/notes) on 05/16/2018 -- 30%
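
As a rough illustration of how the weights above combine, the following minimal sketch computes a raw weighted score from per-component percentages. The function name and the example scores are hypothetical and chosen only for illustration; the sketch also assumes each component is first expressed as a percentage (0-100). It does not reproduce how letter grades are assigned.

```python
# Component weights taken from the grading list above (they sum to 1.0).
WEIGHTS = {
    "quizzes": 0.20,      # short weekly Gradiance quizzes
    "programming": 0.30,  # programming assignments
    "midterm": 0.20,      # mid-term exam
    "final": 0.30,        # final exam
}


def weighted_score(scores):
    """Combine per-component percentages (0-100) into one raw course score."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student scores, for illustration only.
example_scores = {
    "quizzes": 90.0,
    "programming": 85.0,
    "midterm": 70.0,
    "final": 80.0,
}
raw = weighted_score(example_scores)
print(f"raw weighted score: {raw:.1f}")
```

The result is only the raw weighted percentage; the final letter grade also depends on overall class performance.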

Letter grades will range from F to A (with pluses and minuses). Grading will be done on a curve based on overall class performance.

Late submission and missed exam policy: No late days will be allowed for Gradiance quizzes, homework, or programming assignments. Make-up exams for the final exam will be administered only for University-approved reasons.

Submission: Students will need to use the UBLearns system to submit all assignments.
Gradiance
Here are the instructions for weekly quizzes on Gradiance.
1. Go to http://www.newgradiance.com/services.
2. A username will be sent to you before the start of the semester.
3. Register with that username and enter the class token FC4761F5.
4. A warm-up quiz will be posted at the beginning of the semester.

Each weekly Gradiance quiz will consist of around three multiple-choice questions that test your understanding of the concepts taught during that week. The quiz will be made available every Monday morning and will be due on Sunday at midnight. The secret is that each question involves a "long-answer" problem, which you should work through in full. Without solving the full problem, it will not be easy to get the right answer, because the Gradiance system presents a random selection of right and wrong answers each time you open the quiz. Solutions appear after the problem set is due. However, you must submit at least once so that your most recent submission appears with the solutions embedded.
Python
We will be using Python, IPython notebooks, and the scikit-learn library for all class demonstrations and programming assignments. To get familiar with Python, notebooks, and the related libraries, please refer to the following online resources:
-- Installing Python and IPython - http://ipython.org/install.html
-- Python IDE - https://store.enthought.com/downloads/#default
-- More about notebooks - http://ipython.org/notebook.html
-- Python for Developers, a complete book on Python programming by Ricardo Duarte - http://ricardoduarte.github.io/python-for-developers/
-- An introduction to machine learning with Python and scikit-learn (repo and overview) by Hannes Schulz and Andreas Mueller - http://nbviewer.ipython.org/github/temporaer/tutorial_ml_gkbionics/blob/master/2%20-%20KMeans.ipynb

Staff Office Hours
Name (office hour times and locations are not listed):
- Varun Chandola
- Xin MA
- Hongfei Xue
- Rudra Prasad Baksi