Introduction

What is machine learning?
"Machine learning is a subfield of artificial intelligence (AI) concerned with algorithms that allow computers to learn. What this means in most cases, is that an algorithm is given a set of data and infers information about the properties of the data--and that information allows it to make predictions about other data it might see in the future. This is possible because almost all nonrandom data contains patterns, and these patterns allow the machine to generalize. In order to generalize, it trains a model with what it determines are the important aspects of the data."
--Toby Segaran, Programming Collective Intelligence

A more concise definition:
Machine learning allows computers to observe input and produce a desired output, either by example or through identifying latent patterns in the input.
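
As a rough illustration of the two modes in this definition, the sketch below contrasts learning "by example" (a one-nearest-neighbor rule over labeled points) with "identifying latent patterns" (splitting unlabeled points into two groups). The function names, the toy data, and the choice of these two particular techniques are assumptions made for illustration only; they are not taken from the course materials.

```python
# Toy sketch only: function names and data are hypothetical.

def nearest_neighbor_predict(examples, x):
    """Learning by example: return the label of the closest labeled point."""
    closest_value, closest_label = min(examples, key=lambda pair: abs(pair[0] - x))
    return closest_label


def two_means(points, iterations=10):
    """Identifying latent patterns: split unlabeled 1-D points into two groups."""
    low, high = min(points), max(points)  # initial guesses for the two centers
    for _ in range(iterations):
        # Assign each point to the nearer center (ties go to the low group).
        group_low = [p for p in points if abs(p - low) <= abs(p - high)]
        group_high = [p for p in points if abs(p - low) > abs(p - high)]
        # Move each center to the mean of the points assigned to it.
        low = sum(group_low) / len(group_low)
        if group_high:
            high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)


if __name__ == "__main__":
    labeled = [(1.0, "short"), (1.2, "short"), (3.0, "long"), (3.4, "long")]
    print(nearest_neighbor_predict(labeled, 2.9))
    print(two_means([1.0, 1.2, 3.0, 3.4]))
```

In the first function the desired output is supplied with each training point; in the second, no labels are given and the grouping is inferred from the data alone.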

This course takes an application-driven approach to current topics in machine learning. The course covers supervised learning (classification, structured prediction, regression) and unsupervised learning (dimensionality reduction, Bayesian modeling, clustering). The course will also consider the challenges that arise when learning is applied in practice. We will cover popular algorithms (naive Bayes, SVM, perceptron, HMM, k-means, maximum entropy) and will focus on how statistical learning algorithms are applied to real-world applications. Students in the course will implement several learning algorithms and apply machine learning to an application as a final project.

Goals

The course has several goals:
  • Students will learn the fundamentals of machine learning
  • Students will learn to implement machine learning algorithms
  • Students will learn to evaluate how machine learning can be applied in different settings


Requirements

Students are expected to have:
  • Strong programming skills in Python. There will be considerable programming required for the homeworks.
  • Comfort with relevant mathematical topics (linear algebra, multivariate calculus, probability)


Grading

  • Homework: 30%
  • Programming assignments: 20%
  • Midterm: 20%
  • Final Exam: 30%


Homework

Since the focus of the course is on practical applications of machine learning, the bulk of the final grade comes from homework. Homeworks consist of both written problems and programming projects. Homeworks are to be turned in electronically; instructions will be provided when each homework is assigned. There will be about six homeworks during the semester.


Late Policy

Late homework assignments will be accepted up to 48 hours past the due date for a 50% reduction in grade. However, every student has a 72-hour grace period, covering the entire semester, that can be used to hand in homeworks late without penalty. This means that you can choose to hand in the first homework 70 hours late and the second homework 2 hours late, but then every other homework must be on time for the rest of the semester. You may divide these 72 hours as you see fit, but once you have used up all of the time, you will be given no more.
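
The arithmetic of the grace period can be sketched in a few lines of Python. The policy above does not say how grace hours and the 50% penalty interact when a submission is only partly covered, so this sketch assumes that grace hours are spent first and that the 48-hour window applies to whatever lateness remains; it also assumes work later than that receives no credit. The function and constant names are hypothetical, and the instructor's actual bookkeeping may differ.

```python
GRACE_BUDGET_HOURS = 72   # from the policy: penalty-free hours per semester
LATE_WINDOW_HOURS = 48    # from the policy: late work accepted up to 48 hours
LATE_MULTIPLIER = 0.5     # from the policy: 50% reduction in grade


def apply_late_policy(hours_late, grace_remaining):
    """Return (grade_multiplier, grace_remaining_after) for one homework.

    ASSUMPTION (not stated in the policy): grace hours are spent first, and
    the 48-hour window is measured on the lateness left after grace hours.
    """
    grace_used = min(hours_late, grace_remaining)
    uncovered = hours_late - grace_used
    grace_remaining -= grace_used
    if uncovered == 0:
        multiplier = 1.0              # on time, or fully covered by grace hours
    elif uncovered <= LATE_WINDOW_HOURS:
        multiplier = LATE_MULTIPLIER  # accepted with the 50% reduction
    else:
        multiplier = 0.0              # ASSUMPTION: no credit past the window
    return multiplier, grace_remaining


if __name__ == "__main__":
    # The example from the policy: 70 hours late, then 2 hours late.
    grace = GRACE_BUDGET_HOURS
    for hours in (70, 2, 5):
        multiplier, grace = apply_late_policy(hours, grace)
        print(hours, multiplier, grace)
```

Under these assumptions the first two submissions in the example keep full credit and exhaust the grace period, and a later submission that is 5 hours late receives the 50% reduction.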