About the content
Case Studies: Analyzing Sentiment & Loan Default Prediction In our case study on analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information,...). In our second case study for this course, loan default prediction, you will tackle financial data, and predict when a loan is likely to be risky or safe for the bank. These tasks are an examples of classification, one of the most widely used areas of machine learning, with a broad array of applications, including ad targeting, spam detection, medical diagnosis and image classification. In this course, you will create classifiers that provide state-of-the-art performance on a variety of tasks. You will become familiar with the most successful techniques, which are most widely used in practice, including logistic regression, decision trees and boosting. In addition, you will be able to design and implement the underlying algorithms that can learn these models at scale, using stochastic gradient ascent. You will implement these technique on real-world, large-scale machine learning tasks. You will also address significant tasks you will face in real-world applications of ML, including handling missing data and measuring precision and recall to evaluate a classifier. This course is hands-on, action-packed, and full of visualizations and illustrations of how these techniques will behave on real data. We've also included optional content in every module, covering advanced topics for those who want to go even deeper! Learning Objectives: By the end of this course, you will be able to: -Describe the input and output of a classification model. -Tackle both binary and multiclass classification problems. -Implement a logistic regression model for large-scale classification. -Create a non-linear model using decision trees. -Improve the performance of any model using boosting. -Scale your methods with stochastic gradient ascent. -Describe the underlying decision boundaries. -Build a classification model to predict sentiment in a product review dataset. -Analyze financial data to predict loan defaults. -Use techniques for handling missing data. -Evaluate your models using precision-recall metrics. -Implement these techniques in Python (or in the language of your choice, though Python is highly recommended).
- Week 1 - Welcome!
Classification is one of the most widely used techniques in machine learning, with a broad array of applications, including sentiment analysis, ad targeting, spam detection, risk assessment, medical diagnosis and image classification. The core goal of classifi...
- Week 1 - Linear Classifiers & Logistic Regression
Linear classifiers are amongst the most practical classification methods. For example, in our sentiment analysis case-study, a linear classifier associates a coefficient with the counts of each word in the sentence. In this module, you will become proficient i...
- Week 2 - Learning Linear Classifiers
Once familiar with linear classifiers and logistic regression, you can now dive in and write your first learning algorithm for classification. In particular, you will use gradient ascent to learn the coefficients of your classifier from data. You first will ne...
- Week 2 - Overfitting & Regularization in Logistic Regression
As we saw in the regression course, overfitting is perhaps the most significant challenge you will face as you apply machine learning approaches in practice. This challenge can be particularly significant for logistic regression, as you will discover in this m...
- Week 3 - Decision Trees
Along with linear classifiers, decision trees are amongst the most widely used classification techniques in the real world. This method is extremely intuitive, simple to implement and provides interpretable predictions. In this module, you will become familiar...
- Week 4 - Preventing Overfitting in Decision Trees
Out of all machine learning techniques, decision trees are amongst the most prone to overfitting. No practical implementation is possible without including approaches that mitigate this challenge. In this module, through various visualizations and investigatio...
- Week 4 - Handling Missing Data
Real-world machine learning problems are fraught with missing data. That is, very often, some of the inputs are not observed for all data points. This challenge is very significant, happens in most cases, and needs to be addressed carefully to obtain great per...
- Week 5 - Boosting
One of the most exciting theoretical questions that have been asked about machine learning is whether simple classifiers can be combined into a highly accurate ensemble. This question lead to the developing of boosting, one of the most important and practical ...
- Week 6 - Precision-Recall
In many real-world settings, accuracy or error are not the best quality metrics for classification. You will explore a case-study that significantly highlights this issue: using sentiment analysis to display positive reviews on a restaurant website. Instead of...
- Week 7 - Scaling to Huge Datasets & Online Learning
With the advent of the internet, the growth of social media, and the embedding of sensors in the world, the magnitudes of data that our machine learning algorithms must handle have grown tremendously over the last decade. This effect is sometimes called "Big D...
Amazon Professor of Machine Learning
Computer Science and Engineering
Amazon Professor of Machine Learning
Founded in 1861, the University of Washington is one of the oldest state-supported institutions of higher education on the West Coast and is one of the preeminent research universities in the world.
Coursera is a digital company offering massive open online course founded by computer teachers Andrew Ng and Daphne Koller Stanford University, located in Mountain View, California.
Coursera works with top universities and organizations to make some of their courses available online, and offers courses in many subjects, including: physics, engineering, humanities, medicine, biology, social sciences, mathematics, business, computer science, digital marketing, data science, and other subjects.