Summary



Learn the theory and application of basic data analysis methods developed either for forming new concepts (principal components and clusters) or for finding interesting correlations (regression and classification). These topics are preceded by a thorough analysis of 1D and 2D data.

Syllabus

Week 1. Introduction: examples of data and data analysis problems; visualization.


Week 2. 1D analysis. Feature scales. Histograms. Two common histogram shapes: Gaussian and power law. Central values. Minkowski distance and the data recovery view. Validation with the bootstrap.


Weeks 3-4. 2D analysis cases

(Both features quantitative: scatter plot, linear regression, correlation and determinacy coefficients, their meaning and properties. Both features nominal: contingency table, Quetelet index, Pearson chi-squared coefficient, its double meaning and visualization.)

Weeks 5-6. Learning multivariate correlations

(Bayes approach and Naïve Bayes classifier with a bag-of-words text model; decision trees and criteria for building them.)


Week 7. Principal components (PCA) and SVD

(SVD model behind PCA: student marks as the product of subject factor scores and subject loadings. Application to deriving a hidden underlying factor. Data visualization with PCA. Conventional PCA and data normalization issues.)


Week 8. Clustering with k-means

(K-Means iterations and K-Means features. K-Means criterion. Anomalous clusters and intelligent K-Means.)


Instructors

  • Boris Mirkin - Department of Data Analysis and Artificial Intelligence

Course designer

National Research University - Higher School of Economics (HSE) is one of the top research universities in Russia. Established in 1992 to promote new research and teaching in economics and related disciplines, it now offers programs at all levels of university education across an extraordinary range of fields of study including business, sociology, cultural studies, philosophy, political science, international relations, law, Asian studies, media and communications, IT, mathematics, engineering, and more.

Platform

Coursera is an online education company offering massive open online courses. It was founded by computer science professors Andrew Ng and Daphne Koller of Stanford University and is based in Mountain View, California.

What most sets it apart from other MOOC platforms is that it works only with the world's leading universities and organizations and distributes their content on the web.

