Exploratory Data Analysis
4 sequences
Level: Introductory
Language: English
Subtitles: Vietnamese, Chinese
1 point
Users' reviews
4.2
89 reviews

Key information

Free access
Fee-based certificate

About the content

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.


Syllabus

  • Week 1
    This week covers the basics of analytic graphics and the base plotting system in R. We've also included some background material to help you install R if you haven't done so already.
  • Week 2
    Welcome to Week 2 of Exploratory Data Analysis. This week covers some of the more advanced graphing systems available in R: the Lattice system and the ggplot2 system. While the base graphics system provides many important tools for visualizing data, it was par...
  • Week 3
    Welcome to Week 3 of Exploratory Data Analysis. This week covers some of the workhorse statistical methods for exploratory analysis. These methods include clustering and dimension reduction techniques that allow you to make graphical displays of very high dime...
  • Week 4
    This week, we'll look at two case studies in exploratory data analysis. The first involves the use of cluster analysis techniques, and the second is a more involved analysis of some air pollution data. How one goes about doing EDA is often personal, but I'm pr...

Instructors

Roger D. Peng, PhD
Associate Professor, Biostatistics
Bloomberg School of Public Health

Jeff Leek, PhD
Associate Professor, Biostatistics
Bloomberg School of Public Health

Brian Caffo, PhD
Professor, Biostatistics
Bloomberg School of Public Health


Content designer

Johns Hopkins University
The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for life-long learning, to foster independent and original research, and to bring the benefits of discovery to the world.

Platform

Coursera

Coursera is a company offering massive open online courses (MOOCs). It was founded by Stanford University computer science professors Andrew Ng and Daphne Koller and is located in Mountain View, California.

Coursera works with top universities and organizations to make some of their courses available online. It offers courses in many subjects, including physics, engineering, humanities, medicine, biology, social sciences, mathematics, business, computer science, digital marketing, and data science.

Reviews
4.2/5 average
5 stars: 48
4 stars: 23
3 stars: 10
2 stars: 6
1 star: 2
Content: 4.2/5
Platform: 4.2/5
Animation: 4.2/5
Best review

Highly recommended course for budding data scientists. I loved the Johns Hopkins University pedagogy and peer review system. The content is great.

Published on February 18, 2018
On February 22, 2018

This is a very good course, but at times it felt like the instruction was to do things mechanically without understanding the motivation. Perhaps this should come after, or in conjunction with, Statistical Inference.

On February 18, 2018

Highly recommended course for budding data scientists. I loved the Johns Hopkins University pedagogy and peer review system. The content is great.

On February 12, 2018

This is the worst of the Data Science courses so far (they've all been pretty good up to this point). It's called Exploratory Data Analysis, but is actually all about the graphics systems in R. And it does a botched job on those as well. All quizzes and assignments are about the graphics systems. The only portion of the course that deviates from that is Week 3 (for which there is no quiz or project) where we "learn" about clustering and dimension reduction. However, that material is presented really poorly: not enough depth for someone who is already familiar with the subject matter; and not nearly well enough explained for newbies. On the graphics side, none of the systems is explored in great depth. The lattice system is essentially just mentioned in passing. To cap it all off, the brief for the last assignment is really ambiguous, which often causes perfectly valid work to be graded poorly by peers. (Just look at the forums, if you need proof.)

On February 6, 2018

This was a great course. I learned how to use several graphic systems within R, and to imagine how to make clear answers to questions using plots.

On January 7, 2018

It's a very good course. Week 3 was a little bit more challenging than expected, as was assignment 2, but you get a good idea of how to use all the different plotting systems.