Introduction to Apache Spark and AWS
4 sequences
Level: Introductory
Language: English
160 points

Key Information

Free access
Free certificate
20 hours in total

About the content

Learn to analyze big data using Apache Spark's distributed computing framework. 

In a series of focused, practical tasks, you will start by launching a Spark cluster on Amazon's EC2 cloud computing platform. As you progress to working with real data, you will gain exposure to a variety of useful tools, including RDFLib and SPARQL.
The practical tasks on this course make use of Project Gutenberg data, the world's largest open collection of ebooks. This offers no end of opportunity for highly engaging and novel analyses.

As the taught material and example code are given in Python, it is strongly recommended that all students have previous Python programming experience. Furthermore, launching and interacting with a cluster on EC2 requires basic knowledge of the Unix command line, and some experience with a command-line editor such as vim or nano would also be advantageous.

With these minimal prerequisites, this course is designed to get you up and running in Spark as quickly and painlessly as possible, so that by the end, you will be comfortable and competent enough to start engineering your own big data solutions.



Getting Started in Spark on EC2

Reading and Writing Data

Tools for Working with Data

Programming in Spark



  • Dr Sorrel Harriet, Lecturer
  • Christophe Rhodes, Senior Lecturer

Content Designer

University of London
The University of London is a federal university which includes 17 world-leading Colleges. Our International Programmes were founded in 1858 and have enriched the lives of thousands of students, delivering high-quality University of London degrees wherever our students are across the globe. Our alumni include 7 Nobel Prize winners. Today, we are a global leader in distance and flexible study, offering degree programmes to over 50,000 students in over 180 countries. To find out more about studying for one of our degrees where you are, search for 'London International'.



Coursera is a company offering massive open online courses (MOOCs), founded by Stanford University computer science professors Andrew Ng and Daphne Koller and located in Mountain View, California.

Coursera works with top universities and organizations to make some of their courses available online, and offers courses in many subjects, including physics, engineering, humanities, medicine, biology, social sciences, mathematics, business, computer science, digital marketing, and data science.
