Managing Big Data with R and Hadoop

Closed
Course
English
20 h
Source
  • From www.futurelearn.com
More info
  • 5 Sequences
  • Introductory level
  • Starts on May 5, 2019
  • Ends on June 8, 2019

Course details

Syllabus

What topics will you cover?

  1. Welcome to BIG DATA
  2. Working with Hadoop
  3. First steps in R and RHadoop
  4. Statistical learning with RHadoop: clustering
  5. Statistical learning with RHadoop: regression and classification

Prerequisite

This course is designed for people who are interested in data science, computational statistics and machine learning and have some basic experience with them. It will also be useful for advanced undergraduate students and first-year PhD students in data analysis, statistics or bioinformatics who wish to understand how to manage big data with Hadoop using the R programming language.

We expect that learners will also have basic experience with Linux, bash and R and will be able to download and run a virtual machine.

What software or tools do you need?

All software needed to actively participate in the course is provided within the virtual machine, which learners are expected to download and run on their local machine. No extra software is needed. You will need a modest local machine with 15 GB of free disk space and 2 GB of RAM.

Instructors

Janez Povh
I am an active researcher in mathematical optimization, which has many applications in data science and where HPC is an indispensable tool.


Biljana Mileva Boshkoska
Biljana Mileva Boshkoska is an assistant professor in computer science. Her interests include decision support systems, data mining and working with big data.


Leon Kos
Leon Kos has more than 25 years of experience using the Linux desktop daily to build digital relationships for research, teaching, and getting the job done by programming.

Editor

The Partnership for Advanced Computing in Europe (PRACE) is an international non-profit association with its seat in Brussels. The PRACE Research Infrastructure provides a persistent world-class high performance computing service for scientists and researchers from academia and industry in Europe.

The computer systems and their operations accessible through PRACE are provided by 5 PRACE members (BSC representing Spain, CINECA representing Italy, CSCS representing Switzerland, GCS representing Germany and GENCI representing France). The Implementation Phase of PRACE receives funding from the EU’s Seventh Framework Programme (FP7/2007-2013) under grant agreement RI-312763 and from the EU’s Horizon 2020 Research and Innovation Programme (2014-2020) under grant agreement 653838.

Platform

FutureLearn is a massive open online course (MOOC) learning platform founded in December 2012.

It is a company launched and wholly owned by The Open University in Milton Keynes, England. It is the first UK-led massive open online course learning platform. As of March 2015 it included 54 UK and international university partners and, unlike similar platforms, four non-university partners: the British Museum, the British Council, the British Library and the National Film and Television School.
