Introduction to Apache Hadoop
15 sequences
Level: Introductory
Language: English
27 points

Key information

Free course
Paid certification
45 hours of coursework

Summary

Everywhere you look today, enterprises are embracing big data-driven customer relationships and building innovative solutions based on insights gained from data. According to IBM, every day we create 2.5 quintillion bytes of data — so much that 90% of the data in the world today has been created in the last two years. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, just to name a few. This data is big data.

The demand for storing this unprecedented amount of information is enough of a challenge, but when you add the need for analytics, the technology requirements truly start pushing the envelope on state-of-the-art IT infrastructures. Fortunately, the Open Source community has stepped up to this challenge and developed a storage and processing layer called Apache Hadoop. Add the dozens of other projects integrating with Apache Hadoop and you have the whole Hadoop ecosystem.

The Hadoop ecosystem, along with the data management architectures it enables, is growing at an unprecedented rate, with 73% of Hadoop cluster deployments now in production — a number which continues to rise.

The demand for individuals who have experience managing this platform is also accelerating. According to the IT Skills and Certifications Pay Index research from Foote Partners, “the need for big data skills also continues to lead to pay increases — about 8% over the last year.” Now is exactly the right time to build an exciting and rewarding career managing big data with Apache Hadoop.

This introductory course is taught by Hadoop experts from The Linux Foundation’s ODPi collaborative project. As host to some of the world's leading open source projects, The Linux Foundation provides training and networking opportunities to help you advance your career.

This course is perfect for IT professionals seeking a high-level overview of Hadoop, and who want to find out if a Hadoop-driven big data strategy is the right solution to meet their data retention and analytics needs. This course will also help anyone who wants to set up a small-scale Hadoop test environment to gain experience working with this exciting open source technology.

  • The origins of Apache Hadoop and its big data ecosystem
  • Deploying Hadoop in the clustered environment of a modern enterprise IT organization
  • Building data lake management architectures around Apache Hadoop
  • Leveraging the YARN framework to effectively enable heterogeneous analytical workloads on Hadoop clusters
  • Leveraging Apache Hive for an SQL-centric view into the enterprise data lake
  • An introduction to managing key Hadoop components (HDFS, YARN and Hive) from the command line
  • Securing and scaling your data lakes in multi-tenant enterprise environments

Prerequisites

  • Experience with Linux
  • Basic familiarity with Java applications

Syllabus

Welcome and Introduction
Enterprise Data Management: From Relational Databases to Hadoop “Data Lakes”
Understanding Hadoop
Deploying Hadoop
Using Your Hadoop Cluster for Data Management and Analytics
Securing Hadoop
Where to Go from Here
Final Exam

Instructors

Roman Shaposhnik

Course creator

The Linux Foundation

The platform

EdX is an online learning platform for massive open online courses (MOOCs). It hosts university-level online courses and makes them freely available worldwide. It also conducts research on online learning and on how people use the platform. EdX is a nonprofit, and the platform runs on open source software.

EdX was founded by the Massachusetts Institute of Technology and Harvard University in May 2012. In 2014, about 50 schools, associations, and international organizations offered or planned to offer courses on EdX. As of July 2014, it had more than 2.5 million users taking more than 200 online courses.

The two American universities that fund the platform have invested USD 60 million in its development. The France Université Numérique platform uses Open edX technology, which is supported by Google.
