High Performance Computing
Source: www.udacity.com
24 sequences
Level: Introductory
Language: English
1 point
Reviews
0 reviews

Key information

Free access

About the content

The goal of this course is to give you solid foundations for developing, analyzing, and implementing parallel and locality-efficient algorithms. This course focuses on theoretical underpinnings. To give a practical feeling for how algorithms map to and behave on real systems, we will supplement algorithmic theory with hands-on exercises on modern HPC systems, such as Cilk Plus or OpenMP on shared memory nodes, CUDA for graphics co-processors (GPUs), and MPI and PGAS models for distributed memory systems.

This course is a graduate-level introduction to scalable parallel algorithms. “Scale” really refers to two things: efficiency as the problem size grows, and efficiency as the system size (measured in numbers of cores or compute nodes) grows. To really scale your algorithm in both of these senses, you need to be smart about reducing asymptotic complexity the way you’ve done for sequential algorithms since CS 101; but you also need to think about reducing communication and data movement. This course is about the basic algorithmic techniques you’ll need to do so.

The techniques you’ll encounter cover the main algorithm design and analysis ideas for three major classes of machines: multicore and manycore shared memory machines, via the work-span model; distributed memory machines like clusters and supercomputers, via network models; and sequential or parallel machines with deep memory hierarchies (e.g., caches). You will see these techniques applied to fundamental problems, like sorting, search on trees and graphs, and linear algebra, among others.

The practical aspect of this course is implementing the algorithms and techniques you’ll learn to run on real parallel and distributed systems, so you can check whether what appears to work well in theory also translates into practice. (Programming models you’ll use include Cilk Plus, OpenMP, and MPI, and possibly others.)


Course syllabus

The course topics are centered on three different ideas or extensions to the usual serial RAM model you encounter in CS 101. Recall that a serial RAM assumes a sequential or serial processor connected to a main memory.

* Unit 1: The work-span or dynamic multithreading model. In this model, the idea is that there are multiple processors connected to the main memory. Since they can all “see” the same memory, the processors can coordinate and communicate via reads and writes to that “shared” memory. Sub-topics include:
** Intro to the basic algorithmic model
** Intro to OpenMP, a practical programming model
** Comparison-based sorting algorithms
** Scans and linked list algorithms
** Tree algorithms
** Graph algorithms, e.g., breadth-first search

* Unit 2: Distributed memory or network models. In this model, the idea is that there is not one serial RAM, but many serial RAMs connected by a network. Each serial RAM’s memory is private and not visible to the other RAMs; consequently, the processors must coordinate and communicate by sending and receiving messages. Sub-topics include:
** The basic algorithmic model
** Intro to the Message Passing Interface, a practical programming model
** Reasoning about the effects of network topology
** Dense linear algebra
** Sorting
** Sparse graph algorithms
** Graph partitioning

* Unit 3: Two-level memory or I/O models. In this model, we return to a serial RAM, but instead of having only a processor connected to a main memory, there is a smaller but faster scratchpad memory in between the two. The algorithmic question here is how to use the scratchpad effectively, in order to minimize costly data transfers from main memory. Sub-topics include:
** Basic models
** Efficiency metrics, including “emerging” metrics like energy and power
** I/O-aware algorithms
** Cache-oblivious algorithms
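
As a rough illustration of the work-span model named in Unit 1, the sketch below sums a list by divide and conquer and counts two quantities: the work (total number of additions) and the span (the longest chain of additions that must happen one after another if the two halves could run in parallel). It is not taken from the course materials; the function name `reduce_work_span` and the choice of counting one unit per addition are assumptions made for this example, and the code runs serially, only modelling the parallel cost.

```python
def reduce_work_span(values):
    """Sum `values` by divide and conquer.

    Returns (total, work, span), where work counts every addition and
    span counts additions along the longest dependency chain, assuming
    the two recursive calls could execute in parallel.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    if n == 1:
        # A single element needs no additions.
        return values[0], 0, 0

    mid = n // 2
    left_total, left_work, left_span = reduce_work_span(values[:mid])
    right_total, right_work, right_span = reduce_work_span(values[mid:])

    # Work adds up across both halves, plus one addition to combine them.
    work = left_work + right_work + 1
    # Span follows only the slower half, plus the combining addition.
    span = max(left_span, right_span) + 1
    return left_total + right_total, work, span


if __name__ == "__main__":
    total, work, span = reduce_work_span(list(range(1, 9)))
    print(total, work, span)
```

For an input of length n that is a power of two, this counts n - 1 additions of work and log2(n) levels of span, which is why such a reduction is considered to parallelize well: the work matches the serial algorithm while the span grows only logarithmically.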

Instructors

  • Rich Vuduc - Rich Vuduc is an associate professor in the School of Computational Science and Engineering (CSE) at Georgia Tech. His research is in the area of high-performance computing. This year, Professor Vuduc is also serving as both the Associate Chair of Academic Affairs in the School of CSE and as the Director of CSE Programs. Research: The HPC Garage [@hpcgarage]. Professor Vuduc’s lab is developing automated tools and techniques to tune, analyze, and debug software for parallel machines, including emerging high-end multi/manycore architectures and accelerators. The lab focuses on applying these methods to CSE applications, which include computer-based simulation of natural and engineered systems and data analysis.

Content designer

Georgia Institute of Technology

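The Georgia Institute of Technology, also known as Georgia Tech or GT, is a coeducational public research university located in Atlanta, Georgia, USA. It is part of the University System of Georgia. Georgia Tech has offices in Savannah (Georgia, USA), Metz (France), Athlone (Ireland), Shanghai (China), and Singapore.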

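Georgia Tech's engineering and computer science programs rank among the best in the world and enjoy an excellent reputation. Georgia Tech also offers programs in science, architecture, humanities, and management.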


Platform

Udacity

Udacity is a company founded by Sebastian Thrun, David Stavens, and Mike Sokolsky that offers massive open online courses.

According to Thrun, the name Udacity comes from the company's desire to be "audacious for you, the student". Although Udacity originally focused on offering university-style courses, the platform now focuses more on training aimed at professionals.
