About the content
The world and internet are full of textual information. We search for information using textual queries and read websites, books and e-mails.
These are all strings from a computer science point of view. To make sense of all this information and make search efficient, search engines use many string algorithms. Moreover, the emerging field of personalized medicine uses many search algorithms to find disease-causing mutations in the human genome.
In this course, part of the Algorithms and Data Structures MicroMasters program, you will learn about:
- suffix trees;
- suffix arrays;
- how other brilliant algorithmic ideas help doctors to find differences between genomes;
- power lightning-fast Internet searches.
- Key ideas for pattern matching and suffix trees
- Suffix arrays
- Burrows-Wheeler Transform for compression
- Applications of string algorithms in bioinformatics
Weeks 1 and 2: Suffix Trees
How would you search for a longest repeat in a string in LINEAR time? In 1973, Peter Weiner came up with a surprising solution that was based on suffix trees, the key data structure in pattern matching. Computer scientists were so impressed with his algorithm that they called it the Algorithm of the Year. In this lesson, we will explore some key ideas for pattern matching that will - through a series of trials and errors - bring us to suffix trees.
Week 3 and 4: Burrows-Wheeler Transform and Suffix Arrays
Although EXACT pattern matching with suffix trees is fast, it is not clear how to use suffix trees for APPROXIMATE pattern matching. In 1994, Michael Burrows and David Wheeler invented an ingenious algorithm for text compression that is now known as Burrows-Wheeler Transform. They knew nothing about genomics, and they could not have imagined that 15 years later their algorithm will become the workhorse of biologists searching for genomic mutations. But what text compression has to do with pattern matching??? In this lesson you will learn that the fate of an algorithm is often hard to predict – its applications may appear in a field that has nothing to do with the original plan of its inventors.
Ronald R. Taylor Professor of Computer Science
The University of California, San Diego
Chief Data Scientist
Harvard University, the Massachusetts Institute of Technology, and the University of California, Berkeley, are just some of the schools that you have at your fingertips with EdX. Through massive open online courses (MOOCs) from the world's best universities, you can develop your knowledge in literature, math, history, food and nutrition, and more. These online classes are taught by highly-regarded experts in the field. If you take a class on computer science through Harvard, you may be taught by David J. Malan, a senior lecturer on computer science at Harvard University for the School of Engineering and Applied Sciences. But there's not just one professor - you have access to the entire teaching staff, allowing you to receive feedback on assignments straight from the experts. Pursue a Verified Certificate to document your achievements and use your coursework for job and school applications, promotions, and more. EdX also works with top universities to conduct research, allowing them to learn more about learning. Using their findings, edX is able to provide students with the best and most effective courses, constantly enhancing the student experience.