link 来源
list 6个序列
assignment 等级:入门
chat_bubble_outline 语言:英语
card_giftcard 192分
Logo My Mooc Business




credit_card 免费进入
verified_user 收费证书
timer 24小时总数


Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than a computer system or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to many other kinds of knowledge that we encode in text. This course will cover search engine technologies, which play an important role in any data mining applications involving text data for two reasons. First, while the raw data may be large for any particular problem, it is often a relatively small subset of the data that are relevant, and a search engine is an essential tool for quickly discovering a small subset of relevant text data in a large text collection. Second, search engines are needed to help analysts interpret any patterns discovered in the data by allowing them to examine the relevant original text data to make sense of any discovered pattern. You will learn the basic concepts, principles, and the major techniques in text retrieval, which is the underlying science of search engines.

more_horiz 查看更多
more_horiz 收起


  • Week 1 - Orientation
    You will become familiar with the course, your classmates, and our learning environment. The orientation will also help you obtain the technical skills required for the course.
  • Week 1 - Week 1
    During this week's lessons, you will learn of natural language processing techniques, which are the foundation for all kinds of text-processing applications, the concept of a retrieval model, and the basic idea of the vector space model.
  • Week 2 - Week 2
    In this week's lessons, you will learn how the vector space model works in detail, the major heuristics used in designing a retrieval function for ranking documents with respect to a query, and how to implement an information retrieval system (i.e., a search e...
  • Week 3 - Week 3
    In this week's lessons, you will learn how to evaluate an information retrieval system (a search engine), including the basic measures for evaluating a set of retrieved results and the major measures for evaluating a ranked list, including the average precisio...
  • Week 4 - Week 4
    In this week's lessons, you will learn probabilistic retrieval models and statistical language models, particularly the detail of the query likelihood retrieval function with two specific smoothing methods, and how the query likelihood retrieval function is co...
  • Week 5 - Week 5
    In this week's lessons, you will learn feedback techniques in information retrieval, including the Rocchio feedback method for the vector space model, and a mixture model for feedback with language models. You will also learn how web search engines work, inclu...
  • Week 6 - Week 6
    In this week's lessons, you will learn how machine learning can be used to combine multiple scoring factors to optimize ranking of documents in web search (i.e., learning to rank), and learn techniques used in recommender systems (also called filtering systems...


ChengXiang Zhai
Department of Computer Science



University of Illinois at Urbana-Champaign
The University of Illinois at Urbana-Champaign is a world leader in research, teaching and public engagement, distinguished by the breadth of its programs, broad academic excellence, and internationally renowned faculty and alumni. Illinois serves the world by creating knowledge, preparing students for lives of impact, and finding solutions to critical societal needs.



Coursera是一家数字公司,提供由位于加利福尼亚州山景城的计算机教师Andrew Ng和达芙妮科勒斯坦福大学创建的大型开放式在线课程。

Coursera与顶尖大学和组织合作,在线提供一些课程,并提供许多科目的课程,包括:物理,工程,人文,医学,生物学,社会科学,数学,商业,计算机科学,数字营销,数据科学 和其他科目。

您是 MOOC 的设计者?