Data Mining Project
6 sequences
Level: Beginner
Language: English
192 points


Free access
Paid certificate
24 hours total


Note: You should complete all the other courses in this Specialization before beginning this course.

This six-week Project course of the Data Mining Specialization will allow you to apply the data mining algorithms and techniques learned in the previous courses in the Specialization, including Pattern Discovery, Clustering, Text Retrieval, Text Mining, and Visualization, to solve interesting real-world data mining challenges. Specifically, you will work on a restaurant review data set from Yelp and use all the knowledge and skills you’ve learned from the previous courses to mine this data set to discover interesting and useful knowledge.

The design of the Project emphasizes: 1) simulating the workflow of a data miner in a real job setting; 2) integrating different mining techniques covered in multiple individual courses; 3) experimenting with different ways to solve a problem to deepen your understanding of techniques; and 4) allowing you to propose and explore your own ideas creatively.

The goal of the Project is to analyze and mine a large Yelp review data set to discover useful knowledge that helps people make dining decisions. The project will include the following outputs:

1. Opinion visualization: explore and visualize the review content to understand what people have said in those reviews.
2. Cuisine map construction: mine the data set to understand the landscape of different types of cuisines and their similarities.
3. Discovery of popular dishes for a cuisine: mine the data set to discover the common/popular dishes of a particular cuisine.
4. Recommendation of restaurants to help people decide where to dine: mine the data set to rank restaurants for a specific dish and predict the hygiene condition of a restaurant.

From the perspective of users, a cuisine map can help them understand what cuisines exist and see the big picture of all kinds of cuisines and their relations. Once they decide what cuisine to try, they would be interested in knowing what the popular dishes of that cuisine are so they can decide what dishes to have. Finally, they will need to choose a restaurant. Thus, recommending restaurants based on a particular dish would be useful. Moreover, predicting the hygiene condition of a restaurant would also be helpful.

By working on these tasks, you will gain experience with a typical workflow in data mining that includes data preprocessing, data exploration, data analysis, improvement of analysis methods, and presentation of results. You will have an opportunity to combine multiple algorithms from different courses to complete a relatively complicated mining task and experiment with different ways to solve a problem to understand the best way to solve it. We will suggest specific approaches, but you are highly encouraged to explore your own ideas since open exploration is, by design, a goal of the Project.

You are required to submit a brief report for each of the tasks for peer grading. A final consolidated report is also required and will likewise be peer-graded.



The Capstone project will include several peer-graded tasks and a final report.



Jiawei Han
Abel Bliss Professor
Department of Computer Science

ChengXiang Zhai
Department of Computer Science

John C. Hart
Professor of Computer Science
Department of Computer Science



University of Illinois at Urbana-Champaign

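The University of Illinois at Urbana-Champaign (UIUC) was founded in 1867. The university's main campus is located in the twin cities of Champaign and Urbana, 200 kilometers south of Chicago.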

According to several rankings, including the Center for World University Rankings, this leading university is among the most prestigious in the world, ranking 22nd globally in 2020-21.




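Coursera is a digital company that offers massive open online courses. It was founded by Stanford University computer science instructors Andrew Ng and Daphne Koller and is based in Mountain View, California.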

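Coursera partners with top universities and organizations to offer courses online in many subjects, including physics, engineering, humanities, medicine, biology, social sciences, mathematics, business, computer science, digital marketing, data science, and others.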
