Big Data Science with the BD2K-LINCS Data Coordination and Integration Center
7 sequences
Level: Intermediate
Language: English
280 points

Key information

Free course
Paid certification
28 hours of coursework

Summary

The Library of Integrated Network-based Cellular Signatures (LINCS) is an NIH Common Fund program. The idea is to perturb different types of human cells with many different types of perturbations, such as: drugs and other small molecules; genetic manipulations such as knockdown or overexpression of single genes; and manipulation of the extracellular microenvironment, for example, growing cells on different surfaces. These perturbations are applied to various types of human cells, including induced pluripotent stem cells from patients, differentiated into various lineages such as neurons or cardiomyocytes. Then, to better understand the molecular networks that are affected by these perturbations, changes in the levels of many different variables are measured, including mRNAs, proteins, and metabolites, as well as cellular phenotypic changes such as changes in cell morphology. The BD2K-LINCS Data Coordination and Integration Center (DCIC) is commissioned to organize, analyze, visualize and integrate this data with other publicly available relevant resources. In this course we briefly introduce the DCIC and the various Centers that collect data for LINCS. We then cover metadata and how metadata is linked to ontologies. We then present data processing and normalization methods to clean and harmonize LINCS data. This is followed by a discussion of how data is served through RESTful APIs. Most importantly, the course covers computational methods including data clustering, gene-set enrichment analysis, interactive data visualization, and supervised learning. Finally, we introduce crowdsourcing/citizen-science projects where students can work together in teams to extract expression signatures from public databases and then query such collections of signatures against LINCS data to predict small molecules as potential therapeutics.

Syllabus

  • Week 1 - The Library of Integrated Network-based Cellular Signatures (LINCS) Program Overview
    This module provides an overview of the concept behind the LINCS program, along with tutorials on how to get started with the LINCS L1000 dataset.
  • Week 1 - Metadata and Ontologies
    This module gives a broad, high-level description of the concepts behind metadata and ontologies and how these are applied to LINCS datasets.
  • Week 1 - Serving Data with APIs
    In this module we explain the concept of accessing data through an application programming interface (API).
  • Week 2 - Bioinformatics Pipelines
    This module describes the important concept of a Bioinformatics pipeline.
  • Week 2 - The Harmonizome
    This module describes a project that integrates many resources that contain knowledge about genes and proteins. The project is called the Harmonizome, and it is implemented as a web-server application available at:
  • Week 3 - Data Normalization
    This module describes the mathematical concepts behind data normalization.
  • Week 3 - Data Clustering
    This module describes the mathematical concepts behind data clustering, or in other words unsupervised learning - the identification of patterns within data without considering the labels associated with the data.
  • Week 3 - Midterm Exam
    The Midterm Exam consists of 45 multiple-choice questions that cover modules 1-7. Some of the questions may require you to analyze new datasets with the methods you learned throughout the course.
  • Week 4 - Enrichment Analysis
    This module introduces the important concept of performing gene set enrichment analyses. Enrichment analysis is the process of querying gene sets from genomics and proteomics studies against annotated gene sets collected from prior biological knowledge.
  • Week 4 - Machine Learning
    This module describes the mathematical concepts of supervised machine learning, the process of making predictions from examples that associate observations/features/attributes with one or more properties that we wish to learn/predict.
  • Week 5 - Benchmarking
    This module discusses how Bioinformatics pipelines can be compared and evaluated.
  • Week 5 - Interactive Data Visualization
    This module provides programming examples on how to get started with creating interactive web-based data visualization elements/figures.
  • Week 6 - Crowdsourcing Projects
    This final module describes opportunities to work on LINCS related projects that go beyond the course.
  • Week 7 - Final Exam
    The Final Exam consists of 60 multiple-choice questions that cover all of the modules of the course. Some of the questions may require you to analyze new datasets with the methods you learned throughout the course.
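
To make the Week 4 description of enrichment analysis more concrete, here is a minimal sketch of querying a gene set against an annotated library. It is an illustration only, not necessarily the method taught in the course: it assumes a one-sided hypergeometric (Fisher exact) test, and the library names, gene lists, and background size of 20,000 genes are hypothetical values chosen for the example.

```python
from math import comb


def enrichment_p_value(query_genes, annotated_genes, background_size):
    """Return (overlap, p) for a one-sided hypergeometric test.

    p is the probability of seeing at least `overlap` shared genes when
    len(query_genes) genes are drawn at random from a background of
    `background_size` genes, of which len(annotated_genes) are annotated.
    """
    query, annotated = set(query_genes), set(annotated_genes)
    overlap = len(query & annotated)
    n, K, N = len(query), len(annotated), background_size
    # Sum the hypergeometric tail from the observed overlap upward.
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(overlap, min(n, K) + 1)
    )
    return overlap, tail / comb(N, n)


def rank_gene_sets(query_genes, library, background_size):
    """Score every gene set in `library` and sort by ascending p-value."""
    results = []
    for name, genes in library.items():
        overlap, p = enrichment_p_value(query_genes, genes, background_size)
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical annotated library and query, for illustration only.
LIBRARY = {
    "hypothetical_set_A": {"TP53", "MDM2", "CDKN1A", "BAX", "ATM"},
    "hypothetical_set_B": {"INS", "GCK", "SLC2A2"},
}
QUERY = {"TP53", "MDM2", "BAX", "MYC"}
BACKGROUND_SIZE = 20000  # assumed number of genes in the background

if __name__ == "__main__":
    for name, overlap, p in rank_gene_sets(QUERY, LIBRARY, BACKGROUND_SIZE):
        print(f"{name}: overlap={overlap}, p={p:.3g}")
```

In practice the p-values would also be corrected for multiple testing across all gene sets in the library; that step is left out here to keep the sketch short.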

Instructors

Avi Ma’ayan, PhD
Director, Mount Sinai Center for Bioinformatics
Professor, Department of Pharmacological Sciences


Course provider

Icahn School of Medicine at Mount Sinai
The Icahn School of Medicine at Mount Sinai, formerly the Mount Sinai School of Medicine, is an American medical school in the borough of Manhattan in New York City. The Graduate School of Biomedical Sciences provides rigorous training in basic science and clinical research.

Platform


Coursera is a digital company that offers massive open online courses. It was founded by computer science professors Andrew Ng and Daphne Koller of Stanford University and is based in Mountain View, California.

What most sets it apart from other MOOC platforms is that it works only with leading universities and organizations worldwide and distributes their content on the web.
