GTPB

The Gulbenkian Training Programme in Bioinformatics


GTPB has run at the Instituto Gulbenkian de Ciência every year since 1999.
More than 950 course attendees in 9 years
Pedro Fernandes, GTPB organizer



Microarray Data Analysis using GEPAS and Babelomics

MDAGB08 course website

IMPORTANT DATES
Deadline for applications: May 6th 2008 (updated on April 17th 2008)
Course date: May 21st - May 23rd 2008




Instructors:

Joaquín Dopazo has a master's degree in Chemistry (Universidad de Valencia) and a PhD in Biology (Universidad de Valencia). He is the head of the Department of Bioinformatics at the CIPF (Valencia). In previous appointments he was responsible for Bioinformatics units at the CNIO (Madrid) and at GlaxoWellcome SA (Madrid). He has supervised several large-scale software development projects, such as GEPAS and Babelomics (http://www.Babelomics.org), with which more than 500 microarray experiments are analysed daily. He has published more than one hundred papers in international peer-reviewed journals and has edited a book on genomic data analysis. His main interests include functional and comparative genomics.
David Montaner obtained his degree in Mathematics at the Universidad Complutense of Madrid. For two years he worked as a researcher in the Social Medicine Department of Bristol University. He is currently a statistician in the Bioinformatics Department of the Centro de Investigacion Principe Felipe in Valencia (Spain), where he is involved in the GEPAS and Babelomics (http://www.Babelomics.org) projects. His research interests include microarray data analysis and the functional interpretation of genomic data.
Ignacio Medina obtained his degree in Biochemistry at Valencia University and his degree in Computer Science at the Polytechnic University of Valencia. For two years he worked as a researcher in the Medicine Faculty of Valencia University, and for another two years as a researcher in the Artificial Intelligence Department of the Polytechnic University of Valencia. He is currently a bioinformatician in the Bioinformatics Department of the Centro de Investigacion Principe Felipe in Valencia (Spain), where he is involved in the GEPAS, Babelomics (http://www.Babelomics.org) and Pupasuite (https://omictools.com/pupasuite-tool/) projects. His research interests include microarray data analysis, predictors and SNPs, as well as the functional interpretation of genomic data.

Bioinformatics Department, CIPF, Valencia.



Course description

DNA microarrays are, without doubt, a paradigm among post-genomic technologies, which are characterised by the production of large amounts of data whose analysis and interpretation are not trivial. Microarray technologies allow living systems to be queried in a completely new way, but at the same time they present new challenges in the way hypotheses must be tested and results analysed.

Since the first papers were published in the late nineties, the number of questions addressed through this technique has both increased and diversified. Initial interest focused on genes coexpressed across sets of experimental conditions, which essentially implied the use of clustering techniques. More recently, however, interest has shifted to finding genes that are differentially expressed among distinct classes of experiments, or correlated with diverse parameters. There is also much interest in robust methods for building predictors of clinical outcomes. In addition, CGH-arrays (Albertson and Pinkel, 2003) have recently become an alternative for studying the relationship between chromosomal alterations affecting copy number (which are behind many diseases) and gene expression. Finally, there is a clear demand for methods that allow the automatic transfer of biological information to the results of microarray experiments.

This course covers the state of the art in the above-mentioned topics, which are of major relevance in today's gene expression data analysis. Through sessions of theory and practical examples, the students will acquire the experience necessary to pose scientific questions to gene expression array datasets and answer them. Special attention will be devoted to important (though frequently ignored) aspects of microarray data analysis, such as multiple testing and functional annotation. The course is designed as a mixture of theoretical and practical sessions. The latter will require some familiarity with the use of web-based tools and knowledge of basic notions of statistics. Practical sessions will be carried out using the GEPAS environment (Herrero et al., 2003, 2004; Vaquerizas et al., 2005), an integrated web tool for microarray data analysis, and the Babelomics suite (Al-Shahrour et al., 2005) for functional annotation of genome-scale experiments.
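As an informal illustration of why multiple testing matters, the sketch below applies the Benjamini-Hochberg false discovery rate adjustment to a small list of made-up p-values. This is not taken from the course material and is not presented as the procedure implemented in GEPAS or Babelomics; the function name `benjamini_hochberg`, the example p-values and the 0.05 threshold are assumptions chosen only for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest, so that the
    # adjusted values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, one per gene (illustrative data only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.06, 0.074, 0.205, 0.212, 0.216]

adjusted = benjamini_hochberg(p_values)

# Genes called significant with and without the adjustment.
naive = [i for i, p in enumerate(p_values) if p <= 0.05]
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]

print("unadjusted calls:", naive)
print("adjusted calls:  ", significant)
```

With these invented numbers, five genes pass an unadjusted 0.05 cut-off but only two remain after the adjustment, which is the kind of difference the gene selection and functional interpretation sessions address.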


Course Timetable (provisional):


May 21st
Day #1
09:30 - 11:00 Introduction

Why microarrays? Pre- and post-genomics hypothesis testing: a note of caution. Design of experiments. Data preprocessing and normalization. Unsupervised analysis (clustering). Supervised analysis (gene selection, predictors). Functional annotation.

11:00 - 11:30 Coffee Break
11:30 - 12:30 Normalization (theory)

Removing unwanted variability from sources other than the experimental conditions assayed. Methods for Affymetrix and two-colour microarrays.

12:30 - 14:00 Lunch Break
14:00 - 16:00 Normalization (practical session)

Normalization of Affymetrix and two-colour arrays

16:00 - 16:30 Tea Break
16:30 - 18:00 Gene selection (theory)

Methods for selecting genes that are differentially expressed among two or more experimental conditions, correlated with a continuous variable, or correlated with survival. How to deal with the multiple-testing problem.

Gene selection (practical session)

Practical exercises with different methods and different types of gene selection problems.

May 22nd
Day #2
09:30 - 11:00 Predictors (theory)

Gene selection in the context of class prediction. How to deal with the selection bias problem.

11:00 - 11:30 Coffee Break
11:30 - 12:30 Predictors (practical session)

Building class predictors with different methods.

12:30 - 14:00 Lunch Break
14:00 - 16:00 Clustering (theory)

An overview of different clustering methods: hierarchical clustering, SOM, SOTA and k-means

16:00 - 16:30 Tea Break
16:30 - 18:00 Clustering (practical session)

Practical exercises with different clustering methods: hierarchical clustering, SOM, SOTA and k-means

May 23rd
Day #3
09:30 - 11:00 Functional interpretation of microarray experiments (theory)

Understanding the biological roles played by the genes in the experiments. Using different types of information for the functional annotation of microarray experiments: gene ontology, InterPro motifs, transcription factor binding sites, gene expression in other experiments, etc. The Babelomics suite. How to deal (again) with the multiple-testing problem.

11:00 - 11:30 Coffee Break
11:30 - 12:30 Functional interpretation of microarray experiments

The Babelomics suite. Different methods and criteria for the functional interpretation of microarray experiments. FatiGO and FatiScan from the Babelomics suite.

12:30 - 14:00 Lunch Break
14:00 - 16:00 Practical exercises covering all the aspects taught

16:00 - 16:30 Tea Break
16:30 - 18:00 Correction of the exercises and get-together.


Instituto Gulbenkian de Ciência, Apartado 14, 2781-901 Oeiras, Portugal


Last updated: Apr 17th 2008