Microarray Data Analysis using R and Bioconductor

   Deadline for applications: August 30th 2011
   Notification of acceptance dates:
        EARLY: August 15th 2011
        NORMAL: August 30th 2011
   Course date: September 5th to September 9th 2011


Oscar M. Rueda completed a BSc and an MSc in Statistics at Universidad de Valladolid (Spain) in 2001. He started working as an applied statistician for a company and taught several courses in Mathematics and Statistics at the University of Valladolid. He then moved to the Spanish National Cancer Centre (CNIO) in Madrid and finished his PhD in Mathematics (Statistics) with Dr. Ramon Diaz-Uriarte and Dr. Cristina Rueda in 2008. His PhD work involved developing Hidden Markov Models for the analysis of copy number data for different aCGH and SNP platforms. Oscar is currently a postdoctoral researcher in the Functional Genomics Breast Cancer Group, under the direction of Professor Carlos Caldas, at the Cambridge Research Institute (Cancer Research UK). His main interest is in developing statistical models to answer biological questions related to cancer tumorigenesis. In particular, he analyzes SNP data from breast cancer tumors and designs novel statistical methods for the integration of genomic and transcriptomic microarray data sets. He was one of the instructors in the MDARB10 course of GTPB.

Affiliation: University of Cambridge, Cambridge, UK

Benilton Carvalho graduated from Universidade Estadual de Campinas (Campinas, Brazil) with a BSc and MSc in Statistics. He did his PhD with Professor Rafael Irizarry at the Department of Biostatistics, Johns Hopkins University (Baltimore, USA). During his PhD, he worked on the development of efficient software and methodology for the low-level analysis of high-density oligonucleotide arrays. He was an intern at Affymetrix and a consultant for NimbleGen. He is an active Bioconductor developer, maintaining several infrastructure, analysis and annotation packages. Benilton is currently a Research Associate in the Computational Biology Group (Department of Oncology, University of Cambridge, UK) and is involved with the development of high-performance solutions for the analysis of microarray data and with the investigation of statistical methods for next-generation sequencing technologies. He was one of the instructors in the MDARB10 course of GTPB.

Affiliation: University of Cambridge, Cambridge, UK

Ana Rita Grosso graduated in 2001 from the Faculdade de Ciências (FCUL) of the Universidade de Lisboa with a BSc in Biology. From 2001 to 2004 her work at FCUL focused on phylogeny and population genetics, essentially applied to animal conservation. In 2004 she participated in the FCUL/IGC PGBIOINF Post-Graduation in Bioinformatics. In her MSc dissertation, also at FCUL, in 2006, she assessed statistical methodologies for the analysis of DNA microarray data. She completed her PhD with Prof. Maria Carmo-Fonseca (Faculdade de Medicina da Universidade de Lisboa) and Prof. Simon Tavaré (University of Cambridge - Department of Oncology) in 2010. During her PhD, she studied alternative splicing using splicing-sensitive microarrays. Ana Rita is currently a postdoctoral researcher in Maria Carmo-Fonseca's group (Instituto de Medicina Molecular, Lisboa) and Benjamin Blencowe's group (University of Toronto). Her current projects involve the analysis of next-generation sequencing data.

Affiliation: Instituto de Medicina Molecular, FMUL, Lisboa

Course description:

This course aims to introduce researchers to a multidisciplinary approach to microarray data analysis. Particular attention is devoted to the design of microarray experiments, data normalization and quality control, as well as to statistical analysis. Participants may find this basic training valuable for: planning the design of microarray experiments in their lab; gaining knowledge and understanding of microarray analysis and quality issues; and gaining confidence in performing preprocessing, quality assessment, differential expression and downstream analysis using the statistical software environment R and some R packages in Bioconductor, namely limma. The course also covers more specific topics, such as the analysis of Illumina and Affymetrix arrays, copy number and SNP analysis, and Next Generation Sequencing (RNA-seq).

Target audience: All aspects of the course are aimed at non-statisticians and are suitable for beginners in microarrays as well as for those who have already been working in genomics. The course may also be useful to computational biologists new to microarray analysis. The course is hands-on and intensive, so we expect a highly motivated group of trainees who intend to work with microarray data in the near future.

Course Pre-requisites:

Basic Molecular Biology and elementary-level Statistics. Participants are also requested to complete the Practical Introduction to R tutorial in advance, ideally just before the course. This tutorial takes less than 30 minutes to follow.
To install R locally, go to
Click on CRAN, select a mirror site and install R locally. It is available for Linux, Microsoft Windows and Apple Mac OS X.
Links to previous editions:
2010 2009 2008

Detailed Program

Instituto Gulbenkian de Ciência,

Apartado 14, 2781-901 Oeiras, Portugal

Last updated:  July 20th 2011