ABSTAT17

Advanced Biostatistics for Bioinformatics Tool Users using R

   IMPORTANT DATES for this Course
   Deadline for applications: April 3rd 2017
   Course date: April 10th - 13th 2017

Instructors:

Lisete Sousa is an Associate Professor at the Faculty of Sciences of the University of Lisbon (FCUL) and has coordinated the Centre of Statistics and Applications (CEAUL) since January 2017. She obtained her PhD in Probability and Statistics (Lisbon University and Dortmund University, Germany) in 2004, and since then she has taught the course Statistical Methods in Bioinformatics at the University of Lisbon, among other statistics courses for BSc and MSc students. Her main research interests are the development of statistical methodologies for detecting differentially expressed genes, and the prediction of transmembrane protein topology using hidden Markov models from a Bayesian perspective. She coordinated the MSc in Biostatistics at FCUL from 2009 to 2016, and collaborates with researchers from the Institute of Molecular Medicine (IMM-FMUL) and from the Center for Biodiversity, Functional and Integrative Genomics (BioFIG-FCUL) in the development of statistical methods for the analysis of microarray and RNA-Seq data. She was a GTPB instructor in the BFB course in 2010 and 2012, and in the ABSTAT course in 2014.
Affiliation: Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal.

Carina Silva graduated in Statistics and Operational Research and holds an MSc degree in Probability and Statistics from Faculdade de Ciências da Universidade de Lisboa. She obtained a PhD in Probability and Statistics in 2012. She is an Adjunct Professor at Escola Superior de Tecnologia da Saúde de Lisboa (ESTeSL - IPL), where she teaches Applied Mathematics and Biostatistics, is the Director of the Natural and Exact Sciences Department, and is the Coordinator of the scientific area of Mathematics. She has coordinated the course "Statistical Data Analysis with R" at ESTeSL. She is a member of the executive commission of CEAUL (Center of Statistics and Applications). She collaborates with the Genetics and Metabolism Investigation Group of ESTeSL and with the Obesity and Metabolic Syndrome Group of ISAMB (Instituto de Saúde Ambiental, FMUL). She also participated in the Gulbenkian Partnerships Development Programme in Maputo, Mozambique, in January 2015. Her main research interests are Statistics in Health Sciences, ROC analysis, Molecular Genetics (Microarray Data Analysis) and Bayesian Statistics. She was a GTPB instructor in the BFB course in 2010, 2011 and 2012, and in the ABSTAT course in 2015.
Affiliation: Escola Superior de Tecnologia da Saúde de Lisboa (ESTeSL), Lisboa, Portugal.


Course description

This course covers biostatistical techniques often employed in analytical tools for high-throughput and multivariate data. Participants can expect a thorough set of lectures that reveal the conceptual frameworks needed to understand the methods. Extensive hands-on practice will be the main vehicle for building skills and user independence. To keep things in context, the course is based exclusively on biological examples.
We will be using custom-built R scripts together with packages available from the CRAN and/or Bioconductor repositories.
Care has been taken not to use any proprietary data or software, so that the hands-on experience can carry on after the course, providing maximum user independence.

Methodology

This intensive course will introduce a relatively large number of concepts and methods. To keep it highly practical, we will spend most of the time in hands-on sessions.
- We will focus on each method using examples taken from biological data.
- We will then dissect the method, identifying the concepts and exploring their interrelationships.
- The applicability and limitations of each method will be emphasized.
- The use of the method will be illustrated using appropriate Bioinformatics tools and biological data resources.

Target Audience

Everybody using Bioinformatics methods is implicitly using statistical methods. Moreover, proper judgement of the results often calls for a deeper level of understanding than what is required to solve scholarly exercises.
We will look into particular areas such as Simulation, Bayesian Inference, Hidden Markov Chains and Multivariate Data Analysis methods with the attitude, eyes and brains of an experienced statistician who wants to understand, in a systematic way, how the methods work.

Course Pre-requisites

Intermediate-level knowledge of Statistics is necessary. There is no time to provide basic knowledge, so we will need to assume that accepted candidates have self-assessed their knowledge in the following areas:
- probability
- conditional probability
- distributions
- statistical tests
- hypothesis testing
- inference

This level can also be obtained by attending another GTPB course: the IBSTAT course.
Basic familiarity with the R environment will be necessary. Please follow the exercise that we provide.
Install R from http://cran.r-project.org/ following the instructions.
Download and unzip the Tutorial folder that is made available here.
Then:
- View the slides in "Tutorial R.pdf"
- Follow the exercise in "Basic_Exercise.pdf"
- For reference, we also provide a script with the correct sequence of R statements, "Tutorial_script.R"

Additionally, we suggest that candidates acquire familiarity with RStudio by visiting the following resources:

- Introduction to RStudio (basics)
- Tutorial R and R Studio (complete)

RStudio will be used in the course to ease interaction and increase productivity, but participants who prefer the original R environment on the command line are free to use it.


Instituto Gulbenkian de Ciência,

Apartado 14, 2781-901 Oeiras, Portugal

Last updated:  March 16th 2017