Advanced Biostatistics for Bioinformatics Tool Users using R
Lisete Sousa is an Associate Professor at the Faculty of Sciences of the University of Lisbon (FCUL) and
has coordinated the Centre of Statistics and Applications (CEAUL) since January 2017. She obtained her PhD in Probability and Statistics
(Lisbon University and Dortmund University - Germany) in 2004, and since then she has taught the course Statistical Methods in Bioinformatics
at the University of Lisbon, among other statistical courses for BSc and MSc students. Her main research interests are the development of statistical
methodologies for detecting differentially expressed genes, and the prediction of transmembrane protein topology using hidden Markov models
from a Bayesian perspective. She coordinated the MSc in Biostatistics at FCUL from 2009 to 2016, and collaborates with researchers from
the Institute of Molecular Medicine (IMM-FMUL) and from the Center for Biodiversity, Functional and Integrative Genomics (BioFIG-FCUL) in the
development of statistical methods for the analysis of Microarray and RNA-Seq data. She was a GTPB instructor in the BFB course in 2010 and 2012,
and in the ABSTAT course in 2014.
Carina Silva graduated in Statistics and Operational Research and holds an MSc degree in Probability and
Statistics from Faculdade de Ciências da Universidade de Lisboa. She is an Adjunct Professor at Escola Superior de
Tecnologia da Saúde de Lisboa (ESTeSL - IPL), where she teaches Applied Mathematics and Biostatistics. She obtained
a PhD in Probability and Statistics in 2012. She has coordinated the course "Statistical Data Analysis with R" at ESTeSL. She is a member
and collaborator of CEAUL (Centre of Statistics and Applications). She is the Coordinator of
the scientific area of Mathematics at ESTeSL. Her main research interests are Statistics in Health Sciences, ROC analysis,
Molecular Genetics (Microarray Data Analysis) and Bayesian Statistics. She was a GTPB instructor in the BFB course in
2010, 2011 and 2012.
This course covers the biostatistical techniques often employed in analytical tools for high-throughput and multivariate data. Participants can expect a thorough set of lectures presenting the conceptual frameworks needed to understand the methods. Extensive hands-on practice will be the main vehicle for building skills and user independence. To keep things in context, the course is based exclusively on biological examples.
Care has been taken not to use any proprietary data or software, so that the hands-on experience can carry on after the course, providing maximum user independence. We will be using custom-built R scripts, together with packages that are available from the CRAN and Bioconductor repositories.
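As a rough illustration of how packages from the two repositories are usually obtained, the sketch below checks which packages are missing and installs them. The package names are hypothetical placeholders and not the course's actual list. `BiocManager` is the Bioconductor installer on recent R versions, and older setups may differ.

```r
# Hypothetical package names, chosen only for illustration.
cran_pkgs <- c("ggplot2")
bioc_pkgs <- c("limma")

# Return the subset of `pkgs` that is not currently installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

need_cran <- missing_pkgs(cran_pkgs)
need_bioc <- missing_pkgs(bioc_pkgs)

# Installation needs network access, so it is only attempted in an
# interactive session.
if (interactive()) {
  if (length(need_cran) > 0) {
    install.packages(need_cran)
  }
  if (length(need_bioc) > 0) {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(need_bioc)
  }
}
```

Running the script a second time should find nothing left to install, which is a quick way to confirm that the setup worked.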
Methodology
This intensive course will introduce a relatively large number of concepts and methods. To keep it highly practical, we will spend most of the time in hands-on sessions.
- We will focus on each method using examples taken from biological data.
- We will then dissect the method, identifying the concepts and exploring their interrelationships.
- The applicability and limitations of each method will be emphasized.
- The use of the method will be illustrated using appropriate Bioinformatics tools and biological data resources.
Everybody using Bioinformatics methods is implicitly using statistical methods. Moreover, proper judgement of the results often calls for a deeper level of understanding than is required to solve textbook exercises.
Course Pre-requisites
Intermediate-level knowledge of Statistics is necessary. There is no time to cover the basics, so we will assume that accepted candidates have self-assessed their knowledge in the following areas:
- conditional probability
- statistical tests
- hypothesis testing
This level can also be reached by attending another GTPB course: the IBSTAT course.
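As an informal self-check of the level described above, candidates could verify that a short script such as the following reads naturally. It is only an illustrative sketch: the numbers, the simulated data and the variable names are invented for this example and are not taken from the course material.

```r
# Conditional probability via Bayes' rule (illustrative numbers only).
prevalence  <- 0.01   # P(D): probability of having the condition
sensitivity <- 0.95   # P(+ | D): positive test given the condition
specificity <- 0.90   # P(- | not D): negative test given no condition

# Law of total probability: P(+) = P(+|D)P(D) + P(+|not D)P(not D)
p_positive <- sensitivity * prevalence +
  (1 - specificity) * (1 - prevalence)

# Bayes' rule: P(D | +) = P(+ | D) P(D) / P(+)
p_disease_given_positive <- sensitivity * prevalence / p_positive

# Hypothesis testing: Welch two-sample t-test on simulated data.
# H0: both groups have the same mean.
set.seed(1)
control <- rnorm(20, mean = 0, sd = 1)
treated <- rnorm(20, mean = 1, sd = 1)
test_result <- t.test(treated, control)

print(p_disease_given_positive)
print(test_result$p.value)
```

Someone comfortable with the prerequisites should be able to explain why the first printed value is small despite the high sensitivity, and what the second value does and does not say about the null hypothesis.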
Basic familiarity with the R environment will be necessary. Please follow the exercise that we provide.
Install R from http://cran.r-project.org/ following the instructions.
Download and unzip the Tutorial folder that is made available here.
- View the slides in "Tutorial R.pdf"
- Follow the exercise in "Basic_Exercise.pdf"
- For reference, we also provide a script with a correct sequence of R statements in "Tutorial_script.R"
Additionally, we suggest that candidates acquire familiarity with RStudio by visiting the following resources:
- Introduction to RStudio (basics)
- Tutorial R and R Studio (complete)
RStudio will be used in the course to ease interaction and increase productivity, but participants who prefer the original R environment on the command line will be free to use it.
Instituto Gulbenkian de Ciência,
Apartado 14, 2781-901 Oeiras, Portugal
Last updated: March 16th 2017