Molecular Epidemiology of Viruses using R


   IMPORTANT DATES for this Course
   Deadline for applications: May 20th 2015 - Extended!
   Latest notification of acceptance: May 21st 2015
   Course date: May 25th - May 28th 2015

Candidates with a suitable profile will be accepted within 72 hours of applying, until the limit of 20 participants is reached.


Simon Frost is a computational biologist based at the University of Cambridge, who has worked on the dynamics and evolution of viruses for over twenty years. He has also developed biostatistics and bioinformatics courses, most recently focusing on the use of R.

Affiliation: University of Cambridge, Cambridge, UK



Rapidly evolving viruses, such as HIV and influenza, lend themselves to phylogenetic analysis, which can be used to understand the transmission dynamics of the infection as well as the evolutionary forces that have shaped the pathogen. As with many bioinformatics applications, molecular epidemiological analysis involves many steps and a range of data processing and statistical approaches. This course will cover accessing, processing, analysing and visualising viral sequence data, using the R statistical programming language.

The course will illustrate a workflow, going from the extraction of viral sequence data from GenBank, through sequence alignment, exploratory phylogenetic analysis and tree visualisation, to the reconstruction of time-stamped phylogenies. Time permitting, there will also be an exploration of more advanced topics on analysing viral datasets with structure, e.g. from different countries or sub-populations. The course will consist of a series of short overviews on specific topics, each followed by a hands-on computer practical. The programming language R will be used to handle data, perform some of the analyses, and act as a 'glue' for running other programs. However, time will be spent on understanding the structure of the workflow by studying example scripts, rather than on advanced programming.
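As a rough illustration of the middle of such a workflow (distance-based exploratory analysis and tree visualisation), here is a minimal sketch in R. It assumes the ape package, which this page does not name, and it uses ape's bundled `woodmouse` alignment as stand-in data rather than viral sequences. It is not taken from the course materials, and it does not cover alignment or time-stamped phylogenies.

```r
# Minimal sketch; the choice of the ape package is an assumption.
library(ape)

# Stand-in data: ape's bundled, already aligned 'woodmouse' sequences.
# For viral data one would instead retrieve records from GenBank
# (ape provides read.GenBank() for this) and align them first.
data(woodmouse)

# Exploratory phylogenetic analysis: pairwise genetic distances under
# the Kimura 2-parameter model, then a neighbour-joining tree.
d <- dist.dna(woodmouse, model = "K80")
tree <- nj(d)

# Tree visualisation: ladderize for readability, plot, add a scale bar.
plot(ladderize(tree), cex = 0.7)
add.scale.bar()
```

Neighbour joining is used here only because it is fast and suits a first exploratory look at the data; other tree-building methods could be substituted at the same step.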


The course will give participants sufficient skills to address problems in this area using R scripts, to develop their own scripts, and to explore freely accessible data resources.

Target Audience

The course is designed for researchers in biology, veterinary science and medicine who have an interest in epidemiology and the dynamics of infectious diseases.

Course Pre-requisites

Elementary computing skills. Minimal skills in creating and modifying R scripts. Background knowledge in bioinformatics is not strictly necessary, but may be useful.

Detailed Program

Instituto Gulbenkian de Ciência,

Apartado 14, 2781-901 Oeiras, Portugal


Last updated:  May 18th 2015