Entry Level Bioinformatics

Our entry-level course, with a soft introduction to NGS data analysis
Downloadable poster in PDF

   IMPORTANT DATES for this Course
   Deadline for applications: March 25th 2020
   Course date: March 30th - April 3rd 2020

Candidates with an adequate profile will be accepted within 72 hours of applying, until we reach 20 participants.


David P Judge is a Computer Scientist who has taught Bioinformatics since 1985. He initiated the University of Cambridge Bioinformatics Training Facility, providing the necessary environment for graduate and undergraduate courses. He has also been involved with training programmes at the European Bioinformatics Institute, the Wellcome Trust Sanger Institute and the Instituto Gulbenkian de Ciência (IGC) through GTPB. He currently teaches Bioinformatics in several international training programmes and is regularly invited to teach in many places in Europe, Asia, Africa and America. His course notes and exercises are well known to the international community of Bioinformatics professionals and users, and a large, hard-to-count number of them had their first contact with Bioinformatics through him. David Judge helped to pioneer Bioinformatics Training at the IGC in 1989, as part of an MSc course. Since its inception in 1999, he has contributed to the GTPB several times per year.

Affiliation: Freelance independent Bioinformatics instructor.
Former manager of Bioinformatics Training, University of Cambridge, Cambridge, UK

Pedro Fernandes graduated in Electronics and Telecommunications Engineering at IST (U.T. Lisboa) in 1979. He worked in Biomedical Engineering, Biophysics and Physiology and changed to Bioinformatics in 1990. He established the first user community in Portugal around the national service provided by the Instituto Gulbenkian de Ciência (IGC) as the Portuguese node of EMBnet. In 1998 he created the Gulbenkian Training Programme in Bioinformatics, which has provided user skills to more than 5150 course attendees throughout its nineteen years of existence. In 2002, in cooperation with Mario Silva from FCUL, he designed a graduate Programme in Bioinformatics. He currently teaches Bioinformatics in both graduate and undergraduate programmes. Pedro Fernandes is the Training Coordinator of Elixir-PT, now superimposed with Biodata.PT, the national infrastructure for bio-medical data resources. He also represents IGC in GOBLET, the Global Organisation for Bioinformatics Learning, Education and Training.

Affiliation: Instituto Gulbenkian de Ciência, Oeiras, PT

Daniel Sobral graduated in Informatics Engineering from Instituto Superior Técnico (Lisbon, Portugal). His interest in Biology led him to join the Gulbenkian PhD programme and to conduct his doctoral studies in Bioinformatics at the Université Aix-Marseille (France) with Dr. Patrick Lemaire. During his PhD he worked on different aspects of bioinformatics, focusing particularly on the gene expression networks underlying embryonic development of a model organism, with all of this work integrated into a community resource. Later he became a Developer for the Ensembl Project, where he worked mostly on integrating epigenetic data from the ENCODE project into Ensembl. In this context he gained significant experience with high-throughput sequencing data. In 2012 he moved back to Portugal, where he joined the Bioinformatics Unit at the IGC to assist the local research community in handling the sequencing revolution brought about by high-throughput technologies. In this role he has been collaborating in several projects spanning genomics, transcriptomics and epigenetics. In 2020, he moved to the group of Ana Rita Grosso at UCIBIO FCT-NOVA to focus on building bioinformatic resources with potential practical applicability in biomedicine.

Affiliation: UCIBIO, Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa, Caparica, PT


This is an entry level course in three parts, aimed at those with a reasonable biological background but no significant experience with bioinformatics.

The course content is designed to show, through guided exercises, how a researcher in the bio-medical area can access sequence data, steer the analysis to extract results efficiently, check how credible those results are, and reason about using them confidently and reproducibly.

This course will also provide a soft introduction to Next Generation Sequencing (NGS) data analysis. This part of the course aims to provide the basic skills needed to process NGS data using open source bioinformatics tools.

Thirdly, the course will cover the broad theme of analytical automation, and how it enhances the capacity of researchers by widening the scope of their experiments, comparing results on a large scale and summarising the findings in simple but very reproducible ways.


To create awareness of a wide range of bioinformatics tools and provide sufficient experience to use those tools in basic investigations with a relatively high degree of user independence.

To show the limitations and ambiguities found in commonly used Bioinformatics resources.

To become aware of specific tools for NGS data analysis (transcriptomes, variant analysis).

To be able to create simple scripts that automate bioinformatics tasks, thus extending the reach of the tools to large datasets.

Target Audience

Researchers at any level wishing to investigate how they might begin to exploit the ever-expanding abundance of computing and data resources.

Course Pre-requisites

Basic understanding of molecular biology. No particular computing expertise will be assumed.


Detailed Program

Instituto Gulbenkian de Ciência,

Apartado 14, 2781-901 Oeiras, Portugal

Last updated:  February 17th 2020