The Gulbenkian Training Programme in Bioinformatics

GTPB has run yearly at the Instituto Gulbenkian de Ciência since 1999.
More than 950 course attendees in 9 years
Pedro Fernandes, GTPB organizer

GTPB ENSEMBL Courses in 2008

What is ENSEMBL?

The Ensembl genome browser represents the state of the art in genome annotation and data management. Current genome annotation includes SNPs and indels, as well as protein domains and functional classes such as those provided by GO (Gene Ontology). Protein-coding gene annotation is provided by the GeneWise program, and SNP-based data mining by the Ensembl BioMart tool.

ENSEMBL08 course website

Deadline for applications: May 6th 2008 (updated April 17th 2008)
Course dates:
ENSGEN08 May 27th and 28th 2008
ENSAPI08 May 29th and 30th 2008


Giulietta Spudich is trained in biochemistry and molecular biology. Her PhD work focused on protein structure and folding, and her postdoctoral research explored protein-protein and protein-lipid interactions. She moved into genomics and bioinformatics in 2006 and now works for Ensembl Outreach and Training. She gives courses worldwide on exploring genomes with Ensembl and BioMart.

Daniel Rios has a background in Computer Science (Universitat Politecnica de Catalunya) and finished his degree in Sweden (Kungliga Tekniska Hogskolan). He entered the bioinformatics world at the Karolinska Institutet, developing a tool that allows scientists to curate SNP data submitted to HGVbase, the Human Genome Variation database. He joined the Ensembl project in 2004, working in the Functional Genomics team. During his time at EBI, he has been involved in developing the new database schema and API to support variation data, including data produced with new resequencing technologies, and has taught several API workshops in Europe.

Affiliation: European Bioinformatics Institute, Hinxton, Cambridge, UK

Course description:

ENSGEN08 The Ensembl project provides a comprehensive and integrated source of annotation of mainly vertebrate genome sequences. This two-day workshop gives participants extensive hands-on experience with the Ensembl genome browser, together with the necessary background information. The workshop is primarily targeted at wet-lab researchers. It consists of the modules listed below; most comprise a presentation followed by exercises. Participants are encouraged to bring problems and questions from their own research and to try to tackle them with Ensembl during the workshop.
* Introduction to Ensembl: origin, goals and organization of the Ensembl project
* Worked example: guided tour of the most important pages of the Ensembl website
* Data mining with BioMart: retrieving datasets using the data mining tool BioMart
* Evaluating genes and transcripts: how are Ensembl gene and transcript predictions made?
* Comparative genomics and proteomics: orthologues, protein families, whole genome alignments and syntenic regions
* Variations: SNPs, haplotypes, linkage disequilibrium
* Advanced access & DAS: uploading your own data, other ways of accessing Ensembl data
ENSAPI08 Ensembl uses MySQL relational databases to store its information. A comprehensive set of Application Programme Interfaces (APIs) serves as a middle layer between the underlying database schemas and more specific application programmes. The APIs aim to encapsulate the database layout by providing efficient high-level access to data tables, and to isolate applications from changes in that layout. This two-day workshop is aimed at developers interested in exploring Ensembl beyond the website. Participants are expected to have experience in writing simple Perl scripts and a background in object-oriented programming techniques. Familiarity with databases (MySQL) would be an advantage. The workshop covers various Ensembl databases and APIs. For each of them, the database schema, the API design, and the most important objects and their methods will be presented. Practical sessions follow, in which participants put what they have learned into practice by writing their own Perl scripts. If there is demand among the participants, a session on querying the core database directly with SQL can also be included.
Ensembl Variation databases and API
Technological advances are leading to the widespread availability of multi-species variation data, dense genotype data, and data from large-scale resequencing projects. The study of human variation has advanced significantly through resources already available from projects such as the HapMap. To address these challenges within Ensembl, we have designed and tested a database solution and API that support variation data, dense genotyping data and resequencing data from thousands of individual genome sequences.

Course Timetable:

ENSGEN08 Browsing Genes and Genomes with Ensembl
May 27th Ensembl and BioMart
Day 1
09:30 - 11:00 Introduction to genome browsing with Ensembl
11:00 - 11:30 Coffee Break
11:30 - 12:30 Hands-on (browsing) and introduction to data retrieval with BioMart
12:30 - 14:00 Lunch Break
14:00 - 16:00 Hands-on (BioMart) and Comparative Genomics
16:00 - 16:30 Tea Break
16:30 - 18:00 Hands-on (Comparative) and summary
May 28th Advanced Ensembl
Day 2
09:30 - 11:00 The Ensembl gene set
11:00 - 11:30 Coffee Break
11:30 - 12:30 Variations
12:30 - 14:00 Lunch Break
14:00 - 16:00 Tying together Ensembl and BioMart. Adding your data with DAS
16:00 - 16:30 Tea Break
16:30 - 18:00 Summary and other EBI resources
ENSAPI08 Ensembl API access, Programming workshop
May 29th Hands-on practice with the Ensembl API
Day 1
09:30 - 11:00 What is an API? Introduction and main objects within the API. The Registry
11:00 - 11:30 Coffee Break
11:30 - 12:30 Using several coordinate systems. Slices and Features in Ensembl
12:30 - 14:00 Lunch Break
14:00 - 16:00 Genes, Transcripts, Exons and Translations
16:00 - 16:30 Tea Break
16:30 - 18:00 Coordinate transformations. References to external databases
May 30th Hands-on practice with the Ensembl Variation API
Day 2
09:30 - 11:00 Introduction to variation data, database schema and API. The Variation object
11:00 - 11:30 Coffee Break
11:30 - 12:30 Population, Individuals and Alleles
12:30 - 14:00 Lunch Break
14:00 - 16:00 VariationFeature and TranscriptVariations
16:00 - 16:30 Tea Break
16:30 - 18:00 New resequencing data: StrainSlice, AlleleFeature and ReadCoverage objects

Pre-requisites: Basic molecular biology and biochemistry. Elementary computing skills.
For the ENSAPI08 course, applicants should have a basic knowledge of Perl scripting.


Instituto Gulbenkian de Ciência, Apartado 14, 2781-901 Oeiras, Portugal

Last updated: Apr 17th 2008