NGSDM11 Next Generation Sequencing Data Management
IMPORTANT DATES for this Course
Instructors:
Matthias Haimel is a software engineer in the Ensembl Genomes Team (led by Dr. Paul Kersey) at the European Bioinformatics Institute in Hinxton, UK. He studied Bioinformatics with a focus on Computer Science at the Upper Austria University of Applied Sciences in Hagenberg, Austria, and graduated in 2006 after working on his diploma thesis at the Friedrich Miescher Institute for Biomedical Research in Basel, Switzerland. After joining the EBI in 2006, he took over development and production for the International Protein Index and worked in a team redeveloping the back-end production framework in the Integr8 project. Since 2008 he has worked on the application of next generation sequence assembly algorithms to a variety of genomes and started developing Curtain, a paired-end assembly pipeline for larger genomes.
David Judge is a computer scientist who has taught Bioinformatics since 1985. He runs the Bioinformatics Training Facility housed at the Department of Genetics in the University of Cambridge, which provides the necessary environment for graduate and undergraduate courses, in addition to a comprehensive training programme in cooperation with the European Bioinformatics Institute, the Wellcome Trust Sanger Institute and the Instituto Gulbenkian de Ciência through GTPB. He teaches Bioinformatics in several international training programmes and is regularly invited to teach in many places in Europe, Asia, Africa and America. His course notes and exercises are well known in the international community of Bioinformatics professionals and users, many of whom (difficult to count) have had their first contact with Bioinformatics through him.
Note: Matthias and David have prepared a brand new set of course notes and exercises for NGSDM11.
Course description

Since new sequencing technologies have reduced the cost of sequencing entire genomes, the major difficulty has shifted away from the generation of data to handling, processing and assembling large quantities of reads.

This course is aimed at providing hands-on skills with tools for assembly and mapping that can handle NGS data. To achieve that, a brief introduction to the methodologies is provided. From the beginning of the course, we will mix hands-on exercises with short presentations. We will explain the use of de Bruijn-based algorithms to enhance assembly results. We will also illustrate the need for using sequence quality to avoid pitfalls in the assembly process. This course will also provide a review of well known de Bruijn-based assemblers. Through exercises using both real and simulated data, we will be able to show their advantages and limitations. The course will provide practical usage skills in a whole set of additional tools for mapping and visualization.

Course participants will learn to:
- Run Velvet assemblies using a variety of reads, and mix different read types
- Assess the quality of short reads and filter them
- Perform visualisation of assemblies
- Create, use and view SAM/BAM files

Software that is extensively used: Velvet, Curtain, Cortex, FastQC, bwa, samtools/Picard

Software that is visited for illustration: EnsemblGenomes, Tablet, IGV, ABySS-Explorer
Course prerequisites:
- basic knowledge of working with Linux / Unix command line
- basic knowledge of Next Generation Sequencing data
- specific interest in de novo genome assembly
Detailed Program
Instituto Gulbenkian de Ciência, Apartado 14, 2781-901 Oeiras, Portugal Last updated: March 6th 2011 |